Nairobi WiMLDS 2019 Scikit-learn Sprint Impact Report


Sprint Background

This report focuses on the summary, impact and lessons learned of the Nairobi WiMLDS scikit-learn sprint. More background information is available in the following articles:

Sprint Participants’ Background

To ensure that attendees had some knowledge of Python and scikit-learn, a brief application form was used. Attendees did not have to be experienced Git users, but some experience was helpful.

Most attendees at these sprints were new to open source. The scikit-learn facilitator, Adrin Jalali, identified issues that were labeled “easy” or “good first issue” prior to the event so participants could review and become familiar with them. Prep work for the event included reviewing the Contributing documentation.

About 40 people attended the sprint. The group was evenly split by gender: 50% women and 50% men.

Impact Report for WiMLDS Scikit-learn Sprint

  Sprint 2019  
Report date 03-Aug-2019  
Sprint date 22-Jun-2019  
Location Nairobi, Kenya  
Open source library scikit-learn  
GitHub repository link nairobi-2019-scikit-sprint  
List of Issues project list  
     
Facilitator Adrin Jalali  
Organizer(s) Reshama Shaikh / Mariam Haji / Andreas Mueller  
Teaching Assistants Cate Gitau / Ben Mainye / Muoki Caleb / Isaac Mutie  
Sponsor: venue iHub  
Sponsor: food Microsoft 4Afrika  
Sponsor: books O’Reilly Media  
     
PULL REQUESTS (PRs)    
PRs [MRG] at sprint (a) 2  
PRs [MRG] post-sprint (w/o follow-up) (b) 15  
PRs [MRG] post-sprint (w/follow-up) (c) 2  
TOTAL PRs MERGED (d) 19  
     
PRs open (e) 0  
PRs closed (by merged PRs) (f) 7  
PRs returned to issue pool (g) 1  
     
Attendees: Initial Registrations 102  
Attendees: Participated ~ 40  
Attendee List 2019  
     
Event posting 2019 meetup event  
Sprint Application application form (closed)  
Post-sprint Feedback Survey survey form (closed)  
     
Cost of Sprint ~ $3500 (space + books not included; they were donated)  
     
Blog 1: by Mariam Haji Highlights From The 2019 Nairobi Wimlds Scikit Sprint  
Blog 2: by Adrin Jalali scikit-learn sprint at Nairobi, Kenya  
Blog 3: by Reshama Shaikh Why Not Nairobi? (coming soon!)  
Twitter Moment WiMLDS Nairobi 2019 Scikit-Learn Sprint  

Notes

  • (a) this represents the number of pull requests which were merged during the sprint day.
  • (b) this represents the number of pull requests that were merged after the sprint. The participants completed these PRs on their own initiative.
  • (c) this represents the number of pull requests that were merged post-sprint after follow-up by the sprint organizer.
  • (d) this represents the total number of PRs that were merged, which provides one dimension of the impact of the sprint.
  • (e) this number represents the number of PRs from the sprint which are still open.
  • (f) this number represents PRs that were opened, but then closed by the successful merge of another PR.
  • (g) this represents issues that were deemed too advanced for beginner participants and returned to the issue pool.

Impact Summary for 2019

A total of 19 PRs were merged in:

  • 2 PRs were merged at the sprint
  • 15 PRs were merged post-sprint without any follow-up
  • 2 PRs were merged with follow-up
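
The tally above can be checked with a few lines of Python. This is only an illustrative sketch: the figures are copied from the report table, the variable names are made up for this example, and the participant count is the report's approximate figure of 40.

```python
# Figures copied from the report table; variable names are illustrative only.
merged_at_sprint = 2             # (a) merged during the sprint day
merged_post_no_follow_up = 15    # (b) merged post-sprint without follow-up
merged_post_with_follow_up = 2   # (c) merged post-sprint with follow-up

# (d) total merged PRs is the sum of the three categories above.
total_merged = (
    merged_at_sprint
    + merged_post_no_follow_up
    + merged_post_with_follow_up
)

registrations = 102   # initial registrations
participants = 40     # approximate attendance, per the report

# Share of registrants who actually took part in the sprint.
turnout_rate = participants / registrations

print(f"Total PRs merged: {total_merged}")
print(f"Turnout: {turnout_rate:.0%}")
```

The turnout figure is only as precise as the approximate attendance count it is based on.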

All outstanding PRs from the sprint were merged within 5 weeks, well before the suggested 60-day deadline.

One attendee traveled 8 hours just to attend the sprint.

Microsoft 4Afrika has been supportive of Nairobi WiMLDS and scikit-learn and would like to continue supporting the sprint events in the future.

Contributor Travel

Adrin Jalali is a core contributor to the scikit-learn library and is based in Berlin, Germany. Because there are no core contributors near Nairobi, or anywhere in Africa, someone had to travel to Nairobi to facilitate the sprint. The flight was about 11 hours.

An International Collaboration

Four of us organized the sprint from three different countries (New York, NY, USA / Berlin, Germany / Nairobi, Kenya). It required lots of email communication, prompt replies and many video calls to ensure the event ran smoothly. Running a successful event so far away takes a dedicated team of organizers, and we were fortunate to have one.

The Nairobi Sprint

On the Monday after the Saturday event, I had a video call with Mariam Haji (Nairobi organizer) to discuss how the sprint went. She shared that the attendees loved the event and wanted to have regular sprints. She also shared that the attendees stayed until the very end, at 6pm on Saturday. The attendees had a great time, as did the organizers and the contributor.

Non-measurable Impact

Aside from the number of PRs that were merged, there is non-quantifiable impact of the open source sprint. Some examples include:

  • learning to set up a virtual environment
  • using Git (fork, clone, branch, fetching another’s PR)
  • introduction to tests such as: flake8 (linting, formatting), pytest, “continuous integration”
  • navigating through the codebase structure of scikit-learn
  • digging into functions, learning about errors
  • learning about unit tests
  • interacting with contributors on GitHub
  • learning, in general
  • networking
  • building confidence (making a dent in “imposter syndrome”)
  • having fun

Sprint Feedback Form

Mariam Haji, Nairobi WiMLDS organizer, created a Sprint Feedback Form. Of the approximately 40 attendees, 13 returned the form. This is a summary:

  • all respondents rated the sprint experience favorably
  • all respondents were highly likely to attend a future sprint
  • almost all respondents learned of the sprint via Meetup platform
  • all respondents indicated they would continue to contribute to scikit-learn

These were the respondents' favorite parts of the sprint:

These are areas for improvement:

Here is additional feedback:

Adrin Jalali shares:

It’s so gratifying to see the joy in people’s face when they open their first PR and receive feedback from the core developers. I always love to see their faces when they start touching the files of the package, which they’ve used till then as a user. I still remember the feeling of sending a patch for an open source project for the first time (before git was invented), and when I see those same feelings in new contributors faces, it pushes all the exhaustion out of my being.

I’d say it was a very successful sprint, specially since quite a few of the contributors we had there, are still contributing and opening new PRs even afterwards. To me, that’s what a sprint for new contributors is for, to enable them and break all the barriers they feel having in front of them, and have them motivated enough that they continue contributing afterwards.

Lessons Learned

  • Need another core contributor at sprint: this would be very helpful as in New York City we often have 2 or 3 contributors. Travel costs restrict the participation of more contributors
  • Need one or more core contributors: to review PRs remotely during the sprint
  • Introduce an easy fix: at start of sprint
  • Create a tutorial: for Contributing, Git, and GitHub
  • Set up virtual environments prior to sprint day: the wifi was slow, which delayed some attendees from participating effectively in the sprint
  • Improve sklearn documentation: for setting up environments

    There were people who were still trying to setup their environment and get the package to compile still at the end of the day, and that to me is a strong hint that something could be improved.
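
One way attendees could confirm their setup before sprint day is a small self-check script. The sketch below is an assumption, not part of the sprint materials: the function name `check_environment` and the package list are hypothetical, and the check only detects environments created with `venv` or `virtualenv` (a conda environment would not be reported as active). It does not verify that scikit-learn was built from source.

```python
import importlib.util
import sys


def check_environment(packages=("sklearn", "pytest", "flake8")):
    """Return a dict describing whether this interpreter looks sprint-ready.

    The default package list is a hypothetical example, not an official
    requirement list.
    """
    # Inside a venv/virtualenv, sys.prefix points at the environment and
    # differs from sys.base_prefix, which points at the base installation.
    in_venv = sys.prefix != sys.base_prefix
    # find_spec returns None when a top-level package cannot be located.
    found = {
        name: importlib.util.find_spec(name) is not None
        for name in packages
    }
    return {"in_virtualenv": in_venv, "packages": found}


if __name__ == "__main__":
    report = check_environment()
    print("virtual environment active:", report["in_virtualenv"])
    for name, ok in report["packages"].items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Running a check like this a few days ahead would surface missing packages while there is still time to download them on a faster connection.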

Future Plans

  • 2 day event: Because the contributor has traveled far and PRs take some time to work through, it is worth considering having a 2 day sprint.
  • Remote participation: There were numerous requests to join remotely. Our plan is to have Andreas Mueller record the first couple of hours of the August 2019 sprint and share the recording.
  • Hold pre-sprint evening workshop: one week prior so participants can set up their virtual environments and save time during the sprint day.

Planning for 2020

This dinner is from the Friday night kick-off meeting, prior to the Saturday sprint. People were excitedly discussing plans for the 2020 sprint!

Sprint Expenses

The total cost of the sprint was approximately $3500 USD. It includes the cost of food which was covered by Microsoft 4Afrika. Other costs included shipping the books from the USA to Nairobi and travel for the contributor from Europe to Nairobi.

This cost does not include space which was provided by iHub and 2 dozen copies of books which were provided by O’Reilly Media.

Call for Action

We are looking for sponsorship for 2020 WiMLDS scikit-learn sprints in various countries.

Acknowledgments

Special thanks to the following reviewers for their speedy review, assistance and patience:

Sponsors



Addendum

  • [no addendums or updates at the time of publication]
