This report focuses on the summary, impact and lessons learned of the Nairobi WiMLDS scikit-learn sprint. More background information is available in the following articles:
- About the WiMLDS scikit-learn sprints
- WiMLDS Open Source Sprint Events (website)
- Sprint Preparation
- Highlights from the 2018 WiMLDS NYC Scikit Sprint
- Impact Report for 2017-2018 NYC WiMLDS scikit-learn Sprints
- Interview with Andreas Mueller
- List of Scikit-learn Sprints
Sprint Participants’ Background
To ensure that attendees had some knowledge of Python and scikit-learn, a brief application form was used. Attendees did not have to be experienced Git users, but some experience was helpful.
Most attendees at these sprints were new to open source. The scikit-learn facilitator, Adrin Jalali, identified issues that were labeled “easy” or “good first issue” prior to the event so participants could review and become familiar with them. Prep work for the event included reviewing the Contributing documentation.
About 40 people attended the sprint. The group was evenly split by gender: 50% women and 50% men.
Impact Report for WiMLDS Scikit-learn Sprint
| Item | Details |
|---|---|
| Open source library | scikit-learn |
| GitHub repository link | nairobi-2019-scikit-sprint |
| List of Issues | project list |
| Organizer(s) | Reshama Shaikh / Mariam Haji / Andreas Mueller |
| Teaching Assistants | Cate Gitau / Ben Mainye / Muoki Caleb / Isaac Mutie |
| Sponsor: food | Microsoft 4Afrika |
| Sponsor: books | O’Reilly Media |
| PULL REQUESTS (PRs) | |
| PRs [MRG] at sprint (a) | 2 |
| PRs [MRG] post-sprint (w/o follow-up) (b) | 15 |
| PRs [MRG] post-sprint (w/follow-up) (c) | 2 |
| TOTAL PRs MERGED (d) | 19 |
| PRs open (e) | 0 |
| PRs closed (by merged PRs) (f) | 7 |
| PRs returned to issue pool (g) | 1 |
| Attendees: Initial Registrations | 102 |
| Attendees: Participated | ~ 40 |
| Event posting | 2019 meetup event |
| Sprint Application | application form (closed) |
| Post-sprint Feedback Survey | survey form (closed) |
| Cost of Sprint | ~ $3500 (space + books not included; they were donated) |
| Blog 1: by Mariam Haji | Highlights From The 2019 Nairobi Wimlds Scikit Sprint |
| Blog 2: by Adrin Jalali | scikit-learn sprint at Nairobi, Kenya |
| Blog 3: by Reshama Shaikh | Why Not Nairobi? (coming soon!) |
| Twitter Moment | WiMLDS Nairobi 2019 Scikit-Learn Sprint |
- (a) the number of pull requests that were merged during the sprint day.
- (b) the number of pull requests that were merged after the sprint. Participants followed through on these PRs on their own initiative.
- (c) the number of PRs that were merged post-sprint after follow-up by the sprint organizer.
- (d) the total number of PRs that were merged, which provides one dimension of the impact of the sprint.
- (e) the number of PRs from the sprint that are still open.
- (f) the number of PRs that were opened, but then closed by the successful merge of another PR.
- (g) the number of issues that were deemed too advanced for beginner participants and were returned to the issue pool.
Impact Summary for 2019
A total of 19 PRs were merged in:
- 2 PRs were merged at the sprint
- 15 PRs were merged post-sprint without any follow-up
- 2 PRs were merged with follow-up
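The breakdown above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the counts are copied from the impact table, while the variable names and the `share` helper are hypothetical and are not part of any sprint tooling. The participation figure uses the approximate attendance of 40 reported in the table.

```python
# Counts copied from the impact table; names are illustrative only.
merged_at_sprint = 2             # (a) merged during the sprint day
merged_post_no_follow_up = 15    # (b) merged post-sprint, no follow-up
merged_post_with_follow_up = 2   # (c) merged post-sprint, with follow-up

total_merged = (
    merged_at_sprint
    + merged_post_no_follow_up
    + merged_post_with_follow_up
)

registered = 102    # initial registrations
participated = 40   # approximate ("~ 40" in the table)


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print("Total PRs merged:", total_merged)
print("Merged at sprint (%):", share(merged_at_sprint, total_merged))
print("Merged without follow-up (%):", share(merged_post_no_follow_up, total_merged))
print("Registrants who participated (%):", share(participated, registered))
```

The sum reproduces the reported total of 19 merged PRs, and the shares make it easier to see that most of the merges happened after the sprint day itself.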
All outstanding PRs from the sprint were merged within 5 weeks, well before the suggested 60-day deadline.
One attendee traveled 8 hours just to attend the sprint.
Microsoft 4Afrika has been supportive of Nairobi WiMLDS and scikit-learn and would like to continue supporting the sprint events in the future.
Adrin Jalali is a core contributor to the scikit-learn library and is based in Berlin, Germany. Because there are no core contributors near Nairobi, or anywhere in Africa, someone had to travel to Nairobi to facilitate the sprint. The flight was about 11 hours.
An International Collaboration
Four of us were involved in organizing the sprint, across three different countries (New York, NY, USA / Berlin, Germany / Nairobi, Kenya). It took a lot of email communication, prompt replies and many video calls to ensure the event ran smoothly. Executing a successful event so far away required a dedicated team of organizers, which we were fortunate to have.
The Nairobi Sprint
On the Monday after the Saturday event, I had a video call with Mariam Haji (Nairobi organizer) to discuss how the sprint went. She shared that the attendees loved the event and wanted to have regular sprints. She also shared that the attendees stayed until the very end, at 6pm on Saturday. The attendees had a great time, as did the organizers and the contributor.
Aside from the number of PRs that were merged, the open source sprint also had an impact that is hard to quantify. Some examples include:
- learning to set up a virtual environment
- using Git (fork, clone, branch, fetching another’s PR)
- introduction to tests such as: flake8 (linting, formatting), pytest, “continuous integration”
- navigating through the codebase structure of scikit-learn
- digging into functions, learning about errors
- learning about unit tests
- interacting with contributors on GitHub
- learning, in general
- building confidence (making a dent in “imposter syndrome”)
- having fun
Mariam Haji, Nairobi WiMLDS organizer, created a Sprint Feedback Form. Of the approximately 40 attendees, 13 returned the form. This is a summary:
- all respondents rated the sprint experience favorably
- all respondents were highly likely to attend a future sprint
- almost all respondents learned of the sprint via the Meetup platform
- all respondents indicated they would continue to contribute to scikit-learn
These were the respondents' favorite parts of the sprint:
These are areas for improvement:
Here is additional feedback:
Adrin Jalali shares:
It’s so gratifying to see the joy in people’s faces when they open their first PR and receive feedback from the core developers. I always love to see their faces when they start touching the files of the package, which they’ve used till then as a user. I still remember the feeling of sending a patch for an open source project for the first time (before git was invented), and when I see those same feelings in new contributors’ faces, it pushes all the exhaustion out of my being.
I’d say it was a very successful sprint, especially since quite a few of the contributors we had there are still contributing and opening new PRs even afterwards. To me, that’s what a sprint for new contributors is for: to enable them, to break all the barriers they feel they have in front of them, and to have them motivated enough that they continue contributing afterwards.
- Need another core contributor at sprint: this would be very helpful, as in New York City we often have 2 or 3 contributors. Travel costs restrict the participation of more contributors.
- Need one or more core contributors: to review PRs remotely during the sprint
- Introduce an easy fix: at start of sprint
- Create a tutorial: for Contributing, Git, and GitHub
- Set up virtual environments prior to sprint day: the wifi was slow, which delayed some attendees from participating effectively in the sprint
- sklearn documentation: for setting up environments
There were people who were still trying to set up their environment and get the package to compile at the end of the day, and that to me is a strong hint that something could be improved.
- 2-day event: Because the contributor has traveled far and PRs take some time to work through, it is worth considering having a 2-day sprint.
- Remote participation: There were numerous requests to join remotely. Our plan is to have Andreas Mueller record the first couple of hours of the August 2019 sprint and share the recording.
- Hold pre-sprint evening workshop: one week prior so participants can set up their virtual environments and save time during the sprint day.
Tweet by Reshama Shaikh (@reshamas), July 31, 2019:
Software to record voice + screen:
OBS (Open Broadcaster Software): https://t.co/pCPuTyPmeL
• Free and #opensource software for video recording & live streaming
• Download and start streaming quickly and easily on Windows, Mac or Linux
via @amuellerml pic.twitter.com/GWKMmCzGSq
Planning for 2020
This dinner is from the Friday night kick-off meeting, prior to the Saturday sprint. People were excitedly discussing plans for the 2020 sprint!
Tweet by Reshama Shaikh (@reshamas), June 22, 2019:
This is last night's (Friday) pre-sprint dinner kick-off meeting. @scikit_learn core contributor @adrinjalali flew from Berlin to Nairobi to run this sprint. 👍🏽
The crew started planning the @WiMLDS_Nairobi 2020 sprint already! 😁 @adrinjalali brought stickers too! @NumFOCUS pic.twitter.com/Z9I5dRmwVM
The total cost of the sprint was approximately $3500 USD. This includes the cost of food, which was covered by Microsoft 4Afrika. Other costs included shipping the books from the USA to Nairobi and travel for the contributor from Europe to Nairobi.
This cost does not include the space, which was provided by iHub, or the 2 dozen books, which were provided by O’Reilly Media.
Call for Action
We are looking for sponsorship for 2020 WiMLDS scikit-learn sprints in various countries.
Special thanks to the following reviewers for their speedy review, assistance and patience:
Tweet by Reshama Shaikh (@reshamas), June 23, 2019:
.@Microsoft Pumps $100M into African AI Centres
• sites to open: Nairobi, Kenya & Lagos, Nigeria in 2019
• goal: 100 developers for AI, ML, mixed reality (500 by 2023)
• Africa has the biggest potential for revenue growth
• MSF 1st mover in region
https://t.co/WLuT7tk75n
- [no addendums or updates at the time of publication]