The WiMLDS Scikit-learn Open Source Sprints
About WiMLDS
Women in Machine Learning and Data Science (WiMLDS) is a non-profit 501(c)(3) based in New York, New York.
The mission of WiMLDS is to support and promote women and gender minorities who are practicing, studying or are interested in the fields of machine learning and data science. WiMLDS currently has 72 chapters in 31 countries with 37000+ members (statistics updated 11-Aug-2019). The 2019 Q1 Report was released on April 1, 2019.
About Scikit-learn
Scikit-learn is the machine learning library for Python. It is open source, which means all of its code is available on GitHub as scikit-learn. The statistics show:
- Stars on GitHub: 35000+
- Forks on GitHub: 17000+
- Used by this many other repositories on GitHub: 55000+
- Users: hundreds of thousands
History of the Scikit-learn Python Library
This project was started in 2007 as a Google Summer of Code project by David Cournapeau. Later that year, Matthieu Brucher started work on this project as part of his thesis.
In 2010, Fabian Pedregosa, Gael Varoquaux, Alexandre Gramfort and Vincent Michel of INRIA took leadership of the project and made the first public release on February 1, 2010. Since then, several releases have appeared following a ~3 month cycle, and a thriving international community has been leading the development.
Background Statistics
This table shows the statistics of women in tech and other sub-groups.
Open Source Background
A 2013 study found that only 11% of open source contributors were women. A 2016 gender-inferred analysis of the top 100 contributors for various programming languages found that just 2% of contributors to Python libraries on GitHub were women. For scikit-learn, the machine learning library for Python, only 1 woman is among the top 100 contributors. To address this gender imbalance, Andreas Mueller, a core contributor, initiated the first open source sprint in New York City with the local chapter of WiMLDS in early 2017. The first sprint was held in March 2017 and the second in September 2018. A report summarizes the impact of the two events: Impact Report for WiMLDS Scikit Sprint
2019 Open Source Sprints
For 2019, WiMLDS, in collaboration with Andreas Mueller, is organizing 3 more scikit-learn open source sprints at these locations:
- Nairobi, Kenya (June 2019)
- New York, New York (August 2019)
- San Francisco, California (November 2019)
Nairobi, Kenya is the 4th largest WiMLDS meetup chapter, with 2000+ members, which is one reason it was chosen for an open source sprint. In addition, past sprints for the scikit-learn library have been held only in the United States and Europe. The goal of this sprint is to extend the accessibility of open source events beyond these two regions.
Key References
- WiMLDS Open Source Sprint website
- WiMLDS Scikit-learn Sprint Impact Report for 2017 and 2018
- Interview with Andreas Mueller
About Reshama Shaikh
Reshama Shaikh is a Board Member of WiMLDS and Big Apple Py. Both organizations support and promote women in the Python community and open source. She organized the 2017 and 2018 sprints and is the main point of contact for organizing the three sprints in 2019.