The 2nd Annual NYC WiMLDS / Scikit Sprint was held on Saturday, September 29 at Stack Exchange in New York City. This is our repository for all items related to the 2018 NYC WiMLDS Scikit Sprint.
- Thank you to Stack Exchange for hosting WiMLDS. They are also hiring! Check out their open positions.
- Andreas Mueller, core contributor to scikit-learn, and his team of TAs, for facilitating this event.
- WiMLDS Team
History of Scikit-Learn Python Library
In 2010 Fabian Pedregosa, Gael Varoquaux, Alexandre Gramfort and Vincent Michel of INRIA took leadership of the project and made the first public release, February the 1st 2010. Since then, several releases have appeared following a ~3 month cycle, and a thriving international community has been leading the development.
This is how Andreas Mueller became involved in scikit-learn:
While working on my Ph.D. in computer vision and learning, the scikit-learn library became an essential part of my toolkit. I was an ardent user of the library, and I wanted to partake in its advancement. My initial participation in open source began in 2011 at the NIPS conference in Granada, Spain, where I had attended a scikit-learn sprint. The scikit-learn release manager at the time had to leave, and the project leads asked me to become release manager; that’s how it all got started.
Read more in this interview with Andreas Mueller.
Advertising the Sprint
A few weeks before the sprint, we still had 30 spots open. The tweet below reached so many scikit-learn users that every remaining spot was filled.
We even had two attendees who live in Europe. They were visiting NYC at the time and joined the event on Saturday! I worked with one of them, Alice, who was visiting all the way from Paris. I was surprised and impressed that they chose to spend their Saturday in New York City at the sprint.
Andy gave away signed copies of his book, Introduction to Machine Learning with Python.
Thanks @amuellerml for giving copies of your book Introduction to Machine Learning with Python for attendees of the scikit-learn sprint today! #SciKitSprint #MachineLearning pic.twitter.com/Avdo39sQQZ— Theodora Hinkle (@acornthea) September 29, 2018
Testing Our Pull Request
After working on the issue and committing changes to our branch, we tested our pull request with this code:
It is always exciting to see (most of) our code tests passing.
Pull Requests Summary
It looks like 24 pull requests were submitted. Here’s a list of some of the pull requests.
Now I know why it’s called a sprint!
It was more challenging than I had anticipated. We spent most of the day going through the Python code base, trying to understand the maze of functions and how they are all related. We explored the Python files to determine where to place the assert error for the issue we picked.
This weekend I contributed to an open source project @scikit_learn @WiMLDS_NYC. It was more challenging than I had expected but I learned so much! Check out @reshamas blog post to get an idea of what the day looked like. Thanks to @StackOverflow for the amazing space 😍 https://t.co/zcmN3imRnE— Melissa Ferrari (@Ferrari3Melissa) October 1, 2018
We Made It!
Saturday well spent co-organizing another great @WiMLDS_NYC event! Scikit-learn sprint to contribute to open source with @amuellerml! And he brought his awesome @OReillyMedia book for signing! Thank you @StackExchange for hosting us! #python #opensource #ScikitSprint #ml #WiMLDS pic.twitter.com/C5KFEm0kBz— Noemi Derzsy (@NoemiDerzsy) September 30, 2018
It looks like we still have some work to do on the pull request submitted by my sprint partner, Alice, and me.
Participating in a Scikit Sprint
If you would like to participate in a scikit sprint, based on the history below, these look like the three best cities:
- Paris, France: where the scikit-learn library was developed and where the majority of contributors reside
- New York, New York: where Andreas Mueller resides
- Austin, Texas: where the annual SciPy conference takes place
Listing of Scikit Sprints
There is a list on the scikit-learn wiki.
- SciPy: Austin (Jul)
- EuroSciPy: Cambridge (Aug)
- INRIA, Criteo, La paillasse, Tinyclues: Paris (Jul)
- Cloudera, SF (Feb)
- Paris (Jul)
- SciPy: Austin (Jun)
- EuroSciPy: Bruxelles (Aug)
- PyCon France (Jul)
- SciPy: Austin (Jul)
- Paris (Sep)
- Paris (Jun)
- Paris (Mar)
- first release of scikit-learn (Feb)