Scikit-learn

Scikit-learn (formerly known as scikits.learn) is an open-source machine learning library for Python. It features various preprocessing, classification, regression, and clustering algorithms. What’s most interesting about this library is that almost all of these algorithms are exposed through the same API. This API has become so familiar that other machine learning projects, such as Keras, have chosen to adopt the same API design. This chapter contains a collection of views and perspectives on Scikit-learn’s architecture.
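As a minimal sketch of what this shared API looks like in practice, the snippet below trains two unrelated classifiers through the same `fit` and `score` calls. The dataset (`load_iris`), the two estimators, and the split parameters are our own choices for illustration, and the snippet assumes scikit-learn is installed.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative data: the iris dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Two very different algorithms, driven by identical calls.
estimators = [
    LogisticRegression(max_iter=1000),
    DecisionTreeClassifier(random_state=0),
]

scores = {}
for estimator in estimators:
    estimator.fit(X_train, y_train)  # learn from the training split
    accuracy = estimator.score(X_test, y_test)  # mean accuracy on held-out data
    scores[type(estimator).__name__] = accuracy
    print(f"{type(estimator).__name__}: {accuracy:.2f}")
```

Swapping in another classifier should only require changing the entries in `estimators`; the loop itself stays the same.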

The project was created by David Cournapeau as a Google Summer of Code project. Its name comes from SciKit, short for SciPy Toolkit, because it builds on top of the SciPy library. In 2010, Fabian Pedregosa, Gael Varoquaux, Alexandre Gramfort, and Vincent Michel took over leadership of the project and made the first public release that same year. Today, the project is one of the most well-known general-purpose open-source machine learning libraries.

About us

We are four master’s students from Delft University of Technology, with backgrounds in Computer Science.

  • Toon de Boer
  • Thomas Bos
  • Jordi Smit
  • Daniël van Gelder

Scikit-learn, what does it want to be?

tags: Scikit-learn Python Software Architecture Machine Learning Developers Roadmap Stakeholders

In this first blog post, we examine the following aspects of scikit-learn. First, we describe what scikit-learn is and what it is capable of doing. Then we describe which stakeholders are involved in the project, and finally we lay out a roadmap of the future development of scikit-learn. This essay thus gives the necessary context for anyone who wants to study the architecture behind scikit-learn.

From Vision to Architecture

tags: Scikit-learn Python Software Architecture Machine Learning Developers Roadmap Stakeholders

Scikit-learn’s main goal is to make machine learning as simple as possible to use for non-experts while remaining as efficient as possible. To do this, scikit-learn has to hide the complexities of, and the variations between, the different machine learning algorithms. In this blog post, we explore how these requirements have resulted in the current architectural style and which trade-offs have been made to achieve it.
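To make the idea of hiding variation concrete, here is a toy sketch of how a shared `fit`/`predict` contract lets generic code ignore the differences between algorithms. The classes `MajorityClassifier` and `NearestNeighborClassifier` and the helper `accuracy` are hypothetical names written for this example; they are not scikit-learn code and only mimic its conventions.

```python
from collections import Counter


class MajorityClassifier:
    """Toy model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self  # returning self allows chaining, as scikit-learn estimators do

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestNeighborClassifier:
    """Toy model: predicts the label of the closest training point."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def closest_label(row):
            # Squared Euclidean distance to every stored training point.
            distances = [
                sum((a - b) ** 2 for a, b in zip(row, seen)) for seen in self.X_
            ]
            return self.y_[distances.index(min(distances))]

        return [closest_label(row) for row in X]


def accuracy(model, X_train, y_train, X_test, y_test):
    """Generic helper: works with any object that offers fit and predict."""
    predictions = model.fit(X_train, y_train).predict(X_test)
    hits = sum(p == t for p, t in zip(predictions, y_test))
    return hits / len(y_test)


# Tiny hand-made dataset, purely for illustration.
X_train = [[0.0], [0.1], [0.2], [1.0], [1.1]]
y_train = ["a", "a", "a", "b", "b"]
X_test = [[0.05], [1.05]]
y_test = ["a", "b"]

for model in (MajorityClassifier(), NearestNeighborClassifier()):
    print(type(model).__name__, accuracy(model, X_train, y_train, X_test, y_test))
```

The helper never inspects which model it was given. That is the kind of decoupling a uniform interface buys, at the cost of forcing every algorithm into the same shape.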

Scikit-learn’s plan to safeguard its quality

tags: Scikit-learn Python Software Architecture Machine Learning Developers Code Quality Testing Coverage Contributions

In our previous blog posts, we examined the vision and goals behind scikit-learn and discussed how these elements have been combined into the underlying software architecture. We considered several views and perspectives in which the software product operates and the trade-offs that were made to balance functional requirements with non-functional requirements.

How does scikit-learn balance usability with variability?

tags: Scikit-learn Python Software Architecture Machine Learning Developers Configurability Variability Usability Features

So far, we have covered many aspects of scikit-learn from a software architectural perspective. In our first essay, we described the vision behind scikit-learn. In the second essay, we described how this vision translates to an architecture. Then, in our last essay, we investigated how scikit-learn safeguards its quality as an open-source system.