There is more than meets the eye. Data is everywhere. There are patterns in every aspect of life, but sometimes we cannot see them, we cannot understand them and, even worse, we cannot retrieve any value from them! Bokeh is one of the many tools that aim to solve this problem: it connects the dots and lessens the distance between what we see and what we understand. In this essay we will look into Bokeh and what motivated its creation, development and vision!
Behind what you can plot
Data is everywhere! Everyone knows that. Nowadays society thrives and lives on data. It’s not surprising, then, that people try to use such data to reveal hidden patterns within. That is where data visualization comes into play: to lessen the distance between what we see and understand.
Python has lately been the dominant programming language for data processing, as many libraries - such as Pandas, NumPy and SciPy - have been optimized to work with large datasets, and users would like a visualisation tool in Python that can be easily connected to the processed data. Users usually have a further goal: to share their findings so that the audience can easily access and interact with the visualization. Bokeh brings all of these requirements together by providing interactive plotting in web browsers through Python in an easy and efficient way.
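To make this concrete, here is a minimal sketch of that workflow: a small pandas DataFrame is plotted with Bokeh and rendered to a standalone HTML page that can be opened in any modern browser. The data, column names and titles are hypothetical and only serve as an illustration; they are not taken from Bokeh's own examples.

```python
import pandas as pd
from bokeh.embed import file_html
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.resources import CDN

# Hypothetical data standing in for the result of a pandas processing step.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2, 5, 4, 8, 7]})

# A ColumnDataSource is the bridge between the DataFrame and the plot.
source = ColumnDataSource(df)

# Pan, zoom and reset tools make the figure interactive in the browser.
plot = figure(title="Processed data", tools="pan,wheel_zoom,reset")
plot.line(x="x", y="y", source=source)
plot.scatter(x="x", y="y", source=source, size=8)

# Render a self-contained HTML document that loads BokehJS from the CDN.
html = file_html(plot, CDN, "Processed data")
```

The resulting `html` string can be written to a file and shared; the recipient only needs a web browser to explore the plot.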
The hardest plot
Every piece of software has the same end goal: to be used and understood. However, every user is different, and with this difference emerges an important point: distinct users may think about the software in distinct ways. With that in mind, in order to develop successful software, it is important to explore users’ perceptions of the system, anticipating their wants and needs.
When it comes to Bokeh, two types of users stand out: data enthusiasts and data specialists. While both types of end users expect Bokeh to act as a catalyst in plotting their data, we can draw a line between the two groups when we look at how they perceive and use the system. Data enthusiasts don’t use Bokeh professionally. Sometimes, they just want to play a bit with the data and share their incredible plots. These users value ease and speed when using the system. Data specialists, on the other hand, may depend on it professionally. They require efficiency, productivity, complexity and completeness in order to retrieve the hidden value of the data and handle their custom use-cases.
This dissimilarity imposes a challenge on Bokeh: the system has to be viewed as easy to work with while having a powerful and complete interface. The real question comes up: Is Bokeh able to satisfy both mental models? In the next subsections we will look into that.
The reasons why you should plot with Bokeh
Does Bokeh generate value? Absolutely! Bokeh is an interactive visualisation library for modern web browsers, which can be described as flexible, interactive, shareable, productive, powerful and, last but not least, open source. These keywords define what Bokeh is and what it offers. Indeed, this library allows straightforward usage while giving the means to build cutting-edge specialised use-cases. This simple but important characteristic highlights the strong flexibility of the software: it is possible to start small and eventually build up to advanced tools and more complex and sophisticated features. It is clear that Bokeh aims to be for everyone: whether you are a new user or an experienced one, the only requirement is knowing how to use Python. The fact that Bokeh is conceived as a Python framework means that users can resort to the PyData tools they are already familiar with, without needing to learn new tools and languages.
However, Bokeh offers value not only in what it does - enhancing the understanding of the data - but also in how it does so. For example, data scientists and developers can leverage Bokeh’s capabilities to interact with the published results, to probe “what if” scenarios, to drill down into the details of the data and also to visualize real-time data.
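As a rough illustration of two of these capabilities, the sketch below adds a hover tool for drilling down into individual points and appends new readings to the plot's data source, as one would for real-time data. The column names, the readings and the `push_reading` helper are hypothetical and chosen only for this example.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Hypothetical sensor readings; in practice these would come from a live feed.
source = ColumnDataSource(data={"t": [0, 1, 2], "value": [10.0, 12.5, 11.0]})

plot = figure(title="Live readings")
plot.line(x="t", y="value", source=source)

# The hover tool shows the underlying values of the point under the cursor.
plot.add_tools(HoverTool(tooltips=[("t", "@t"), ("value", "@value")]))


def push_reading(t, value, keep_last=3):
    """Append one reading and keep only the most recent `keep_last` points."""
    source.stream({"t": [t], "value": [value]}, rollover=keep_last)


# Simulate the arrival of a new reading.
push_reading(3, 13.2)
```

In a Bokeh server application, a function like `push_reading` would typically be scheduled as a periodic callback, so that every connected browser sees the plot update as new data arrives.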
Software is made for and by people, and these people can be categorized into different groups of stakeholders.
‘What is a “stakeholder”? Team members, system engineers, architects, and testers all have a stake in creating sound system form. There may be many more: use your imagination.’ [1]
In their book [1], Coplien and Bjørnvig define stakeholders by considering two distinct parts of the system: what it is and what it does. The first part, related to the structure of the business over time, has domain experts, system architects and business people as its principal stakeholders. The second part, which relates to the user’s view of the services provided by the system, has end users, user-experience people, interface designers and requirements people as its principal stakeholders. Furthermore, they divide the stakeholders into five major areas, including:
- the end users,
- the business,
- domain experts,
- and developers.
To further our understanding, we looked at a wide range of sources. In addition, we go into deeper detail and divide each major area defined in [1] into sub-areas, in an attempt to refine the analysis of Bokeh’s stakeholders. This analysis can be found in the following tables.
Previous sections showed that Bokeh is undoubtedly a library with powerful characteristics. Consequently, Bokeh is used in industry as a tool to support different projects. Even Elon Musk tweets about it! In fact, not only Data Scientists, but also Engineers, Researchers and other end users depend on Bokeh for their day-to-day endeavours.
The enthusiasts are the end users that do not use Bokeh professionally but rather in their free-time activities, mainly out of curiosity. Examples of this kind of end user are open source community members, students and enthusiasts in general.
Bokeh is a sponsored project of NumFOCUS, a nonprofit charity in the United States. NumFOCUS provides Bokeh with fiscal, legal and administrative support to help ensure the health and sustainability of the project (defined as an Assessor in [2]). Furthermore, Bokeh’s donations are managed by NumFOCUS [3]. Bokeh is also sponsored by Anaconda, Nvidia, Quansight and REX Real Estate (defined as Acquirers in [2]).
Bokeh’s core team offers the expertise needed to structure the project. The members of this team are responsible for the ongoing organizational maintenance and direction of Bokeh [4]. As of March 2020, this team was composed of Sarah Bird, Luke Canavan, Carolyn Hulsey, Mateusz Paprocki, Philipp Rudiger and Bryan Van de Ven.
Among the responsibilities of this team we have:
- Reviewing and merging Pull Requests from other contributors
- Making and implementing decisions about project infrastructure
- Protecting and managing confidential project information such as service passwords
- Handling all project financial matters
- Addressing reports related to the Community Code
Developers are the prime oracles of technical feasibility [1]. In Bokeh, the development team comprises active contributors.
Maintainers are members of the Development team whose job is to maintain and enforce.
Here we include testers, designers, etc.
Stakeholder engagement is essential when it comes to achieving a coherent, well-designed and functional system. Developer-related communication takes place mostly through GitHub and Zulip. On GitHub, developers interact through Issues and Pull Requests, whereas on Zulip developers interact in real time, discussing different topics related to the project.
End-user engagement is crucial as well. As Coplien and Bjørnvig say,
“[…] good end user engagement changes end user expectations.” [1]
This communication is possible with the help of a discussion website, Discourse. Here users and other community members can showcase their interesting projects and awesome works, receive general support or even discuss Bokeh’s development. In addition, Twitter is also used as a platform for engagement, overall discussion and showcasing.
Variables and axis of Bokeh
As we know by now, data is taking over the world! And since progress does not stop, data-processing tools will continue to grow even more. We talked about how Python, with the help of libraries such as NumPy and Pandas, is one of the most used data processing tools, and we can only expect it to improve.
The following figure synthesizes the context in which Bokeh lives.
Despite its strong presence in the data world, Bokeh still has room to grow. Like many visualization libraries, it gets saturated with extremely large datasets, and that may be a potential area for improvement.
Bokeh and beyond
Progress should not be stopped. Bokeh is software that is currently (as of March 2020) in continuous growth and development: just look at the pull requests!
This library allows you to see more and to see better. The only limits of what you can do with Bokeh are in the realm of human imagination: if you can imagine it, you can probably plot it. [5]
Moreover, Bokeh is open source, so you can help it grow even further and beyond. The future lies right ahead. The following image synthesizes Bokeh’s roadmap.
But the future is not only about software: people are an important variable in the equation, since software is made by and for them. In this regard, we can mention the willingness to improve this project’s documentation in an attempt to help users find the appropriate documentation and examples.
Additionally, some hints on the imminent future direction of the software can be found in the open issue list. One issue that stands out discusses the possibility that a future version of Bokeh might stop supporting legacy web browsers. This change could ultimately empower the library, while users stuck with obsolete web browsers will still be able to use previous versions of Bokeh.
Now that you can plot with Bokeh, what will be your next graph?