Plotting Bokeh, an Analysis of its Architectural Variables

In order to architect, one must envision. In the last essay we discussed the vision underlying our project. Now we go into further detail in our architectural analysis of Bokeh. We shed some light on the different architectural views that describe Bokeh; we address design choices and patterns applied to this project; and finally we look into its non-functional properties.

A bird’s eye view of Bokeh’s architecture

It is fair to assume that while using Bokeh one may think “wow, what an amazing graph I just plotted!”. Indeed, Bokeh allows for the creative usage of data with the end goal of promoting insightful thoughts when one looks at it. However, there are also other perspectives from which you can look at Bokeh. The different stakeholders involved in this project hold different viewpoints on the system, all of which must be considered. To do that we resort to the 4+1 view model of architecture [1].

The logical view is one of the most relevant views of the system since it relates to the functionality provided to the end-users. As we discussed in the first essay, Bokeh is a visualization library that caters to a broad spectrum of users, offering different functionalities with the goal of serving everyone, from new users to experienced ones.

According to Kruchten’s definition, the process view takes into account some non-functional requirements. Moreover, it addresses concurrency and distribution, system integrity, and fault tolerance, focusing on the run-time behavior of the system [1]. From Bokeh’s perspective, the run-time view and non-functional requirements are essential… and do not worry! We will address these later, your curiosity shall be satisfied!

The development view describes the system from the viewpoint of software developers and testers. Thus, this view is concerned with the architecture that supports the development process, in order to “ensure that there is order rather than chaos when it comes to the organization of the system’s code.” [2]

The fact that Bokeh is an open source project with more than 92.8k lines of Python and JavaScript code organized in a wide structure (see the codeline organization figure) shows that the analysis of this view should not be underestimated, as you will see later in the essay.

The fourth viewpoint is the physical view. It concerns aspects of the system that are important after it has been tested and is ready to be deployed [2]. This view also describes the mapping of the software onto the hardware, which in Bokeh’s case is not so relevant, since Bokeh and BokehJS work on top of abstractions such as operating systems and/or browsers. Nevertheless, this view can be exploited to highlight some interesting aspects such as technology compatibility and the use of third-party software, as we will explain in the dedicated section.

Now, what about the +1? This refers to a few selected use cases, or scenarios, which are used to check if the other 4 views work in harmony. This viewpoint is relevant for Bokeh as it is for any other software, since it helps with the validation and illustration of software design [1].

Although not considered by Kruchten, there is an extension to the logical view that can be relevant: in a recent AMA given by Grady Booch, the growing interest in data as a central point in a system’s design was highlighted. As data becomes more and more important in our society, a system’s architectural elements should reflect the need for exchanging, understanding and representing it. In the context of our project this makes even more sense, since Bokeh lives on data.

Patterns in Bokeh’s architecture

“The purpose of a software pattern is to share a proven, widely applicable solution to a particular design problem in a standard form that allows it to be easily reused.” [2]

Bokeh is no different from other widely used software systems. It also follows some architectural patterns. Bokeh’s architecture facilitates an easy manipulation of the components and configuration of a plot from server-side code in Python or other languages. Furthermore, BokehJS can also be used directly as a standalone JavaScript library, with plot data embedded directly into the page, retrieved via AJAX calls, or supplied by a separate Bokeh Plot server [3].

We can easily identify the client/server design pattern:

Bokeh Server-Browser interaction

A Bokeh server (left) uses Application code to create Bokeh Documents. Every new connection from a browser (right) results in the server creating a new document, just for that session. The capability to synchronize between Python and the browser is the main purpose of this server.
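To make the "one document per connection" idea concrete, here is a minimal sketch in plain Python. It does not use Bokeh itself: Document, ToyServer and application are hypothetical stand-ins we made up for illustration, assuming only what the paragraph above states (application code is run again for each new connection).

```python
import itertools
from typing import Callable, Dict, List


class Document:
    """Toy stand-in for a per-session container of plot objects."""

    def __init__(self) -> None:
        self.roots: List[str] = []


class ToyServer:
    """Runs the application code once per connection (hypothetical class)."""

    def __init__(self, application: Callable[[Document], None]) -> None:
        self._application = application
        self._sessions: Dict[str, Document] = {}
        self._counter = itertools.count(1)

    def connect(self) -> str:
        """Simulate a browser connecting: build a fresh Document for it."""
        session_id = f"session-{next(self._counter)}"
        document = Document()
        self._application(document)  # application code fills the new document
        self._sessions[session_id] = document
        return session_id

    def document_for(self, session_id: str) -> Document:
        """Look up the document that belongs to one session."""
        return self._sessions[session_id]


def application(document: Document) -> None:
    """Application code: populates whatever document it is handed."""
    document.roots.append("line plot")
```

Two calls to connect() yield two independent documents, so a change made in one session does not leak into another.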

The design pattern that allows this synchronization is called the observer pattern. We can think of the Bokeh server as the observer that watches all the sessions, here named the subjects. A server is not mandatory, though: Bokeh can also work as a simple and straightforward Python library that displays plots or outputs them to an .html file.
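The observer relationship described above can be sketched in a few lines of plain Python. This is a toy model that follows our framing (sessions as subjects, the server as observer); the class and method names are our own assumptions and are not Bokeh's API.

```python
from typing import Any, Callable, Dict, List, Tuple

# Signature of a change callback: (session name, property name, new value).
Callback = Callable[[str, str, Any], None]


class Session:
    """Subject: holds state and notifies its observers on every change."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._state: Dict[str, Any] = {}
        self._observers: List[Callback] = []

    def subscribe(self, callback: Callback) -> None:
        self._observers.append(callback)

    def set(self, key: str, value: Any) -> None:
        self._state[key] = value
        for callback in self._observers:
            callback(self.name, key, value)


class Server:
    """Observer: records every change made in the sessions it watches."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, str, Any]] = []

    def watch(self, session: Session) -> None:
        session.subscribe(self.on_change)

    def on_change(self, session_name: str, key: str, value: Any) -> None:
        self.log.append((session_name, key, value))
```

The subject never needs to know what the observer does with a change; it only calls the callbacks it was given, which is what keeps the two sides loosely coupled.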

But there is more. Bokeh uses delegation extensively for policies. The delegation pattern is an object-oriented design pattern that allows an object to handle a request by delegating it to another object [4].

The Window class delegates the area computation to the rectangle.
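In code, the Window and Rectangle example from the figure could look like the minimal sketch below. It illustrates the pattern in general and is not taken from Bokeh's code base; the constructor arguments are our own assumption.

```python
class Rectangle:
    """The delegate: knows how to compute an area."""

    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


class Window:
    """The delegator: answers area() by forwarding to its Rectangle."""

    def __init__(self, width: float, height: float) -> None:
        self._bounds = Rectangle(width, height)

    def area(self) -> float:
        # No geometry here: the request is handed to the delegate.
        return self._bounds.area()
```

Callers only talk to Window, so the Rectangle could later be swapped for another shape with an area() method without touching them.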

In addition, Bokeh follows the visitor design pattern for processing object graphs [5]. In object-oriented programming and software engineering, the visitor design pattern is a way of separating an algorithm from an object structure on which it operates. The algorithm is converted to a class of its own and visits the object structure where it is executed.
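As a rough illustration of a visitor walking an object graph, consider the sketch below. The Model subclasses and the NameCollector visitor are hypothetical names chosen for this example; Bokeh's own traversal code is organized differently.

```python
from typing import List, Set


class Model:
    """A node in the object graph; refs are the objects it points to."""

    def __init__(self, name: str, *refs: "Model") -> None:
        self.name = name
        self.refs = list(refs)

    def accept(self, visitor: "NameCollector") -> None:
        visitor.visit(self)


class Plot(Model):
    pass


class Axis(Model):
    pass


class Range(Model):
    pass


class NameCollector:
    """Visitor: the algorithm lives here, not in the Model classes."""

    def __init__(self) -> None:
        self.names: List[str] = []
        self._seen: Set[int] = set()

    def visit(self, model: Model) -> None:
        if id(model) in self._seen:  # shared references are visited once
            return
        self._seen.add(id(model))
        # Dispatch on the concrete type, falling back to a default handler.
        handler = getattr(self, f"visit_{type(model).__name__.lower()}", self.visit_default)
        handler(model)
        for ref in model.refs:
            ref.accept(self)

    def visit_default(self, model: Model) -> None:
        self.names.append(model.name)

    def visit_plot(self, model: Model) -> None:
        self.names.append(f"plot:{model.name}")
```

A new operation (say, validation or serialization) would be a new visitor class, leaving the Model classes untouched.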

If you're wondering what an object graph is: it is the network formed by a set of objects and the references they hold to one another.

Bokeh follows other patterns as well, such as the websocket protocol being implemented via explicit state transitions and proxies in various places, yet explaining them goes beyond the scope of this blog post.

Developing Bokeh

Now, let's look at the files that define Bokeh.

Codeline organization

Developers encapsulate source files with related code in modules, which are studied in detail along with their dependencies. This way developers know how to work on some parts of the code without unintentionally affecting the rest, safeguarding code maintainability and extensibility.

Bokeh's most important modules and their dependencies.

Note that, due to the sheer number of dependencies and modules, not all of them could be visualized in a readable manner, so we chose to include only the most important ones in the figure above.

Starting from the top, we have the django (purple) module (here we use django as a representation, since Bokeh allows for the usage of other web frameworks such as Flask). This module is highly dependent on a lot of different modules. This happens because it is possible to embed a Bokeh application into another web application, which, in its turn, highly depends on code that is partitioned in different modules.

The models module (pink) provides the building block classes of Bokeh, the Models [6]: graphs, plots, axes, ranges, scales, widgets, etc. These low-level objects comprise a Bokeh object graph.

Next, we have core (blue), which is not truly a module but rather a package that represents an aggregation of modules used to implement Bokeh itself [7]. For example, the validation module can be used to perform integrity checks on an entire collection of Bokeh Models [8], while the property_mixins module provides the classes used to group properties of the Models [9].

The util module (orange) provides a collection of utilities used to implement Bokeh’s functionalities [10], such as functions to call browsers, JS compilation, dependency checking, etc. It is interesting to note that almost all of the most important modules depend on util.

The modules sphinx, sphinxext and examples (green) are relevant for documentation purposes [11][12].

Bokeh (red) can be considered the top-level module in the project. It contains useful functions and attributes, such as the license and current version of the project [13]. Note that several modules depend on bokeh while bokeh in turn depends on them, which means that these modules are mutually dependent.

The modules colored in gray are responsible for the client-server application. For example, the document module is a container for Bokeh Models to be reflected on the client-side BokehJS library (yellow) [5], while the protocol module provides message protocols for communication between Bokeh Servers and clients [14].

On a general note, this graph allows us to conclude that there is tight coupling between Bokeh’s components. There are also several cases of component entanglement and circular dependencies (for example, util to bokeh to models to util).
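Circular dependencies like the one mentioned above can be found mechanically with a depth-first search over the dependency graph. The sketch below is a generic illustration: find_cycle is our own helper, and the example edges only encode the util to bokeh to models to util chain, read as "depends on", not the full graph from the figure.

```python
from typing import Dict, List, Optional

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of module names, or None."""
    color: Dict[str, int] = {module: WHITE for module in deps}
    path: List[str] = []

    def visit(module: str) -> Optional[List[str]]:
        color[module] = GREY
        path.append(module)
        for dep in deps.get(module, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: we closed a loop.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[module] = BLACK
        return None

    for module in list(deps):
        if color[module] == WHITE:
            cycle = visit(module)
            if cycle:
                return cycle
    return None


# Hypothetical edges: "util depends on bokeh", and so on.
example = {"util": ["bokeh"], "bokeh": ["models"], "models": ["util"], "core": ["util"]}
```

Running a check like this in continuous integration is one common way to stop new cycles from creeping into a code base.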

When building the development view, it is also important to define system-wide standards to ensure technical integrity [2]. Bokeh does that in all sorts of ways. For example, the core team developed guidelines to manage issues and pull requests on GitHub.

Running Bokeh

One may still ask, how does everything really connect and work? Is Bokeh really a well-oiled machine?

A Bokeh server can be run from a multitude of frameworks, be it Django, Flask or the default Tornado framework. To serve Bokeh as a server, all you have to do is run bokeh serve {yourdocument}.py, and a Tornado web server will start serving your document to the interwebs or just to your local network. Here, we have one for you.

This interaction between Bokeh and BokehJS is one of the things that make this open-source project so interesting.

First, the application code computes the Document, creating the object graph. Next, this graph is serialized and sent to BokehJS where it is deserialized and rendered. The application code is executed in the Bokeh server every time a new connection is made. The application code also sets up any callbacks that should be run whenever properties such as widget values are changed.
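The serialize, send and deserialize round trip described above can be sketched with the standard json module. This is a simplified illustration under our own assumptions: the Node class and the {"ref": id} encoding are invented for the example and do not reproduce Bokeh's actual wire format.

```python
import json
from typing import Any, Dict


class Node:
    """Toy model object: an id, a type name and a dict of attributes."""

    def __init__(self, node_id: str, kind: str, **attrs: Any) -> None:
        self.id = node_id
        self.kind = kind
        self.attrs = attrs


def serialize(root: Node) -> str:
    """Flatten the object graph into JSON, replacing objects by id references."""
    seen: Dict[str, Dict[str, Any]] = {}

    def walk(node: Node) -> None:
        if node.id in seen:
            return
        entry: Dict[str, Any] = {"id": node.id, "type": node.kind, "attributes": {}}
        seen[node.id] = entry
        for key, value in node.attrs.items():
            if isinstance(value, Node):
                entry["attributes"][key] = {"ref": value.id}
                walk(value)
            else:
                entry["attributes"][key] = value

    walk(root)
    return json.dumps({"root": root.id, "nodes": list(seen.values())})


def deserialize(payload: str) -> Node:
    """Rebuild the object graph on the receiving side and resolve references."""
    data = json.loads(payload)
    nodes = {n["id"]: Node(n["id"], n["type"]) for n in data["nodes"]}
    for n in data["nodes"]:
        for key, value in n["attributes"].items():
            if isinstance(value, dict) and "ref" in value:
                value = nodes[value["ref"]]
            nodes[n["id"]].attrs[key] = value
    return nodes[data["root"]]
```

Sending ids instead of nested copies is what lets both sides agree on which object a later property change refers to.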

This allows scalability, concurrency and distribution. In a perfect world, one server can serve it all!

Deploying Bokeh

As Bokeh is a Python library, its execution only depends on the end-user running a compatible version of Python (3.6+). There are no cloud-based runtime dependencies, but it does have system dependencies in order to run correctly locally [15]. For the latest version (2.0.0) we can identify the following Python libraries as dependencies for basic usage, as well as optional dependencies needed to use all available features, and testing modules:

Bokeh's dependency graph

One may tend to ignore the importance of testing here. However, in a recent interview we did with core team member Bryan Van de Ven, he actually identified testing as one of Bokeh’s main production challenges:

“For me personally one of the hardest areas is simply the enormous test surface for a project like Bokeh. If you look at the entire matrix of Python version x OS version x Browser version x Jupyter version x Tornado version x… It’s just vastly more than we have resources to run real tests for. Things continue to improve as we are able to, but it’s an ongoing challenge.”
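To get a feeling for how quickly such a matrix grows, here is a small back-of-the-envelope sketch. The axes mirror the ones named in the quote, but every value listed is a made-up placeholder, not Bokeh's real support matrix.

```python
from itertools import product
from math import prod
from typing import Dict, Iterator, List

# Placeholder values only: NOT the versions Bokeh actually tests against.
matrix: Dict[str, List[str]] = {
    "python": ["3.6", "3.7", "3.8"],
    "os": ["linux", "macos", "windows"],
    "browser": ["chrome", "firefox", "safari"],
    "jupyter": ["notebook", "lab"],
    "tornado": ["5.x", "6.x"],
}


def count_combinations(axes: Dict[str, List[str]]) -> int:
    """Size of the full test matrix: the product of the axis lengths."""
    return prod(len(values) for values in axes.values())


def combinations(axes: Dict[str, List[str]]) -> Iterator[Dict[str, str]]:
    """Yield every configuration as a dict such as {'python': '3.6', ...}."""
    keys = list(axes)
    for values in product(*axes.values()):
        yield dict(zip(keys, values))
```

Even with only two or three placeholder values per axis, the product already exceeds a hundred configurations, and each extra axis multiplies it again.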

As for continuous integration, until the latest release Bokeh made use of GitHub-CI to run a full test build on every pull request to master. For version 2.0.0, however, Subresource Integrity was added, and that requires developers to build and upload part of the deployment manually.

Not only about plotting

Bokeh is not just a data visualization library, but also a very good one. One way of assessing such a claim is to look into its non-functional properties. We have identified the following areas in which Bokeh excels:

Feature | Description
Open Source & Transparency | As an open source system, Bokeh gets users involved in the development process, which allows the team to know where to focus its efforts after listening to the end-users’ demands directly. It also allows the number of contributors to rise substantially. Open source brings along great transparency and communication, and Bokeh really embraces the open source spirit by allowing anyone to enter its Zulip channel, where there is constant interaction between contributors and the core team members.
Performance | Bokeh also performs very well in comparison to other standard Python data visualization libraries. In the first section of this notebook, we made a simple, yet very conclusive test of how much better the performance is. We plotted 3 simple functions using both Matplotlib and Bokeh, and timed the processing and rendering of each library. The results state that Bokeh is ~3x faster to process and show data, making it pretty safe to say that, at least in simple cases, it has better performance.
Interactivity | In Part 2 of our notebook, you will realize that adding interactive tools to a plot is very simple. In this example we added basic mouse navigation events such as panning and zooming, and a save button to download your plot.
Readability | Plotting with Bokeh is very easy: variables and functions have very intuitive names, as do methods and keyword arguments of functions. A good example of this is that in the previously mentioned notebook, to create a figure, you must call the figure function (Duh!), and if you want to add a label to, let's say, the X axis you only have to pass the keyword argument x_axis_label="<label>" (Duh!^2). And guess what keyword argument you must give to the figure in order to add tools to it! Yup, you guessed it right: tools. That is it. Bokeh cannot get more straightforward than this!
Documentation | The development team has gone the extra mile to document every aspect of Bokeh, even using Sphinx to generate HTML documentation for easy export. Interestingly, with the recent release of Bokeh 2.0, it has become increasingly difficult to communicate all the changes brought to the library. Bryan Van de Ven addressed this issue in the interview, commenting on the fact that community support is an important complement to documentation: “We have tried to document those changes […] but there is no way to make sure users see or know about it, and I expect there will be an increased load of support questions for a long time.” It is clear, then, that there is an interesting trade-off between the size and maturity of a project and the need for documentation, as Bryan also underlined: “once a project reaches a certain level of size/maturity, all the important and hard problems are not technical ones, they are people/community ones.”
Security | In its latest release Bokeh added Subresource Integrity, which is “a security feature that enables browsers to verify that resources they fetch (for example, from a CDN) are delivered without unexpected manipulation” [16], making all of its resources more secure. However, there is a trade-off involved. When we interviewed Bryan Van de Ven, he brought up the fact that this integration made releasing Bokeh a little harder, or less automatic, as developers now need to build and test locally before releasing.
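For the curious, the value a browser compares against under Subresource Integrity is simply a base64-encoded cryptographic digest of the file, prefixed with the hash algorithm's name. Below is a minimal sketch using only the standard library; sri_hash is a helper name of our own, not part of Bokeh's release tooling.

```python
import base64
import hashlib

# Hash functions that the SRI specification allows in an integrity value.
ALLOWED = {"sha256", "sha384", "sha512"}


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity value such as 'sha384-<base64 digest>'."""
    if algorithm not in ALLOWED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"
```

Because the hash must match the exact bytes that are published, it can only be computed once the final JavaScript bundle is built, which fits the less automatic release process mentioned in the Security row.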

That’s it! Hopefully you have enjoyed this insight into how the architecture of Bokeh is plotted and how its various components are drawn together.

  1. P. B. Kruchten, “The 4+1 View Model of architecture,” in IEEE Software, vol. 12, no. 6, pp. 42-50, Nov. 1995.

  2. Rozanski, Nick, and Eóin Woods. Software systems architecture: working with stakeholders using viewpoints and perspectives. Addison-Wesley, 2012.

  3. https://github.com/bokeh/bokeh/blob/master/bokehjs/README.md 

  4. https://en.wikipedia.org/wiki/Delegation_pattern 

  5. https://docs.bokeh.org/en/latest/docs/reference/document.html

  6. https://docs.bokeh.org/en/latest/docs/dev_guide/models.html 

  7. https://docs.bokeh.org/en/latest/docs/reference/core.html 

  8. https://docs.bokeh.org/en/latest/docs/reference/core/validation.html#bokeh-core-validation 

  9. https://docs.bokeh.org/en/2.0.0/docs/reference/core/property_mixins.html#module-bokeh.core.property_mixins 

  10. https://docs.bokeh.org/en/latest/docs/reference/util.html 

  11. https://docs.bokeh.org/en/latest/docs/dev_guide/documentation.html 

  12. https://docs.bokeh.org/en/latest/docs/reference/sphinxext.html 

  13. https://docs.bokeh.org/en/latest/docs/reference/0_bokeh.html 

  14. https://docs.bokeh.org/en/latest/docs/reference/protocol.html#module-bokeh.protocol 

  15. https://docs.bokeh.org/en/latest/docs/installation.html 

  16. https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity 

Bokeh
Authors
Alfonso Irarrázaval
Andrea Monguzzi
Guilherme Fonseca
Miguel Cardoso