From Vision to Architecture

In the previous essay, we focused on the vision of the Open edX system. In this essay, we will focus on the realization of that vision, visible in the architecture of the Open edX system.

We do this by taking a high-level view of the system's architecture. Which architectural views are relevant? Which main architectural patterns are visible in the system? From there, concrete overviews of the following viewpoints are given:

  • Functional
  • Information
  • Development
  • Operational
  • Deployment

Lastly, non-functional properties relevant to the architecture are reviewed.

Architectural Style

Rozanski & Woods identify six core architectural viewpoints [1]: Functional, Information, Concurrency, Development, Deployment and Operational.

Relevant viewpoints

Of these, the functional, information, development, operational and deployment viewpoints are relevant to Open edX.

The functional view is relevant because it drives the definition of the other architectural views [1, p. 215]. It defines the architectural elements that deliver the system’s functionality. As such, its composition ultimately drives the value gained by the end-user.

The information view depicts how information is stored, manipulated, managed and distributed within the system. Because Open edX delivers educational content to learners all over the world, this view is highly relevant.

Next, the development view needs to be considered because the Open edX system is open source and meant to be extended and modified by parties other than edX itself. This view describes the architecture that supports this development process.

Furthermore, the operational view is considered because the Open edX system needs to be updated and administered while maintaining consistency and availability for its end users.

Lastly, the deployment view is relevant to Open edX as it is deployed in many different forms and fashions. As discussed in our previous essay, many suppliers exist for Open edX. Each of them provides the Open edX system to schools and universities in different packages and formats. As such, the deployment view is important to consider.

Patterns

An overview of the Open edX architecture is given in the image below. It shows that the edx-platform codebase captures most of the important functionality of the system. This block is seen as one service, and is supported by the independently deployed applications (IDAs) to the right of it. As a result, edx-platform can be seen as “a single huge object, with lots of small objects attached to it”, or a “God element” [1], which Rozanski & Woods list as one of the common pitfalls encountered when designing the functional view.

Luckily, the edX team has already acknowledged this and states that “Over time, edX plans to break out more of the existing edx-platform functions into new IDAs” [2]. This follows the principles of cohesion within modules and decoupling between them. Doing so speeds up the development process, as related responsibilities and functionality are grouped together, minimizing unnecessary coordination between modules [3].

Open edX Architecture [2]

In all, the goal of edX seems to be to create a modular architecture that reduces complexity and makes it easier for developers to understand and contribute to its IDAs.

Development View

Open edX Modules

Module Organization

The core package for Open edX can be found in the openedx/ folder. The current intention of the developers is that all importable code from Open edX will eventually reside here. This includes the code from the lms, cms and common modules, which currently lives in separate directories. The lms module houses the functionality of the learning management system, while the cms module houses that of the content management system.

The platform’s backend is written in Python and relies heavily on Django apps. These reside in the core/djangoapps directory. Utilities that require Django are placed in the core/djangolib directory, and code that does not define Django models or views of its own is found in the core/lib directory. Finally, the features module, which mainly interacts with the common module, contains code for functionality of the edX platform that belongs to neither the learning management system nor the content management system. A note on the project’s GitHub page states that code not structured like this is treated as legacy code, which shows that the developers are concerned with maintaining high quality standards for the system architecture.
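The directory conventions described above can be summarized in a small sketch. It is purely illustrative and not part of edx-platform: the function name classify_path, the category labels and the example file names are assumptions made for this example, and the core/ and features directories are assumed to sit directly under openedx/.

```python
from pathlib import PurePosixPath

# Directory prefixes follow the layout described in the text; the labels are
# hypothetical and only serve this illustration.
LAYOUT = {
    ("openedx", "core", "djangoapps"): "Django app",
    ("openedx", "core", "djangolib"): "utility that requires Django",
    ("openedx", "core", "lib"): "library code without Django models or views",
    ("openedx", "features"): "feature outside the LMS and CMS",
}


def classify_path(path):
    """Return the category of a repository-relative path, or 'legacy'."""
    parts = PurePosixPath(path).parts
    for prefix, label in LAYOUT.items():
        # Compare the leading path components with the known prefix.
        if parts[: len(prefix)] == prefix:
            return label
    # Code that does not follow the structure is treated as legacy code.
    return "legacy"


if __name__ == "__main__":
    for example in (
        "openedx/core/djangoapps/example_app/models.py",  # hypothetical file
        "lms/djangoapps/example_app/views.py",  # hypothetical file
    ):
        print(example, "->", classify_path(example))
```

A lookup table keyed on path prefixes keeps the convention in one place, so adding a new directory category would only need one more entry.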

From the top-level decomposition that we performed on the project’s codebase, we could identify the interactions between the previously mentioned core components. This analysis shows that the core and common modules are the two most important ones, since they handle almost all requests coming from the other modules, namely lms and cms.

Development Process

Installing and running an Open edX instance is not simple. It is therefore recommended to use the Open edX developer stack, a Docker-based development environment. The steps for contributing to the Open edX project are clearly described in the contributing document as well as in the Open edX documentation pages. After setting up the dev stack, one can get in contact with other developers by joining the project's Jira server or creating an account on the discuss.openedx.org website. There, developers can start discussions about the issues they plan to fix or the features they want to add. The Jira page also hosts an issue tracker with problems reported by other developers and users. Finally, to contribute to the project, a developer creates a pull request on GitHub with a description detailing the implemented change(s). Once approved, the change is added to the code base. The code ends up on the edX production servers with the next release, which usually happens every week.

Standardization of Testing

The two main programming languages used in the Open edX platform are Python and JavaScript. A variety of tools is used to check the codebase for errors and vulnerabilities and to enforce a coding standard and coding style. Developers can check the overall quality of their code by running the paver run_quality command in the root folder of the project. In addition, a different set of tools is used for each programming language.

  • Python: the pep8 tool is used to check conformance with the PEP 8 guidelines, and pylint is used for static analysis and to discover trouble spots in source code.
  • JavaScript: to standardize and enforce Open edX’s JavaScript coding style across multiple codebases, edX has published an ESLint configuration that provides an enforceable specification. The edX JavaScript style generally follows the Airbnb JavaScript Style Guide, with a few custom rules.
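The per-language split can be sketched as a simple dispatch on file type. This is a minimal illustration rather than how paver run_quality is implemented: the function names and the file-suffix mapping are assumptions, and the sketch only plans which of the tools named above would apply to which file, without invoking them.

```python
from pathlib import PurePosixPath

# Tool names are the ones mentioned in the text; how each tool is invoked is
# deliberately left out, since that depends on the repository's setup.
TOOLS_BY_SUFFIX = {
    ".py": ("pep8", "pylint"),
    ".js": ("eslint",),
}


def quality_tools_for(path):
    """Return the tools that apply to a file, based on its suffix."""
    return TOOLS_BY_SUFFIX.get(PurePosixPath(path).suffix, ())


def plan_checks(paths):
    """Group files by the tool that should check them."""
    plan = {}
    for path in paths:
        for tool in quality_tools_for(path):
            plan.setdefault(tool, []).append(path)
    return plan


if __name__ == "__main__":
    # Hypothetical file names, used only to demonstrate the grouping.
    print(plan_checks(["lms/example.py", "common/js/example.js", "README.rst"]))
```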

Codeline Organization

Open edX Code

The Open edX framework does not have a single standard for describing the overall code structure. The framework is mostly written in Python and JavaScript.

Python

Python is used for the platform’s backend together with the Django framework. Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design. The framework also takes care of much of the hassle of web development, so the focus can be placed on building the application rather than reinventing the wheel. Most of the Python code resides in the openedx/ folder, which contains the core logic of the Open edX platform. Two other important modules are lms (learning management system) and cms (content management system), which have separate folders at the root level of the project’s repository.

JavaScript

The JavaScript code for this project mostly resides in the common/js folder, with multiple sub-directories for the various components that handle user interaction within the platform.

Deployment View

This section provides an overview of the requirements and dependencies needed to run the Open edX framework successfully. The requirements for installing the Open edX software (from release Ficus onwards) are detailed on this Confluence page. The server requirements are the following:

  • Ubuntu 16.04 amd64 (oraclejdk required). It may seem like other versions of Ubuntu will be fine, but they are not. Only 16.04 is known to work.
  • Minimum 8GB of memory
  • At least one 2.00GHz CPU or EC2 compute unit
  • Minimum 25GB of free disk, 50GB recommended for production servers

A note is also provided for hosting the platform on an Amazon AWS instance. For this, the recommendation is to use a t2.large instance with at least a 50GB EBS volume for storage.
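The server requirements listed above can be expressed as a small pre-installation check. This is a sketch under stated assumptions, not an official Open edX tool: the ServerSpec class and the unmet_requirements function are hypothetical names, and the 50GB production recommendation is treated as a hard minimum when production=True.

```python
from dataclasses import dataclass

# Thresholds copied from the requirements listed in the text.
SUPPORTED_UBUNTU = "16.04"
MIN_MEMORY_GB = 8
MIN_CPU_GHZ = 2.0
MIN_FREE_DISK_GB = 25
PRODUCTION_FREE_DISK_GB = 50  # a recommendation, treated as a minimum here


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical description of a candidate server."""

    ubuntu_version: str
    memory_gb: float
    cpu_ghz: float
    free_disk_gb: float


def unmet_requirements(spec, production=False):
    """Return a description of every requirement the server does not meet."""
    problems = []
    if spec.ubuntu_version != SUPPORTED_UBUNTU:
        problems.append(
            f"Ubuntu {SUPPORTED_UBUNTU} required, found {spec.ubuntu_version}"
        )
    if spec.memory_gb < MIN_MEMORY_GB:
        problems.append(f"at least {MIN_MEMORY_GB}GB of memory required")
    if spec.cpu_ghz < MIN_CPU_GHZ:
        problems.append(f"at least one {MIN_CPU_GHZ}GHz CPU required")
    disk_needed = PRODUCTION_FREE_DISK_GB if production else MIN_FREE_DISK_GB
    if spec.free_disk_gb < disk_needed:
        problems.append(f"at least {disk_needed}GB of free disk required")
    return problems


if __name__ == "__main__":
    candidate = ServerSpec("16.04", memory_gb=8, cpu_ghz=2.0, free_disk_gb=30)
    print(unmet_requirements(candidate))
    print(unmet_requirements(candidate, production=True))
```

Returning a list of problems rather than a single boolean lets an operator see every shortfall at once before attempting an installation.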

  1. Rozanski, N., & Woods, E. (2012). Software systems architecture: working with stakeholders using viewpoints and perspectives. Addison-Wesley.

  2. https://edx.readthedocs.io/projects/edx-developer-guide/en/latest/architecture.html

  3. Coplien, J. O., & Bjørnvig, G. (2011). Lean architecture: for agile software development. John Wiley & Sons.
