MuseScore: Cost of music

The maintainability of a system can be measured in several ways. One of these is technical debt, a measure that indicates how much refactoring is necessary within a software project1.

First, we describe several processes that ensure software quality. We then highlight some aspects of a code base that contribute to technical debt. Finally, we analyse the technical debt of the MuseScore software to determine the cost of music.

Software quality processes

High software quality can be achieved in many different ways. We now describe some processes that help ensure higher quality and thus reduce technical debt.

As MuseScore is a software project with a code base of 280,000 lines of code and close to 200 contributors2, it needs good measures in place to maintain reasonable code quality. In this section, we evaluate the processes used to maintain the software quality of the MuseScore project. We identify four major code quality processes in MuseScore's development workflow, followed by a look at how releases are managed.

Coding rules

The use of coding rules to enforce code consistency across a software project results in higher code quality3. The MuseScore website presents an extensive list of coding rules; these include code style guidelines (indentation, notation and other coding practices) as well as submission rules. These rules and other code quality aspects are checked and enforced through the next quality measure, peer reviews. A small illustration of the kind of style convention involved is shown below.
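The sketch below is illustrative only: the class is made up for the example, and the authoritative rules are the ones on the MuseScore website. It merely shows the kind of naming and layout conventions (underscore-prefixed member fields, camelCase accessors with a "set" prefix) that such a guide pins down.

```cpp
// Illustrative only: a small class written in the style a coding-rules page
// might prescribe. Names and details are hypothetical, not MuseScore's rules.
class NoteEvent {
public:
    int pitch() const { return _pitch; }   // accessor named after the field
    void setPitch(int p) { _pitch = p; }   // setter prefixed with "set"

private:
    int _pitch = 60;                       // member fields prefixed with '_'
};
```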

Peer reviews

Peer reviews are performed on the code changes in submitted pull requests. This is not a formally regulated process, but in general several senior, experienced MuseScore developers check a submission and ask questions or propose changes. Although anyone can review a pull request, only senior developers can actually merge new code. Another advantage of peer reviewing is that other developers can help identify potential bugs in the code.

Continuous integration processes

Software quality can be ensured by using continuous integration (CI) processes. These are development practices where developers frequently integrate code into a shared repository. Each integration can then be verified by an automated build and automated tests4. This way, new code is tested early and bugs can be discovered as they are introduced. While automated testing is not strictly part of CI, it is typically implied.

MuseScore provides continuous integration with AppVeyor for Windows and Travis-CI for Mac and Linux5. These services build the MuseScore application on a virtual machine and upload the result to OSUOSL (the Oregon State University Open Source Lab). Additionally, Travis-CI runs the test suite as described in the following section.

Test processes

One of the most important tools for maintaining software quality is testing. Both the quantity and the quality of the tests matter. For large software projects like MuseScore, writing good tests can be a significant challenge. It is therefore imperative to have procedures and standards in place that guide contributors in writing and evaluating tests.

MuseScore has several guidelines for testing, including details on how to create new tests effectively. On every build, these tests are run automatically by Travis-CI. Furthermore, every code contribution is required to add new tests or update existing ones where necessary.
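As a rough illustration of what such a test looks like, here is a minimal sketch using Qt's QtTest framework, which MuseScore's own test suite builds on. The helper function, class and file names below are made up for the example and are not MuseScore's actual test API.

```cpp
#include <QtTest/QtTest>

// Hypothetical helper under test; MuseScore's real tests exercise score
// operations through its own classes instead.
static int transposeBySemitones(int pitch, int semitones)
{
    return pitch + semitones;
}

class TestTranspose : public QObject
{
    Q_OBJECT

private slots:
    void transposeUpOctave()
    {
        // Middle C (MIDI 60) up one octave should give MIDI 72.
        QCOMPARE(transposeBySemitones(60, 12), 72);
    }

    void transposeDownSemitone()
    {
        QCOMPARE(transposeBySemitones(60, -1), 59);
    }
};

QTEST_MAIN(TestTranspose)
#include "testtranspose.moc"   // assumes this file is testtranspose.cpp, built with Qt's moc
```

The value of such tests is that they run automatically on every CI build, so regressions surface before a pull request is merged.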

Releasing MuseScore

MuseScore uses three different release channels to ensure software quality in different stages of development. These are stable releases, beta releases, and nightly (development) builds. The stable releases contain versions of the software that have been extensively tested by the community. These versions are released to the general public about every nine months. The beta releases come in the months preceding the stable release. These versions may be unstable and are intended for testers and advanced users, to make the stable releases as bug-free as possible. The development builds are released multiple times per day. These builds happen automatically whenever a change is merged to the master branch, and serve as the first test for new functionality.

Code quality and maintainability

The maintainability of a program is the degree of effectiveness and efficiency with which a product or system can be modified by its intended maintainers6. In an open-source project, such a maintainer can be anyone. The more maintainable the code base is, the easier it is to change. Thus, high maintainability should be a top priority for MuseScore.

We had a complete analysis of the code base performed by the Software Improvement Group (SIG), which ran a number of calculations and analyses on the code base as a whole, based on the system decomposition described in MuseScore: views on development. A fact sheet generated by SIG can be seen below.

System fact sheet generated by SIG. Retrieved from MuseScore Sigrid analysis.

The total maintainability score of MuseScore is 2.1 out of 5, where 5 is the best. SIG determines this score by comparing all SIG-evaluated projects to each other, for example giving the best 5% of projects a 5-star rating6. This means that MuseScore is in roughly the worst 10% of systems analysed by SIG.

Maintainability of the roadmap

In MuseScore: Road to reducing paper use in music industry, we argued that there is a lack of detailed future plans, but that MuseScore has four main features to focus on: accessibility, notation, playback and usability.

The accessibility and notation features are both mainly UI-focused. Working on them would mean working in the mscore package, which contains the controller classes for the UI of MuseScore2. According to the SIG analysis, mscore has a maintainability rating of 2.3, which is higher than the overall rating but still rather low.

The other two features are playback and usability. These focus more on the technical side of MuseScore, although they will very likely require some UI changes as well. Most of the code, however, will be changed in the libmscore package, which contains the data model of MuseScore. According to SIG, libmscore has a maintainability score of 1.8, which is below MuseScore's overall score.

It should be noted that these two packages are the biggest packages in MuseScore, as shown in MuseScore: views on development. So, with these four features being the main focus points of MuseScore's roadmap for the near future, any change in these packages will require more effort than usual, and refactoring them will be costly.

Refactoring suggestions

A score of 2.1 stars out of 5 means that there is plenty of room for refactoring. SIG has already put some effort into offering a few concrete suggestions. An overview can be seen below.

Overview of the software metrics where most refactoring should be done to improve the maintainability score. Retrieved from MuseScore Sigrid analysis.

Complexity

Although there is some code duplication present, this is not too worrisome. The most important refactoring to be done is in unit complexity, which is based on the McCabe complexity6. This metric measures the number of paths that can be taken through a unit of code (usually a method)7. These paths are created by the use of loops or conditional statements.

The units of code flagged by SIG can be improved by splitting methods: a piece of complex functionality is extracted into its own method, which is then called from inside the original method. SIG has identified 100 candidates for complexity refactoring. Moreover, many of these candidates currently have a McCabe complexity of almost 100, while the maximum complexity recommended by SIG is 25 per unit6. Refactoring these methods is therefore highly recommended.
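To make this concrete, here is a hedged before/after sketch of such a method split; the function and data names are invented for the example and do not come from the MuseScore code base.

```cpp
#include <vector>

struct Note {
    int pitch;      // MIDI pitch, 0-127
    int duration;   // duration in ticks
};

// Before: one unit with several nested branches, each of which adds a path
// and therefore raises the McCabe complexity of this single method.
int countPlayableBefore(const std::vector<Note>& notes, bool skipShortNotes)
{
    int count = 0;
    for (const Note& n : notes) {
        if (n.duration <= 0)
            continue;
        if (skipShortNotes && n.duration < 10)
            continue;
        if (n.pitch >= 0 && n.pitch <= 127)
            ++count;
    }
    return count;
}

// After: the branching is extracted into a helper, so each unit has fewer
// decision points and can be understood and tested on its own.
static bool isPlayable(const Note& n, bool skipShortNotes)
{
    if (n.duration <= 0)
        return false;
    if (skipShortNotes && n.duration < 10)
        return false;
    return n.pitch >= 0 && n.pitch <= 127;
}

int countPlayableAfter(const std::vector<Note>& notes, bool skipShortNotes)
{
    int count = 0;
    for (const Note& n : notes) {
        if (isPlayable(n, skipShortNotes))
            ++count;
    }
    return count;
}
```

The total amount of logic stays the same, but no single unit concentrates all the paths, which is exactly what the unit complexity metric rewards.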

Decoupling

Another metric that indicates a great need for refactoring is module coupling. This is defined as “the number of incoming dependencies from the modules of the source code”6. Coupling between modules poses a risk to the maintainability of a system, because changing a highly-coupled module also requires changes in the modules that depend on it, resulting in more work.

In the figure below, the module coupling problems are identified and rated based on the risk they pose to the maintainability of the code base. Dependency-wise, more than 20% of the code is high risk, where high risk is defined as having more than 50 incoming dependencies6. As proposed by Michael Ridland (developer, consultant and architect), decoupling can be performed through abstraction and events8. This entails introducing interfaces so that modules depend on an abstraction rather than on each other's implementations, and using events to group similar activities together without direct calls between modules.

The MuseScore module coupling risk identification. Created with data from MuseScore Sigrid analysis.
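A minimal sketch of both techniques is given below; the interface and event names are hypothetical and only illustrate the pattern, not MuseScore's actual classes.

```cpp
#include <functional>
#include <iostream>
#include <vector>

// Abstraction: high-level code depends on this interface rather than on a
// concrete backend, so the backend can be replaced without touching callers.
class IPlaybackBackend {
public:
    virtual ~IPlaybackBackend() = default;
    virtual void playNote(int pitch) = 0;
};

class ConsoleBackend : public IPlaybackBackend {
public:
    void playNote(int pitch) override { std::cout << "play pitch " << pitch << "\n"; }
};

// Events: interested modules subscribe to a notification instead of being
// called directly, removing the dependency from the publisher to them.
class ScoreChangedEvent {
public:
    void subscribe(std::function<void()> listener) { listeners.push_back(std::move(listener)); }
    void notify() const {
        for (const auto& l : listeners)
            l();
    }
private:
    std::vector<std::function<void()>> listeners;
};

int main()
{
    ConsoleBackend backend;
    IPlaybackBackend& playback = backend;   // callers only see the abstraction
    playback.playNote(60);

    ScoreChangedEvent scoreChanged;
    scoreChanged.subscribe([] { std::cout << "repaint the score view\n"; });
    scoreChanged.notify();                  // the publisher does not know its listeners
    return 0;
}
```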

Decoupling is definitely an important area for MuseScore to focus refactoring efforts on, as has also been suggested by other contributors on MuseScore's developer forum. User ‘shoogle’ has proposed modularizing the code base to separate out dialects and imports/exports, which would improve maintainability and make it easier to add new contributions9.

Coding hotspots

Coding hotspots are locations in the code with a lot of recent coding activity. They are relevant to the maintainability of a project because they can point to parts of the code that have a lot of issues. For example, if a new bug is fixed in the same part of the code every day, this might suggest there is something inherently wrong with that part of the code.

To look at recent coding activity, we first have to define what counts as recent. MuseScore receives several pull requests per day, and pull requests are merged daily10. We have therefore chosen to focus on the last month, which is a reasonable window for a code base of around 280,000 lines of code11. Furthermore, we define a hotspot as having at least three commits in the last month.
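As an illustration of how such hotspots can be found, the sketch below counts, per file, how many recent commits touched it. The three-commit threshold comes from the definition above; the program itself and the suggested git invocation are only an illustration, not the tooling actually used for this analysis.

```cpp
// Reads the output of "git log --since='1 month ago' --name-only --pretty=format:"
// from standard input and prints files that appear in at least three commits.
#include <iostream>
#include <map>
#include <string>

int main()
{
    std::map<std::string, int> commitsPerFile;
    std::string line;
    while (std::getline(std::cin, line)) {
        if (!line.empty())                  // blank lines separate commits
            ++commitsPerFile[line];         // a file is listed once per commit that touches it
    }
    for (const auto& [file, count] : commitsPerFile) {
        if (count >= 3)                     // the hotspot threshold used above
            std::cout << count << "  " << file << "\n";
    }
    return 0;
}
```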

With libmscore and mscore being the two largest packages, together making up about 80% of the code base11, these are the most interesting to look at, as they contain all key architectural components. After studying the code base for hotspots, we decided that the libmscore hotspots were the most relevant for its maintainability.

As libmscore contains only individual files and no subpackages, we identify coding hotspots here in terms of files. One of the main hotspots is the edit.cpp file, which was edited in eight different commits in the last month. This is a file of roughly 2,000 lines of code, focused on editing scores. Most of the recent changes are bug fixes for warnings and crashes, although some changes also improve the user experience. For example, when switching instruments within a score, features specific to one instrument are no longer blindly applied to the other instrument. Our own pull request for a bug fix was also located in edit.cpp, where we identified two further bugs.

Another hotspot is measure.cpp, which was edited in five different commits in the last month. As a score is built up of measures, this file implements a key element of a score. The changes were all bug fixes removing crashes, unnecessary warnings and issues flagged by PVS-Studio, the static analysis tool that MuseScore uses for its code base and that integrates with Visual Studio.

Technical debt

As explained in the introduction, technical debt is a measure of how much refactoring is required for a software project. We have analysed the software quality processes and maintainability of the MuseScore code. From this, we have learned that MuseScore does have processes in place to ensure high software quality. At the same time, from the analysis performed by SIG, we can see that the code base is not maintainable in the long term, which can make contributing more difficult.

Looking at the maintainability of the MuseScore project, a lot of improvement is possible, as shown by the refactoring suggestions. As argued by Türk, however, software quality is determined by a number of factors: functionality, reliability, usability, efficiency, maintainability and portability12. Although maintainability is one part of software quality, the other factors are also at play. Therefore, we cannot draw any definite conclusions about MuseScore's overall software quality.

  1. Kruchten, P., Nord, R. L. & Ozkaya, I. Technical Debt: From Metaphor to Theory and Practice. IEEE Software, vol. 29, no. 6, pp. 18-21, Nov.-Dec. 2012. 

  2. GitHub. musescore/MuseScore repository. 

  3. Krishnan, M. S. & Kellner, M. I. Measuring process consistency: implications for reducing software defects. IEEE Transactions on Software Engineering, vol. 25, no. 6, pp. 800-815, Nov.-Dec. 1999. 

  4. CodeShip. Continuous integration essentials. no date. (link). 

  5. MuseScore. MuseScore Development Infrastructure. March 2018. 

  6. Software Improvement Group. Sigrid manual. December 24, 2019. (Document). 

  7. Harrison, W. & Magel, K. A complexity measure based on nesting level. ACM SIGPLAN Notices, Volume 16, Issue 3. March, 1981. 

  8. Michael Ridland. Software Architecture: Increasing cohesion and decreasing coupling. February 24, 2014. (link). 

  9. Shoogle. Modularizing MuseScore. January 20, 2019. (link). 

  10. MuseScore GitHub. Pull requests. (link). 

  11. Software Improvement Group. Sigrid Quality Assurance Platform. 

  12. Türk, T. The effect of software design patterns on object-oriented software quality and maintainability. Master of Science thesis, Electrical and Electronics Engineering, Middle East Technical University. September 2009. 
