Code is the basis of all blockchain technologies. With poor code, it may even be dangerous to use DLT systems. That is why we, the Institute for Crypto Asset Analysis, analyzed the Bitcoin codebase with a new, automated method. We used specialized algorithms to determine the quality of the code and to detect errors. The Bitcoin codebase showed results similar to those of the previously analyzed Litecoin and a higher quality than Ethereum. The Bitcoin code is not without flaws, however, and improvements will have to be made in the future. Authors: Shourya Shirsha Nandi, Philipp Sandner, Christian Flasshoff
The Institute for Crypto Asset Analytics (ICAA) analyzes the underlying codebase (e.g. C++, Java) of cryptocurrencies and other DLT solutions. We determine a score for the quality of the code that is easy to understand and comparable across projects. We also point out opportunities to improve the codebase as well as potential risks. In the case of Bitcoin, we analyzed over 175,000 lines of code and identified issues or possible improvements in only 2.5% of the code. In most cases the issues were minor and not critical, but some sections of the code require improvement by developers. ICAA aims to provide transparency and insights from reviewing the code of cryptocurrencies, and to support investors, analysts and other blockchain enthusiasts in assessing crypto projects. Before turning to the review of the Bitcoin code, let us look at the Bitcoin technology. Readers who are familiar with the concept of Bitcoin may skip the next section.
The Bitcoin technology
Bitcoin was the first proof of concept for blockchain technology. It was developed as a peer-to-peer online payment system that solves the problem of double spending. In Bitcoin, a coin is a chain of digital signatures. A transaction between a payer and a payee occurs when the payer digitally signs the hash of the previous transaction together with the public key of the payee. This information is then added to the end of the coin. Hashing the public key of the payee gives the address of the payee. Bitcoin incorporates a time-stamp server to prevent double spending. When a batch of transactions is bundled into a block, the time-stamp server takes the hash of the block, adds a timestamp to it and publishes the hash widely. The timestamp proves that the data existed at that time.
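The time-stamping idea described above can be illustrated with a minimal Python sketch. It is a simplification and not Bitcoin's actual block format: the function name timestamp_block, the JSON encoding and the sample transactions are assumptions made for illustration only. The sketch shows that each hash commits to the transactions, the timestamp and the previous hash, so changing earlier data changes every later hash.

```python
import hashlib
import json


def timestamp_block(transactions, previous_hash, timestamp):
    """Hash a batch of transactions together with a timestamp and the previous hash.

    Hypothetical helper: JSON is used only to get a deterministic byte string;
    Bitcoin itself uses its own binary serialization.
    """
    payload = json.dumps(
        {"prev": previous_hash, "time": timestamp, "txs": transactions},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Two linked blocks (sample data, not real transactions).
genesis = timestamp_block(["alice->bob:1"], "0" * 64, 1530000000)
second = timestamp_block(["bob->carol:1"], genesis, 1530000600)

# Tampering with the first block changes its hash, and therefore the hash
# of every block that was built on top of it.
tampered_genesis = timestamp_block(["alice->bob:2"], "0" * 64, 1530000000)
tampered_second = timestamp_block(["bob->carol:1"], tampered_genesis, 1530000600)
```

Because each hash includes the previous one, a published hash indirectly vouches for all data that came before it.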
Bitcoin transactions are validated by other members of the network (miners) through consensus. Consensus in Bitcoin is achieved with a proof-of-work system. Each block contains a nonce, a random or pseudo-random number that is used only once. Miners vary the nonce until the SHA-256 hash of the block begins with a required number of zero bits. Bitcoin miners expend processing power on this search and are compensated with a certain number of bitcoins. The difficulty of the puzzle is adjusted so that, irrespective of the number of miners or the computing power applied, a block is generated roughly every ten minutes on average.
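The nonce search described above can be sketched in a few lines of Python. This is a toy version under stated assumptions: the function name mine, the string-based block data and the small difficulty of 12 zero bits are chosen for illustration, whereas real Bitcoin mining applies SHA-256 twice to a binary block header and uses a far higher difficulty.

```python
import hashlib


def mine(block_data, zero_bits, max_nonce=2_000_000):
    """Try nonces until the SHA-256 hash starts with `zero_bits` zero bits.

    Hypothetical helper for illustration; returns (nonce, hex digest),
    or None if no nonce below max_nonce satisfies the condition.
    """
    # A 256-bit hash has `zero_bits` leading zero bits exactly when,
    # read as an integer, it is smaller than 2 ** (256 - zero_bits).
    target = 1 << (256 - zero_bits)
    for nonce in range(max_nonce):
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None


# 12 zero bits correspond to three leading zeros in hexadecimal notation.
result = mine("example block", zero_bits=12)
```

Each additional required zero bit doubles the expected number of attempts, which is how the difficulty of the puzzle can be tuned to the available computing power.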
Review of the Bitcoin codebase
Bitcoin’s peer-to-peer payment functionality is defined in the Bitcoin codebase. The code is the backbone of the Bitcoin software, which runs on every node of the network. Therefore, the quality and security of this codebase are crucial for the long-term success of the Bitcoin project. We used a new approach to analyze the quality of the code: with the help of a software tool, we automatically analyzed the C++ codebase of Bitcoin. Since the Bitcoin code is published under the terms of the MIT license, the codebase is easily accessible on GitHub.
The specialized algorithms of the system automatically assign each cryptocurrency a score on a scale from -5 (worst score) to +5 (best score). This classification makes it easy to interpret the results and to compare them with other blockchain solutions. Technical remark: Our analysis is based on the Bitcoin code version from 2018-06-24 and draws comparisons to the Litecoin code version from 2018-02-21 and the Ethereum code version from 2018-03-12.
Analysis: Bitcoin scores a respectable 2.53 on a [-5, +5] scale
The main aim of this article is to assign a score to Bitcoin. The largest cryptocurrency scored 2.53, a respectable score with potential for further improvement. In the remainder of this article we analyze this number in more detail. The score can be subdivided into four categories: design, metrics, duplications and code issues. For each category, an individual score is calculated, which helps to identify the origin of the issues. Figure 2 visualizes all the subcategories for the Bitcoin version from 2018-06-24. In the following, each category is explained in detail and the results are interpreted. This part is a little more technical but should still be accessible to non-coders.
Design issues. In this category the codebase is analyzed in terms of the design of the code. Good code design is characterized by an efficient structure that is easy to follow. This enables new developers to understand existing code and apply changes without much effort. Even though well-designed and less well-designed code might offer the same functionality, it is desirable to develop code that every programmer can understand easily. As discussed in the previous reviews of Litecoin and Ethereum, the quality of code design can be analyzed automatically with the help of algorithms, which includes the detection of anti-patterns. Anti-patterns are sections of code that appear to work but are not optimally constructed. They usually arise over time when new functionality is added or when different developers contribute to the code. Anti-patterns may result in errors and make maintenance of the code very difficult. There are several types of anti-patterns, and explaining all of them is beyond the scope of this article; in this analysis, we consider the Global Breakable. The Global Breakable pattern is a structural anti-pattern for a component of a system that is often affected when other components are changed. Global Breakable components are undesirable as they indicate fragility and a lack of modularity in the system. The Bitcoin code has 297 anti-pattern issues, 81 of which are ranked ‘high’ in terms of criticality. Most of the anti-pattern issues are in the ‘leveldb’ components (42), followed by the ‘wallet’ components (36) and the ‘qt’ components (31). Figure 3 shows an example of an anti-pattern in the ‘leveldb’ component. The evaluation of the design issues gives Bitcoin a score of 2.69, which is better than Ethereum’s score of 2.29 but worse than Litecoin’s score of 3.08.
Metric violations. The next category tracks the quality of the code with software metrics. One such metric is the “number of methods” (NOM), which counts the total number of methods (functions) in one class. A higher number of methods generally makes the code more complex and increases the risk of errors. Other metrics include “lack of cohesion in methods” (LCOM), which measures the cohesiveness of a class, and “access to foreign data” (ATFD), which measures how frequently a class accesses attributes of other classes. To determine the quality of the code, the system we used reports when an undesirable threshold of a metric is exceeded and calculates a score for metric violations. The Bitcoin code shows a total of 1,797 metric violations, which translates into a score of 1.28. Just as with the design issues, Bitcoin’s score of 1.28 lies between those of Ethereum (0.47) and Litecoin (1.93).
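To make the idea of a metric violation more concrete, here is a minimal Python sketch of the “number of methods” metric with a threshold check. It is not the tool used in our analysis: it inspects Python source with the standard ast module rather than C++, and the class names, the helper functions and the threshold of 2 are hypothetical values chosen for illustration.

```python
import ast

# Sample source to analyze (hypothetical classes, not taken from Bitcoin).
SOURCE = '''
class Wallet:
    def load(self): pass
    def save(self): pass
    def sign(self): pass

class Peer:
    def connect(self): pass
'''


def number_of_methods(source):
    """Return a mapping of class name to the number of methods it defines."""
    counts = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            counts[node.name] = sum(
                isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))
                for child in node.body
            )
    return counts


def violations(counts, threshold):
    """List the classes whose method count exceeds the threshold."""
    return sorted(name for name, nom in counts.items() if nom > threshold)


nom = number_of_methods(SOURCE)
flagged = violations(nom, threshold=2)  # threshold is an assumed value
```

In this sample, only the class with three methods exceeds the assumed threshold and is reported as a violation; a real analysis tool would apply calibrated thresholds across many metrics.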
Duplications of code. As the name implies, the duplications category searches for duplicated code. Duplicated code is usually undesirable, since it may increase the number of lines of code, lower performance or increase software vulnerability. The Bitcoin codebase shows a desirable score of 3.91, and only 2.01% of the code is duplicated. Nevertheless, both Ethereum (4.13) and Litecoin (4.29) achieved better duplication scores than Bitcoin.
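One simple way to estimate a duplication percentage like the one above is to slide a fixed-size window over the code and flag windows that occur more than once. The following Python sketch illustrates this approach under stated assumptions: the function name, the window size of three lines and the sample code are hypothetical, and the tool used in our analysis may detect duplicates differently.

```python
from collections import defaultdict


def duplicated_line_ratio(lines, window=3):
    """Return the share of lines that belong to a repeated block of `window` lines."""
    normalized = [line.strip() for line in lines]
    seen = defaultdict(list)  # block of lines -> start positions where it occurs
    for start in range(len(normalized) - window + 1):
        chunk = tuple(normalized[start:start + window])
        if all(chunk):  # ignore windows that contain blank lines
            seen[chunk].append(start)
    duplicated = set()
    for starts in seen.values():
        if len(starts) > 1:  # the same block appears at least twice
            for start in starts:
                duplicated.update(range(start, start + window))
    return len(duplicated) / len(normalized) if normalized else 0.0


# Sample input: the first three lines are repeated at the end.
code = [
    "a = read()",
    "b = parse(a)",
    "store(b)",
    "log('done')",
    "a = read()",
    "b = parse(a)",
    "store(b)",
]
ratio = duplicated_line_ratio(code)
```

In the sample, six of the seven lines belong to a repeated block, so the ratio is 6/7. Larger windows make the check stricter and reduce false positives from short, common statements.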
Code issues. The last category focuses on code issues. In contrast to design issues, code issues apply only to a local part of the code. Depending on the characteristics of the issue, the impact on the performance of the software may vary. Therefore, it is important to classify the implications of the detected code issues. The algorithm ranks each code issue as low, medium, high or critical. Within the Bitcoin codebase, there are 426 code issues. 86% of them are classified as low or medium, and many are in the “secp256k1” component. An example of a low code issue is an unused label within the code. Such an unused label does not interfere with the correct functioning of the software but could be removed to make the code more compact. The remaining 14% of code issues fall into the category of high or critical issues. The critical issues are in the ‘wallet’, ‘net_processing.cpp’ and ‘arith_uint256.h’ components. The medium issues are in the ‘leveldb’ component. Figure 3 shows examples of critical code issues in the “arith_uint256.h” component. High and critical code issues do not necessarily lead to a malfunction of the software but increase the risk of undesired behavior. Software with fewer code issues tends to run more stably, which should be the goal of every programmer. The occurrence of code issues is common in computer programming and is part of the development process. Nevertheless, the removal of code issues is necessary to improve the software. The Bitcoin code receives a score of 4.27 in the category of code issues. This is much higher than the scores of Ethereum and Litecoin, which were both negative at -0.05 and -0.07 respectively.
Summary: Bitcoin code needs to be improved
We, the Institute for Crypto Asset Analysis, automatically analyzed the Bitcoin code, and the algorithms revealed a decent code structure with potential for future improvement. The system classified the results into hotspots according to the urgency of the issue. As Figure 5 shows, almost 2% of the total Bitcoin code is flagged as “critical” and needs further attention by developers. The problem can be narrowed down to the “wallet” component, where over 1,472 lines of code are affected. Other components also contain hotspots with the classification “high”. Even though the code seems to function correctly, these parts should also be reviewed by developers. Since Bitcoin is open source, many different programmers can contribute to the project. This makes it difficult to develop code with a consistent design and to avoid anti-patterns. The analysis reveals that the code quality of Bitcoin is far superior to that of Ethereum but falls just short of the quality of Litecoin. There is also good news: over 58,000 lines of code have no issues and demonstrate high quality. The distribution of errors is displayed in the hotspot analysis of Figure 5. The figure shows that the majority of the critical issues are present in the ‘wallet’ component, while ‘leveldb’ has the largest number of issues overall.
In the future it will be interesting to monitor new developments in the Bitcoin code and to see whether its quality improves. In addition, a comparison of these results with other blockchain solutions is necessary. The Institute for Crypto Asset Analysis will follow the development of Bitcoin and extend the analysis to other blockchain solutions in the future.
The results shown in this paper are based upon an automatic analysis of the code. Please note that this analysis neither represents financial advice nor should it be understood or interpreted as a solicitation to buy or sell any securities, coins or tokens.
If you like this article, we would be happy if you forwarded it to your colleagues or shared it on social networks. If you are an expert in the field and want to criticize or endorse the article or parts of it, feel free to leave a private note here or in context and we will respond.
Prof. Dr. Philipp Sandner founded the Frankfurt School Blockchain Center (FSBC). In 2018 and 2019, he was ranked as one of the “top 30” economists by the Frankfurter Allgemeine Zeitung (FAZ), a major newspaper in Germany. He was also named among the “Top 40 under 40”, a ranking by the German business magazine Capital. Since 2017, he has been a member of the FinTech Council of the Federal Ministry of Finance in Germany. The expertise of Prof. Sandner includes blockchain technology in general, crypto assets such as Bitcoin and Ethereum, the digital programmable Euro, tokenization of assets and rights, and digital identity. You can contact him via mail (firstname.lastname@example.org), via LinkedIn or follow him on Twitter (@philippsandner).
Christian Flasshoff is a research fellow at the Frankfurt School Blockchain Center and an alumnus of the Frankfurt School of Finance & Management. You can connect with him on LinkedIn (www.linkedin.com/in/christian-flasshoff) or contact him via mail (email@example.com).
Shourya Shirsha Nandi is a research fellow at the Frankfurt School Blockchain Center and an alumnus of the Frankfurt School of Finance & Management. You can contact him on LinkedIn (https://www.linkedin.com/in/shourya-shirsha-nandi-16aab383/).