Code complexity - a comprehensive guide

Naveen Kumar · 2021-11-26

Code complexity is a measure of how hard a piece of code is to understand, test, and change. Many factors influence it. Here are two pieces of code to help demonstrate code complexity:

Example (a):

def bar():
    x = 1
    if x == 2:
        print("Success")

Example (b):

def foo():
    evens = [2, 4, 6, 8, 10]
    odds = [1, 3, 5, 7, 9]
    for x in evens:
        for y in odds:
            product = x * y
            if product % 2 == 0:
                print("Product result is even")
            if product % 5 == 0:
                print("Product is divisible by 5")
            if product % 3 == 0:
                print("Product is divisible by 3")

Example (a) has much simpler logic than (b).

Quantifying the complexity of example (b) means taking into account the number of iterations and the number of decisions made in each iteration. Even though (b) has only three decision statements, they sit inside two nested loops, so each one is evaluated once for every combination of x and y (5 × 5 = 25 times here). The work needed to follow the code therefore grows multiplicatively with the size of the loops.
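As a rough illustration of that counting, the sketch below tallies how often the three `if` checks in example (b) are evaluated. The function name `count_decision_evaluations` and its parameters are made up for this illustration; it is not a standard complexity metric.

```python
def count_decision_evaluations(evens, odds, decisions_per_iteration=3):
    """Count how many times the `if` checks in example (b) are evaluated.

    Hypothetical helper: it mirrors the two nested loops of example (b)
    and adds one tally per decision statement in the loop body.
    """
    evaluations = 0
    for _x in evens:
        for _y in odds:
            evaluations += decisions_per_iteration
    return evaluations


evens = [2, 4, 6, 8, 10]
odds = [1, 3, 5, 7, 9]
print(count_decision_evaluations(evens, odds))  # 5 * 5 * 3 = 75
```

Doubling the length of either list doubles the count, which is the sense in which nesting multiplies the effort of reasoning about the code.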

Code complexity can be measured in a few different ways, so any single score should be read alongside broader observations and personal experience. It is also important to understand the business-critical reasons for measuring code complexity and the reasons behind higher values of the metric.

Why does code complexity matter?

Reading and code maintenance

High code complexity could imply that the code is harder to read and its functionality is harder to understand, even with proper documentation in place. This has significant repercussions for engineers and managers. Readable code helps avoid duplicating similar methods, which encourages reuse. It is less error-prone because it is easy to follow, and bugs in it are faster to fix.

Additionally, complex code that’s harder to read can directly increase review times, and with them the overall cycle time. These issues make observing and optimizing code complexity a matter of business concern.

Image - Traditional code review rants

Finally, complex code is harder for engineers to maintain in the long run. It demands more of the engineers’ time, so productive hours that could have gone to other tasks are spent on maintenance.

Testing

All systems need to be tested thoroughly to help identify and fix issues before the customers get their hands on the product. Complex code tends to prolong testing by passing its complexity on to the entire testing process, as it can:

  1. Increase the number of test cases - in white box methods like “basis path testing” or “structured testing”, the number of test cases equals the number of linearly independent paths through the method.
  2. Significantly heighten the chances of missing test cases, because the code is hard to understand and the number of test cases might be high.
  3. Prolong the time the test cases take to run, due to the nature of the complex module.
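To make the first point concrete, here is a minimal sketch of basis path testing on a small function. The function `classify` and its test cases are invented for this illustration. It has two decision statements, so it has three linearly independent paths and needs three test cases to cover them.

```python
def classify(n):
    """Hypothetical function with two decisions, hence three basis paths."""
    if n < 0:            # decision 1
        return "negative"
    if n % 2 == 0:       # decision 2
        return "even"
    return "odd"


# One test case per linearly independent path through classify().
basis_path_cases = [
    (-1, "negative"),  # decision 1 is true
    (4, "even"),       # decision 1 is false, decision 2 is true
    (7, "odd"),        # both decisions are false
]

for value, expected in basis_path_cases:
    assert classify(value) == expected
```

Each extra decision added to the function adds at least one more path, and therefore at least one more test case to write and maintain.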

Testing is considered the final seal of approval in many teams, making code complexity and its repercussions like missed test cases costly to businesses.

Time

Another effect of complex code is that it can increase the compilation times of software, which means more compute resources consumed and productive hours wasted for engineers.

Comic - Waste of productive hours

How does the complexity of code increase?

A few key factors tend to affect the complexity of a code base. Lack of documentation, architectural decisions, over-optimization, poor resource allocation, and evolving project requirements generally increase complexity, so it is best to first understand how each of them does so.

Lack of documentation

Documentation in code carries tremendous importance. It is what lets engineers work together to build a product: it helps them avoid stepping on each other's toes and repeating or doing additional work, and it records the thought process behind a method or the next n lines of code, which gives others insight while debugging, adding new features, or tackling a similar problem.

When this all-important documentation is lacking or, in extreme cases, missing, engineers naturally end up doing some or all of the tasks documentation was supposed to prevent, adding to the complexity.

Architectural decisions

Architectural decisions dictate the way the software is written, how it will be improved, how it can be tested, and so on. These decisions tend to be made at the beginning of the project or at key moments when large changes are required. This means that they have enormous potential to influence how complex the system and the modules in it will turn out to be.

Over-optimization

Over-engineering is common. It is the tendency of engineers to over-optimize code even when it is not warranted. Such over-optimizations lead to hard-to-read code and low reusability, both of which add to the complexity of code. This is an excellent reason to add clear and detailed guidelines to every project and check them with every pull request.

Comic - Over optimizing is not always good

Poor resource allocation

A common reason for rising code complexity is improper allocation of tasks. When engineers who lack a particular specialization are assigned tasks that require extensive experience in it, the complexity of code is bound to increase. Not using language-specific constructs, paradigms, or patterns, due to inexperience with the language or a particular framework or technology, is a commonly overlooked cause that adds considerably to code complexity.
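As a small, hypothetical illustration of the last point, both functions below return the squares of the even numbers in a list. The first manages an index by hand, the way someone new to Python might write it. The second uses a list comprehension, a construct the language provides for exactly this pattern. The function names are invented for this sketch.

```python
def even_squares_manual(numbers):
    """Index-based loop: more moving parts to read and get wrong."""
    result = []
    i = 0
    while i < len(numbers):
        if numbers[i] % 2 == 0:
            result.append(numbers[i] * numbers[i])
        i = i + 1
    return result


def even_squares_idiomatic(numbers):
    """Same behavior expressed with a list comprehension."""
    return [n * n for n in numbers if n % 2 == 0]


sample = [1, 2, 3, 4]
assert even_squares_manual(sample) == even_squares_idiomatic(sample) == [4, 16]
```

The two versions make the same decisions, but the second leaves a reviewer with less state to track, such as the counter and the loop bound.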

Evolving project

One of the hallmarks of an evolving project, other than code churn, is an increase in its complexity. When a code base is worked on constantly, to fix bugs, add features, or extend it, each edit has the potential to increase the complexity of not just the system but also the existing code. These increases can quickly compound, leading to higher complexity scores.

How is code complexity measured?

Cyclomatic complexity or McCabe complexity

The most popular way to measure code complexity is cyclomatic complexity, developed by Thomas J. McCabe, Sr. in 1976. In fact, it is available, built in or through plugins, in many code editors like VSCode, linters like jslinter and flake8, and IDEs like IntelliJ. Cyclomatic complexity, or McCabe complexity after its creator, is a measure of the number of linearly independent paths through a section/module. McCabe suggests keeping the cyclomatic complexity at 10 or less for most cases, meaning a score above 10 is enough cause to refactor the code.

Code complexity - Scale of 1-10-infinity

To calculate the cyclomatic complexity of a program, it's best to draw the control-flow graph and use the formula:

M = E − N + 2P

Where, 

M = Cyclomatic Complexity,

E = Number of edges,

N = Number of nodes, and

P = Number of connected components
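The calculation can be sketched in Python using McCabe's standard form M = E − N + 2P. The first helper applies the formula to a hand-drawn control-flow graph of example (a), with illustrative node names. The second approximates the same number for source code by counting `if`, `for`, and `while` nodes and adding one. That shortcut is a simplification assumed for this sketch: it ignores boolean operators, exception handlers, comprehensions, and other branching constructs that real tools account for.

```python
import ast


def cyclomatic_from_graph(edges, nodes, connected_components=1):
    """Apply M = E - N + 2P to an explicitly listed control-flow graph."""
    return len(edges) - len(nodes) + 2 * connected_components


# Control-flow graph of example (a); the node labels are illustrative.
nodes_a = {"x = 1", "if x == 2", "print", "exit"}
edges_a = {
    ("x = 1", "if x == 2"),
    ("if x == 2", "print"),  # condition is true
    ("if x == 2", "exit"),   # condition is false
    ("print", "exit"),
}


def cyclomatic_from_source(source):
    """Approximate M as (number of if/for/while statements) + 1."""
    tree = ast.parse(source)
    decisions = sum(
        isinstance(node, (ast.If, ast.For, ast.While))
        for node in ast.walk(tree)
    )
    return decisions + 1


EXAMPLE_A = '''
def bar():
    x = 1
    if x == 2:
        print("Success")
'''

EXAMPLE_B = '''
def foo():
    evens = [2, 4, 6, 8, 10]
    odds = [1, 3, 5, 7, 9]
    for x in evens:
        for y in odds:
            product = x * y
            if product % 2 == 0:
                print("Product result is even")
            if product % 5 == 0:
                print("Product is divisible by 5")
            if product % 3 == 0:
                print("Product is divisible by 3")
'''

print(cyclomatic_from_graph(edges_a, nodes_a))  # 4 - 4 + 2 * 1 = 2
print(cyclomatic_from_source(EXAMPLE_A))        # 1 decision + 1 = 2
print(cyclomatic_from_source(EXAMPLE_B))        # 5 decisions + 1 = 6
```

Both approaches agree on example (a), and example (b) scores three times higher, which matches the intuition from the start of the article.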

Halstead volume

Halstead volume is part of a set of software metrics introduced by Maurice Howard Halstead in 1977. Like the other Halstead complexity measures, it is based on the number of operators and operands, and it aims to describe the size of the implementation of a module or algorithm. It is represented by the following formula:

V = N * log2(n)

Where, 

V = Halstead volume,

N = Program length,

n = Program vocabulary.

Here,

Program length = N = N1 + N2 = total number of operators + total number of operands

Program vocabulary = n = n1 + n2 = number of distinct operators + number of distinct operands
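A minimal sketch of the calculation follows. It assumes the operators and operands have already been extracted from the code as two lists; deciding what counts as an operator or an operand is language-specific and is not covered here. The function name and the sample statement are chosen for illustration.

```python
import math


def halstead_volume(operators, operands):
    """Compute V = N * log2(n) from lists of operator and operand tokens."""
    program_length = len(operators) + len(operands)        # N = N1 + N2
    vocabulary = len(set(operators)) + len(set(operands))  # n = n1 + n2
    if vocabulary == 0:
        return 0.0  # empty program: avoid log2(0)
    return program_length * math.log2(vocabulary)


# Tokens for the statement `product = x * y` from example (b).
operators = ["=", "*"]
operands = ["product", "x", "y"]

# N = 2 + 3 = 5 and n = 2 + 3 = 5, so V = 5 * log2(5).
print(halstead_volume(operators, operands))
```

Repeating a token raises the program length N but not the vocabulary n, so repetitive code grows in volume more slowly than code that keeps introducing new names and operators.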

Maintainability Index

Maintainability Index or MI is a score of how easy it is to maintain code. It is usually described as a combination of four metrics: cyclomatic complexity, Halstead volume, Lines of Code (LoC), and depth of inheritance. Only the first three appear in the traditional formula, which weighs Halstead volume and cyclomatic complexity together with LoC to give an overall picture of complexity:

Maintainability Index = 171 - 5.2 * ln(Halstead Volume) - 0.23 * (Cyclomatic Complexity) - 16.2 * ln(Lines of Code)

Since the above formula results in a range of (-∞, 171], a slightly modified formula is used to bound MI to [0, 100]:

Maintainability Index = MAX(0,(171 - 5.2 * ln(Halstead Volume) - 0.23 * (Cyclomatic Complexity) - 16.2 * ln(Lines of Code))*100 / 171)
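Both formulas translate directly into Python. The sketch below assumes the three inputs have already been measured and that Halstead volume and Lines of Code are positive, since the formulas take their natural logarithm. The function names and sample values are hypothetical.

```python
import math


def maintainability_index(halstead_volume, cyclomatic_complexity, lines_of_code):
    """Traditional MI: 171 - 5.2*ln(V) - 0.23*CC - 16.2*ln(LoC)."""
    return (
        171
        - 5.2 * math.log(halstead_volume)
        - 0.23 * cyclomatic_complexity
        - 16.2 * math.log(lines_of_code)
    )


def bounded_maintainability_index(halstead_volume, cyclomatic_complexity, lines_of_code):
    """Rescaled MI, clamped at 0 so the result lies in [0, 100]."""
    raw = maintainability_index(halstead_volume, cyclomatic_complexity, lines_of_code)
    return max(0.0, raw * 100 / 171)


# Hypothetical measurements for a small module.
print(maintainability_index(100, 2, 10))          # roughly 109.3
print(bounded_maintainability_index(100, 2, 10))  # roughly 63.9

# A very large, branch-heavy module drops below zero and is clamped to 0.
print(bounded_maintainability_index(1_000_000, 50, 100_000))
```

Because volume and lines of code enter through logarithms, each doubling of either costs a fixed number of points, while cyclomatic complexity lowers the score linearly.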

On this scale a higher score is better: an MI of 20 or more is considered good, 10 to 19 is acceptable, and scores below 10 are considered high priority for rework.

What to do about code complexity?

Code complexity is an indicator of code quality and its ease of maintenance. Complexity directly impacts delivery times and the quality of products shipped by a team. Since products are usually maintained over a long time period, teams should strive to optimize complexity in order to allow long-term ease of use, readability, and maintenance of code.

Optimizing for complexity involves a careful study of the current patterns and setting of baselines and acceptable standards based on the observed patterns and industry standards. Metrics such as cycle time, throughput, review practice, focus time, communication patterns, etc., help engineering leaders measure and optimize code quality.

Hatica equips engineering leaders with granular visibility into dev team productivity alongside 60-plus other metrics enabling them to reduce bottlenecks to accelerated delivery and achieve a better developer experience. Learn how Hatica can help you build better engineering teams.
