Contextual understanding of code using AI is the next significant opportunity to raise the bar on code quality and engineering productivity.
How often do we find ourselves searching around in code? It might be to fix a bug, add some functionality, or connect to another part of the app. When we do find the code we think we are looking for, however, how well do we understand it?
Finding code is usually not the issue (though in large codebases it can be challenging). The harder part is figuring out what the code does and why certain decisions were made. Documentation almost never exists, so developers have to sift through code comments, git history, and bug reports to get context.
Code comments often have the opposite effect from providing context, as this author states:
The developers are biased for being the authors, they don’t have the capability of judging their own work from an outsider perspective and that will lead to comments that other developers will find it hard to read.
As developers, we often neglect to write code that explains itself and is self-documenting. In other words, if you write clean, legible code, that should be enough for other developers to understand what it does.
What code does not easily explain is the “why”. It is like unwritten rules, bad processes, and ancient files: we see what it is, but we have lost the context. That lost context manifests itself in the following ways:
· History – What is the history and authorship of the code?
· Value – What does the code do to support the business need?
· Impact – What dependencies exist in the code?
· Quality – What ensures the code functions well?
Code search utilities do help somewhat. A decent tool can read code and code artifacts such as commit messages, log files, tests, etc. Some work across repositories and different languages. Understanding code from a history, value, impact, and quality perspective, however, requires something deeper.
Current software development trends are leading to growing code complexity. The original Unix distribution was a mere few thousand lines of code. Browsers are several million lines of code, and modern operating systems are in the tens of millions. The software in cars is now over a hundred million lines of code. And Google manages two billion lines of code!
Even a modestly complex product will have high code complexity. Whether a scaling startup with a hundred developers or an enterprise with tens of thousands of developers, all have to contend with growing volumes of code and growing dependencies within it. A single API change, for example, can ripple across multiple products, requiring careful review and changes to the code and the corresponding tests.
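To make the ripple effect concrete, here is a minimal sketch in Python that walks a dependency graph and lists everything affected by a change. The module names and the `DEPENDS_ON` map are hypothetical, invented purely for illustration; a real codebase would have to derive such a graph from its build files or import statements.

```python
from collections import deque

# Hypothetical dependency map: each key depends on the modules in its list.
DEPENDS_ON = {
    "billing_api": [],
    "checkout": ["billing_api"],
    "invoicing": ["billing_api"],
    "storefront": ["checkout"],
    "reports": ["invoicing"],
    "search": [],
}


def affected_by(changed, depends_on):
    """Return every module that directly or transitively depends on `changed`."""
    # Invert the map so we can walk from a module to the modules that use it.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, []):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return sorted(seen)


print(affected_by("billing_api", DEPENDS_ON))
# ['checkout', 'invoicing', 'reports', 'storefront']
```

In this toy graph, a change to `billing_api` reaches four of the five other modules, two of them only indirectly, which is the kind of impact that is easy to miss when reading a single file.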
Adding to code complexity is the rate of change in code. With the advent of automation and DevOps tooling, developers are committing code into production on an ongoing basis. A case in point is Google, where developers commit code over 20,000 times per day.
This means engineering teams now have to consider developer productivity and operational risk. It is not hard to see how a critical security protocol or patch could be missed or how a growing backlog of bugs and Jira tickets can slow down release cycles when developers cannot easily find answers to help them understand the codebase.
In speaking with engineering teams, the most common questions they ask among themselves are “where is X”, “who knows Y”, and “when Z changes, what happens”. Some answers can be searched for, but often deeper context is needed such as who is the expert or what dependencies exist.
AI can be used to tackle code complexity by providing more context. It does this by augmenting developers' understanding of the “why” behind the code and of the impact of code changes. The use of AI can therefore help to increase productivity and reduce security risks.
The way AI does this is by understanding business and technical semantics. AI analysis of the codebase produces semantic tags. These tags are then analyzed to understand associations between them, giving further context around the code. As these models are applied to complex codebases, relevancy rates improve because the model learns new concepts and patterns.
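As a rough illustration of the tagging-and-association idea, the Python sketch below assigns semantic tags to code snippets and counts which tags appear together. Everything in it is an assumption: the keyword table `TAG_KEYWORDS` is a hand-written stand-in for a learned model, and the file names and snippets are made up. It is not a description of how any particular product works.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword-to-tag table; a hand-written stand-in for an AI model.
TAG_KEYWORDS = {
    "payment": ["charge", "invoice", "refund"],
    "auth": ["login", "token", "password"],
    "storage": ["insert", "query", "commit"],
}

# Made-up snippets keyed by made-up file names.
SNIPPETS = {
    "billing.py": "def refund(order): db.query(order); db.commit()",
    "session.py": "def login(user, password): return issue_token(user)",
    "checkout.py": "def pay(card): check_token(card); charge(card); db.insert(card)",
}


def tag_snippet(text):
    """Return the sorted tags whose keywords occur in the text."""
    lowered = text.lower()
    return sorted(
        tag
        for tag, words in TAG_KEYWORDS.items()
        if any(word in lowered for word in words)
    )


def tag_associations(snippets):
    """Count how often each pair of tags appears in the same snippet."""
    pairs = Counter()
    for text in snippets.values():
        pairs.update(combinations(tag_snippet(text), 2))
    return pairs


tags = {name: tag_snippet(text) for name, text in SNIPPETS.items()}
associations = tag_associations(SNIPPETS)

print(tags["checkout.py"])    # ['auth', 'payment', 'storage']
print(associations.most_common(1))    # [(('payment', 'storage'), 2)]
```

Even in this toy version, the association counts surface something no single file states: payment code tends to touch storage. A real system would replace the keyword lookup with a trained model, but the second step, analyzing tags for associations, would look broadly similar.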
Contextual understanding of code is the next significant opportunity to raise the bar on code quality and engineering productivity. Human-based knowledge and community-generated content are still helpful for developers. Increasingly, though, the power and speed of AI are available to help developers handle the growing complexity of code and to manage and maintain large codebases.