In 1905, economist Max O. Lorenz published a paper in the Publications of the American Statistical Association. In the paper, Lorenz outlined a “curve” that provided a novel way to represent income distribution and assess economic inequality. Seven years later, in 1912, Italian statistician Corrado Gini expanded upon Lorenz's work by developing the Gini Index, a numerical measure derived from the Lorenz Curve, which quantifies inequality on a scale from 0 (perfect equality) to 1 (perfect inequality).
What does this have to do with ASPM?
Recently, while working on a data analysis project at Legit as part of the development of our vulnerability prevention capabilities, we used this model to understand the distribution of security issues across different application assets (repositories, files, etc.). Specifically, we used it to identify assets that function as “hotspots”: those that have significantly more security issues than others across different cross-sections of the data.
This method helped us determine whether certain “systems” (repositories with branch protection misconfigurations, files that contain vulnerabilities, or even issues that are deployed to the cloud) are balanced or skewed, and whether a few assets are responsible for a disproportionate share of the risk, much as economists use the same tools to show income concentration.
Understanding the Lorenz Curve and the Gini Index
Before we delve deeper into their applications in ASPM, let’s take a moment to understand what the Lorenz Curve and the Gini Index are. The Lorenz Curve is a graphical representation that shows the cumulative distribution of a resource (be it income, energy, or AppSec vulnerabilities) across a population or system. With the population sorted from the smallest share to the largest, the curve plots the cumulative share of the population on the horizontal axis against the cumulative share of the resource on the vertical axis, so it starts at the origin (0,0) and ends at (1,1). A perfectly straight diagonal line between those two points represents complete equality (everyone or every part has an equal share). The more the Lorenz Curve sags below this line, the greater the inequality within the distribution.
The Gini Index is a numerical summary of the Lorenz Curve. It measures the area between the line of perfect equality and the actual Lorenz Curve, divided by the total area under the line of perfect equality. This index ranges from 0 (perfect equality) to 1 (maximum inequality). In practical terms, the Gini Index gives you one quick number that encapsulates the distribution’s balance or imbalance.
Source: https://economicsfromthetopdown.com/2019/06/26/problems-with-measuring-inequality/
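To make the two definitions concrete, here is a minimal Python sketch. The function names `lorenz_curve` and `gini_index` and the sample counts are illustrative assumptions, not part of any Legit product or library. The sketch builds the Lorenz Curve from a list of per-asset issue counts and then computes the Gini Index as the area between the equality line and the curve, divided by the area under the equality line (0.5).

```python
def lorenz_curve(counts):
    """Return Lorenz Curve points as (cumulative asset share, cumulative issue share)."""
    values = sorted(counts)  # smallest share first, as the curve requires
    total = sum(values)
    if not values or total == 0:
        raise ValueError("need at least one asset and a non-zero issue total")
    n = len(values)
    points = [(0.0, 0.0)]  # the curve always starts at the origin
    running = 0
    for i, value in enumerate(values, start=1):
        running += value
        points.append((i / n, running / total))
    return points


def gini_index(counts):
    """Area between the equality line and the Lorenz Curve, divided by 0.5."""
    points = lorenz_curve(counts)
    # Trapezoidal rule for the area under the (piecewise linear) Lorenz Curve.
    area_under_curve = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area_under_curve += (x1 - x0) * (y0 + y1) / 2
    area_under_equality_line = 0.5
    return (area_under_equality_line - area_under_curve) / area_under_equality_line


# Hypothetical issue counts for four assets.
print(gini_index([5, 5, 5, 5]))   # 0.0  (every asset holds the same share)
print(gini_index([0, 0, 0, 20]))  # 0.75 (one asset holds everything)
```

Note that with a finite number of assets the maximum value is 1 - 1/n rather than exactly 1, which is why four assets top out at 0.75. The index only approaches 1 as the number of assets grows.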
Understanding these tools in economics is one thing, but seeing them adapted to analyze other systems, including SDLC vulnerabilities, is what makes their story truly fascinating.
Improving ASPM With Economic Models
Imagine, for example, the entities in your SCM: repositories, users, YAML files, and line upon line of code, but also security findings, such as vulnerabilities uncovered by SCA and SAST, misconfigurations in branches, etc. Each combination of entity and issue type is a system: repositories with vulnerable dependencies, YAML files that contain CI/CD vulnerabilities, and any other combination you can think of.
These systems can become “toxic” when different types of security issues converge within one entity, for instance, branch protection issues combined with exposed secrets. Further combinations could involve things like containers and vulnerabilities that are deployed to the cloud.
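One way to picture these systems as data is the following minimal Python sketch. The record layout, the entity and issue names, and the helper functions `build_systems` and `toxic_entities` are all hypothetical and chosen only for illustration. The sketch groups raw findings into systems keyed by (entity type, issue type) and flags entities where more than one issue type converges.

```python
from collections import Counter, defaultdict

# Hypothetical findings: (entity type, entity name, issue type).
findings = [
    ("repository", "payments-api", "branch_protection"),
    ("repository", "payments-api", "exposed_secret"),
    ("repository", "web-frontend", "vulnerable_dependency"),
    ("repository", "web-frontend", "vulnerable_dependency"),
    ("yaml_file", "deploy.yml", "cicd_misconfiguration"),
]


def build_systems(findings):
    """Map each (entity type, issue type) system to a Counter of issues per entity."""
    systems = defaultdict(Counter)
    for entity_type, entity, issue_type in findings:
        systems[(entity_type, issue_type)][entity] += 1
    return systems


def toxic_entities(findings, min_issue_types=2):
    """Return entities on which at least `min_issue_types` different issue types converge."""
    issue_types = defaultdict(set)
    for entity_type, entity, issue_type in findings:
        issue_types[(entity_type, entity)].add(issue_type)
    return {
        key: types
        for key, types in issue_types.items()
        if len(types) >= min_issue_types
    }


systems = build_systems(findings)
print(dict(systems[("repository", "vulnerable_dependency")]))
print(toxic_entities(findings))
```

The per-entity counts inside each system are exactly the kind of distribution that the Lorenz Curve and the Gini Index summarize.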
Managing these systems is managing SDLC security. But we first need to understand how, at the macro level, each system is built and how to fix its issues.
There are, of course, the standard macro details, such as the number of entities (users for example), the number of issues that belong to them (vulnerabilities that each user committed), the average number of issues per user, etc. When we review similar user systems with different kinds of issues, it’s helpful to understand if there are users that function as “hotspots” — and this is where the Gini Index comes into the picture.
Identifying these hotspots is hugely beneficial in application security. For instance, it can help you understand whether a few simple actions can close many issues, or whether a significant number of the issues belong to a specific team with bad practices. Using the Gini Index together with a few other macro-level details helps to compare different systems and to identify those where a small number of users are responsible for a disproportionately large share of the issues.
The following graph simulates the use of the Gini Index to map branch protection misconfigurations in repositories belonging to three different dev teams.
As we can see, some repositories function as “hotspots” in all three teams, but in team 3 these hotspots are responsible for a significantly larger percentage of the branch protection misconfigurations than in the other teams. This data can support a variety of conclusions related to security, remediation processes, and more.
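A comparison along these lines can be sketched in a few lines of Python. The per-repository counts below are made up for illustration and are not the data behind the graph, and `gini_index` and `top_share` are hypothetical helper names. The Gini Index is computed here with the rank-weighted closed form, which is equivalent to measuring the area under the Lorenz Curve, and it is reported alongside the share of issues held by the top 20% of repositories.

```python
def gini_index(counts):
    """Gini Index via the rank-weighted closed form over sorted counts."""
    values = sorted(counts)
    n, total = len(values), sum(values)
    if n == 0 or total == 0:
        return 0.0  # assumption: an empty or issue-free system counts as balanced
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def top_share(counts, fraction=0.2):
    """Share of all issues held by the top `fraction` of assets (at least one asset)."""
    values = sorted(counts, reverse=True)
    total = sum(values)
    if total == 0:
        return 0.0
    k = max(1, round(len(values) * fraction))
    return sum(values[:k]) / total


# Hypothetical branch protection misconfigurations per repository, per team.
teams = {
    "team 1": [3, 4, 3, 6, 4, 3, 5, 4, 3, 5],
    "team 2": [1, 2, 9, 3, 2, 8, 1, 2, 6, 6],
    "team 3": [0, 1, 0, 25, 1, 0, 12, 0, 1, 0],
}

for name, counts in teams.items():
    print(
        f"{name}: repos={len(counts)} issues={sum(counts)} "
        f"gini={gini_index(counts):.2f} top 20% share={top_share(counts):.0%}"
    )
```

All three hypothetical teams have the same number of repositories and the same total number of issues, so the standard macro details alone cannot tell them apart. The Gini Index and the top-20% share are what expose the concentration in team 3.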
Stay One Step Ahead With Legit
This data-driven research project is part of our continual effort to improve our ASPM capabilities at Legit, specifically our prevention capabilities. We are using research like this to combine proactive prevention insights, automated controls, and robust guardrails that enable teams not just to find and fix vulnerabilities, but to prevent them from entering the codebase in the first place.
Learn more about our prevention capabilities.