Customers that have embraced DevOps often ask me for the best metrics to measure their program. I always advocate focusing on policy compliance as the number one metric for understanding your risk, as this provides a succinct measurement of the security of your applications.
However, if you are looking to measure and motivate development teams, policy compliance doesn’t give you the granularity to introduce gamification or incentives. Policy compliance is very black and white; you either are compliant (good!) or you are not (bad!). So, when talking to customers about motivating teams in the spirit of continuous improvement, I like to bring up Mean Time to Resolve (MTTR).
I’ve also seen this as “Mean Time to Repair,” “Mean Time to Recovery,” or “Mean Time to Respond.” I personally like “resolve” as it indicates that the security finding has been closed, which is aligned with how we compute this metric.
You often see MTTR in association with DevOps and the tenet of making work visible and measurable, and thus improvable. This is why I bring it up with our users: knowing how long it takes you to resolve a security finding helps organizations make program improvements that move the needle on the overall metric of policy compliance.
How MTTR is calculated at Veracode
The standard definition for MTTR is along the lines of the following: corrective maintenance time / total number of corrective maintenance actions.
When it came time to implement MTTR in our new analytics feature, we initially interpreted this as (finding closed date minus finding first found date), summed across all findings and divided by the total number of findings. Sounds good at face value, but when it comes to Veracode’s security findings, implementing this exact calculation gets a bit tricky.
Since customers are primarily using Static Analysis as part of their development pipelines, we focus on static findings to ensure the calculation makes sense, though this is applicable to Dynamic Analysis and open-source findings as well.
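As a minimal sketch of that first interpretation, the snippet below computes the time from first found to closed for each finding and divides the total by the number of findings. The dates and the `naive_mttr` helper are invented for illustration and are not how Veracode Analytics is implemented.

```python
from datetime import date

# Hypothetical findings as (first_found, closed) date pairs, invented for illustration.
findings = [
    (date(2024, 1, 1), date(2024, 1, 11)),  # 10 days
    (date(2024, 1, 5), date(2024, 1, 9)),   # 4 days
    (date(2024, 2, 1), date(2024, 2, 8)),   # 7 days
]


def naive_mttr(findings):
    """Total days from first found to closed, divided by the number of findings."""
    total_days = sum((closed - found).days for found, closed in findings)
    return total_days / len(findings)


print(naive_mttr(findings))  # 7.0
```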
For Static Analysis findings, each finding can be open and closed many times depending on the code that is scanned. If a developer scans a piece of code multiple times but with subtle changes, we can assert that findings are closed intentionally and fixed. If a developer scans code and forgets a module or only scans a small part of the code that they are working on, we don’t see the findings that were previously found and thus those are marked as closed as well since they are not present.
If the developer then scans the whole application again, those findings are reopened. This happens regularly through the development cycle. Measuring the activity of open and closed doesn’t seem relevant in this context. However, once you are focused on a release candidate, this sort of measurement takes on new importance.
Additionally, if you are looking at an application that is in production, this measurement is even more important, as this is where we are effectively seeing the time that an organization takes to respond to and resolve security findings.
Delivering a meaningful MTTR
How do we provide MTTR that is meaningful? In Veracode Analytics, we focus on the most recent time a finding was found, and the most recent time that the finding was closed. We always look at the policy context for calculating MTTR. While this can be calculated on a per-sandbox context basis, attempting to calculate MTTR across all sandboxes leads to very bizarre data due to flaw matching.
If a flaw is open in Sandbox 1 but closed in Sandbox 3 because it wasn’t present and mitigated in Sandbox 17, what is the current state of that flaw? Does the most recent scan, regardless of sandbox or policy, represent the “current” state, or does it just represent a scan that was performed? This is why limiting to the policy context is important, since there is a level of control for the scans performed at the policy level.
“Resolved” means both fixed (also known as “remediated” or no longer present in the scan) as well as mitigated, where someone has documented a compensating control for the finding and that control has been approved. This means that if a finding has an associated approved mitigation, the most recent time it was found could also be the exact same time it was resolved since the mitigation will immediately close the finding.
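The rule described above (most recent found date to most recent closed date, within the policy context) can be sketched as follows. The event layout, the status strings, and the `days_to_resolve` helper are assumptions made for this example, not Veracode's data model.

```python
from datetime import date

# Hypothetical policy-context history for one finding: (date, status) events.
history = [
    (date(2024, 3, 1), "found"),
    (date(2024, 3, 4), "closed"),   # partial scan: finding not seen, so marked closed
    (date(2024, 3, 6), "found"),    # full scan: finding reopened
    (date(2024, 3, 20), "closed"),  # fixed, or mitigation approved
]


def days_to_resolve(history):
    """Days between the most recent 'found' and the most recent 'closed' event.

    Returns None when the finding is still open (its latest event is not
    'closed') or when it has no 'found' event at all.
    """
    events = sorted(history, key=lambda event: event[0])  # stable sort by date
    if not events or events[-1][1] != "closed":
        return None
    found_dates = [d for d, status in events if status == "found"]
    if not found_dates:
        return None
    return (events[-1][0] - max(found_dates)).days


print(days_to_resolve(history))  # 14
```

In this sketch the earlier open/close cycle caused by the partial scan is ignored, and a finding that is found and closed by an approved mitigation on the same day resolves in 0 days.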
The final nuance to MTTR is to compare the speed of addressing policy-impacting findings vs general security debt. Veracode's policy is regularly used, and should be used, as a sieve to ensure clear communication with development teams on what is important and what needs to be fixed, as opposed to what is simply additional information. If the policy is used correctly, you should see that policy-impacting findings are resolved at a faster rate than all other findings. If this isn’t the case, then the policy isn’t being used by the dev team to prioritize work.
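One way to check this comparison is to split closed findings by whether they affect policy and compute the mean for each group. The field names `days_to_resolve` and `affects_policy` and the sample values are hypothetical.

```python
from statistics import mean

# Hypothetical closed findings; field names and values are illustrative only.
closed_findings = [
    {"days_to_resolve": 5, "affects_policy": True},
    {"days_to_resolve": 9, "affects_policy": True},
    {"days_to_resolve": 30, "affects_policy": False},
    {"days_to_resolve": 50, "affects_policy": False},
]


def mttr_by_policy_impact(findings):
    """Return MTTR keyed by policy impact (True/False); None for an empty group."""
    groups = {True: [], False: []}
    for finding in findings:
        groups[finding["affects_policy"]].append(finding["days_to_resolve"])
    return {key: (mean(days) if days else None) for key, days in groups.items()}


result = mttr_by_policy_impact(closed_findings)
print(result[True], result[False])  # 7 40
```

If the policy-impacting mean is not clearly lower than the other one, that suggests the policy is not driving prioritization.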
The ‘average’ approach to MTTR
MTTR is by nature a calculated measure; as the “mean” in its name indicates, we are performing an average.
“Days to Resolve” is a dimension on a finding. This data is only populated if the finding is in a closed state. A finding is a flaw that Veracode has matched across many scans. Incidentally, this is why we separate out “Scan Explore” from “Findings Explore” in the Analytics feature: scans are a point in time, while findings persist over time.
When we look at MTTR, we are inherently looking at a group of findings and their “Days to Resolve” dimensions, then taking an average of the total time to resolve divided by the number of findings.
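A small sketch of that average, assuming each finding is a record whose `days_to_resolve` value is empty (`None`) until the finding is closed. The record shape is an assumption for illustration.

```python
# Hypothetical finding records; "days_to_resolve" is only populated once closed.
findings = [
    {"id": 1, "status": "closed", "days_to_resolve": 12},
    {"id": 2, "status": "open", "days_to_resolve": None},
    {"id": 3, "status": "closed", "days_to_resolve": 4},
    {"id": 4, "status": "closed", "days_to_resolve": 8},
]


def mean_time_to_resolve(findings):
    """Average 'days_to_resolve' over findings that have a value (closed ones)."""
    days = [f["days_to_resolve"] for f in findings if f["days_to_resolve"] is not None]
    return sum(days) / len(days) if days else None


print(mean_time_to_resolve(findings))  # 8.0
```

Open findings contribute nothing to either the sum or the count, so the result reflects only work that has actually been resolved.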
Measuring Time to Resolve for your organization
A customer recently asked me why he saw a different MTTR for his entire organization than when he averaged the MTTRs of his three business units.
For each application, you have N findings in a closed state, each with a Time to Resolve. When we look at the measure Mean Time to Resolve, we are actually providing the average Time to Resolve for the dimension selected. So, when you look at a single application and see “Days to Resolve,” you are actually seeing the average across N.
Therefore:
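Average time to resolve = (sum of Time to Resolve for each finding) / (number of findings):

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

The customer was asking why the average of the three business unit (BU) MTTRs does not equal the MTTR of the whole organization:

$$\frac{A + B + C}{3} \neq Z, \qquad A = \frac{S_1}{n_1},\quad B = \frac{S_2}{n_2},\quad C = \frac{S_3}{n_3},\quad Z = \frac{S_1 + S_2 + S_3}{n_A}$$

where $S_1$, $S_2$, and $S_3$ are the sums of Time to Resolve ($\sum x_i$) within each BU, $n_1$, $n_2$, and $n_3$ are the number of findings in each BU, and $n_A = n_1 + n_2 + n_3$ is the number of findings across all three BUs together.
When you look at this mathematically and think about the order of operations, you will realize that the correct calculation always completes the sum before dividing by the number of findings. Averaging A, B, and C does the division first, within each BU, and the addition afterward, so the result can differ drastically from Z. The two agree only when every BU has the same number of findings.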
In short, you are providing equal weighting where there should not be equal weighting.
Here is an example:
BU A contains 2 closed flaws that each took 1 day to close. The MTTR for BU A = (1 + 1) / 2 = 1 day.
BU B contains 200 closed flaws that each took 20 days to close. The MTTR for BU B = (200 × 20) / 200 = 20 days.
If we then add those and divide by two, we get (1 + 20) / 2 = 10.5 days, which is not the MTTR for all flaws across the two BUs. That number is meaningless because of the outsized weight it gives to the two flaws of BU A. If we want the MTTR for all flaws across the two BUs, then we must add the Time to Resolve of all flaws together and then divide by the total number of flaws: (2 × 1 + 200 × 20) / 202 ≈ 19.8 days.
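The same comparison can be sketched in a few lines, using 2 flaws at 1 day each for BU A and 200 flaws at 20 days each for BU B. The variable names are illustrative only.

```python
# Days to resolve per closed flaw in each business unit (illustrative data).
bu_a = [1] * 2      # 2 flaws, 1 day each
bu_b = [20] * 200   # 200 flaws, 20 days each


def mean(values):
    """Arithmetic mean: complete the sum first, then divide by the count."""
    return sum(values) / len(values)


# Wrong for an organization-wide MTTR: gives each BU equal weight.
average_of_averages = (mean(bu_a) + mean(bu_b)) / 2

# Right: pool every flaw, then divide once by the total number of flaws.
pooled_mttr = mean(bu_a + bu_b)

print(average_of_averages)    # 10.5
print(round(pooled_mttr, 2))  # 19.81
```

Weighting each BU's MTTR by its number of flaws gives the same answer as pooling, which is why a per-BU rollup needs the counts as well as the averages.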
Using MTTR the right way
Across our customer base, we see a wide variety in MTTR. A lot of this is tied to the type of application and its criticality to the organization. If you have an internal-facing legacy system, an average time to resolve for that application of 30 days may be great. If you have an external application that handles your PII, five days may be too long for your average time to resolve.
Metrics and KPIs provide information, but it is up to the AppSec leadership to use the information and make data-driven decisions—both in running the day-to-day operations of the AppSec program and in managing the understanding of risk for the organization as a whole.