The US National Institute of Standards and Technology (NIST) has almost completely stopped analyzing new vulnerabilities (CVEs) listed in its National Vulnerability Database (NVD). Through the first six weeks of 2024, NIST analyzed over 3,500 CVEs, with only 34 CVEs awaiting analysis.[1] Since February 13th, however, nearly half (48%) of the 7,200 CVEs received by the NVD this year are still awaiting analysis.[2] The number of CVEs analyzed has dropped nearly 80%, to fewer than 750. Other than a vague reference to establishing a consortium, the reasons behind this disruption remain a mystery.
Thankfully, Veracode customers need not worry about this disruption because they have access to Veracode’s proprietary database. Since the notice on February 13th, Veracode has released over 300 CVEs, and NVD has analyzed fewer than 15 of them. Read on to learn how Veracode SCA operates without NVD providing CVE analysis.
NVD Analysis
When a CVE Numbering Authority (CNA) adds a CVE to the NVD, it is required to include information such as a URL referencing the vulnerability and a description summarizing the vulnerability. Some CNAs provide additional information such as a CVSS score, but this is not required. As part of its analysis process, the NVD therefore plays an important role in enriching the data associated with each CVE. For example, NVD will add a Common Weakness Enumeration (CWE) identifier, Common Vulnerability Scoring System (CVSS) exploitability and impact metrics, and a Common Platform Enumeration (CPE) name to help users identify product names and versions that are either vulnerable or patched.
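To make the enrichment concrete, here is a minimal Python sketch of what an analyzed record might carry and how the product and version can be read out of a CPE name. The `EnrichedCve` class, the `parse_cpe` helper, and every value in the sample record (including the placeholder CVE ID) are made up for illustration; only the field order of the CPE 2.3 formatted string comes from the CPE specification. The parser assumes no component contains an escaped colon.

```python
from dataclasses import dataclass

# Field order of a CPE 2.3 formatted string after the "cpe:2.3:" prefix.
CPE_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


@dataclass
class EnrichedCve:
    """Hypothetical container: CNA-supplied fields plus NVD enrichment."""
    cve_id: str          # supplied by the CNA
    description: str     # supplied by the CNA
    reference_url: str   # supplied by the CNA
    cwe_id: str          # added during analysis
    cvss_vector: str     # added during analysis (unless the CNA scored it)
    cpe: str             # added during analysis


def parse_cpe(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into named fields.

    Simplification: assumes no component contains an escaped colon.
    """
    parts = cpe.split(":")
    if len(parts) != 13 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE_FIELDS, parts[2:]))


# All values below are placeholders, not a real CVE.
record = EnrichedCve(
    cve_id="CVE-0000-0000",
    description="Example description supplied by the CNA.",
    reference_url="https://example.com/advisory",
    cwe_id="CWE-79",
    cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N",
    cpe="cpe:2.3:a:example_vendor:example_lib:1.2.3:*:*:*:*:*:*:*",
)

affected = parse_cpe(record.cpe)
print(affected["vendor"], affected["product"], affected["version"])
```

Without the last three fields, a consumer of the record knows that a vulnerability exists but has no machine-readable way to tell which weakness class it belongs to, how severe it is, or which product versions it affects.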
Why You Should Not Rely Solely on the NVD
While the NVD provides an incredibly valuable service to society, it was never wise to rely on it as your sole source of open-source security data. Consider these three scenarios:
- Silent Fixes. What if an open-source library contributor fixes a security vulnerability and does not tell anyone? Veracode sees these “silent fixes” all the time in commits in GitHub and in bug reports in Bugzilla.
- Delayed Disclosures. What if the vulnerability is not reported in a timely manner? For example, CVE-2018-11776, an Apache Struts Remote Code Execution vulnerability similar to the one that led to the 2017 Equifax breach, was patched in April 2018 but not disclosed until August of that same year, giving hackers four months to take advantage of vulnerable software.
- Systemic Delays. Even when a vulnerability is reported, it can take weeks or months for the CNA to vet it and assign a CVE, and then several more weeks for NVD to complete its analysis. Since all this information is public, these systemic delays give hackers plenty of time to exploit these vulnerabilities.
Veracode’s Proprietary Database
For over a decade, Veracode has cultivated a database that includes not only the open-source vulnerabilities in the NVD, but also undisclosed vulnerabilities in open-source libraries. Sometimes these vulnerabilities fall into the previously described “silent fixes” or “delayed disclosures” categories, but often they fall into the “systemic delays” category. This is why customers see “Reserved CVEs” in Veracode’s database. A CNA will assign a CVE ID to a vulnerability and give it a “Reserved” status when it needs more details about the vulnerability or is waiting for upstream developers to integrate the fix before announcing the vulnerability to the world. This is known as the “embargo period” in responsible disclosure. Veracode will not publish information about embargoed CVEs unless its security research team finds reputable information about a given reserved CVE. The bottom line is that Veracode customers often do not need to wait for this process to complete: Veracode researchers can gather those details and add the vulnerability to Veracode’s database, giving users the opportunity to fix these vulnerabilities before the CVE is analyzed by NVD.
Sources
The NVD is not Veracode’s only source of vulnerability data. The table below lists other sources that Veracode’s researchers utilize.
Source | Details
OSV | Data from osv.dev includes vulnerabilities from GitHub Security Advisories and many other sources.
GitHub issues and commits | Aided by machine learning to find issues and commits related to vulnerabilities, Veracode researchers monitor popular library commit and issue history to reveal nondisclosed vulnerabilities.
Bug trackers | Aided by machine learning to discern bugs from vulnerabilities, Veracode researchers monitor public Bugzilla and Jira tickets to find vulnerabilities.
Mailing lists | Veracode researchers subscribe to various open-source software security mailing lists where maintainers and researchers discuss potential vulnerabilities. Again, machine learning is utilized to filter and prioritize.
Product Advisories and Announcements | To cover times when a vulnerability is only disclosed via a vendor/library security advisory page, Veracode scrapes vendor disclosure pages.
Machine Learning
Veracode has been using machine learning, since long before it became a buzzword, to automate the identification of potential security vulnerabilities from commit messages and bug reports. In open-source projects, bugs are typically tracked with issue trackers, and code changes are merged in the form of commits to source control repositories. Veracode’s vulnerability triage system uses natural language processing and machine learning to identify potential vulnerabilities in open-source libraries with a high level of accuracy. By learning the patterns found in past commit messages and bug-tracking issues, the model can identify when a new commit or bug issue resembles a silent fix of a potential vulnerability. These potential vulnerabilities are then raised to Veracode’s security research team.
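The general idea of learning from past commit messages can be illustrated with a toy text classifier. The sketch below is not Veracode’s model, whose design is not described here; it is a small multinomial naive Bayes classifier written with the Python standard library, and the `CommitTriage` class and all training messages are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class CommitTriage:
    """Toy multinomial naive Bayes over commit-message words (illustrative only)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, message, is_security_fix):
        self.word_counts[is_security_fix].update(tokenize(message))
        self.doc_counts[is_security_fix] += 1

    def _log_score(self, tokens, label):
        counts = self.word_counts[label]
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total = sum(counts.values())
        # Start from the log prior for this label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for token in tokens:
            # Laplace smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[token] + 1) / (total + len(vocab)))
        return score

    def looks_like_security_fix(self, message):
        tokens = tokenize(message)
        return self._log_score(tokens, True) > self._log_score(tokens, False)


# Made-up training data: past messages labeled by a human reviewer.
triage = CommitTriage()
for msg in (
    "fix buffer overflow in header parser",
    "sanitize user input to prevent xss",
    "validate path to prevent directory traversal",
):
    triage.train(msg, True)
for msg in (
    "update readme with install instructions",
    "bump version and update changelog",
    "refactor logging and add unit tests",
):
    triage.train(msg, False)

# New, unlabeled messages: flagged ones would go to a human researcher.
print(triage.looks_like_security_fix("prevent overflow when parsing input"))
print(triage.looks_like_security_fix("update changelog and readme"))
```

A production system would need far more training data and richer features than word counts, but the flow is the same: labeled history in, a flag on new commits out, and a human review of whatever is flagged.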
Research Team
The security research team triages the data provided by the machine learning system and reviews each potential vulnerability for accuracy. When the team discovers a false positive, it feeds that result back into the system to continuously improve the model over time. For true positives, the curation process involves the following steps:
- Determining the vulnerable range. While disclosures sometimes include the version in which the vulnerability was first introduced and the version containing the fix, this information is often missing, so researchers will review the commit history to find when the vulnerability was first introduced and evidence of the fix.
- Verifying the fix is released. Researchers will download the “fixed” version from the respective package manager and confirm that it contains the fixed code.
- Identifying vulnerable methods. Researchers will add which public methods are affected by the vulnerability to the database so that Veracode’s SCA agent scanner can detect when these methods are called by your code.
- Adding details. Rather than copy the description of the vulnerability directly from the disclosure or CVE, researchers will often rephrase it to follow Veracode’s description format and to provide extra details that were discovered after reading the source code.
- Scoring. At this point, the team will have a thorough understanding of the prerequisites for the attack and how the vulnerability works from reading the source code. The researcher assigned to the vulnerability will calculate a CVSS score using the same calculator used by the NVD, and another researcher will conduct a quality assurance review to confirm that the score is correct before publishing it to the Veracode vulnerability database.
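For readers curious about what the scoring step computes, the sketch below implements the base score equations from the CVSS v3.1 specification in Python. It assumes a v3.1 vector string; the article does not say which CVSS version researchers use, and the function names here are illustrative rather than part of any calculator’s API.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required depends on whether scope is changed.
PR = {
    False: {"N": 0.85, "L": 0.62, "H": 0.27},
    True: {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    int_input = round(value * 100000)
    if int_input % 10000 == 0:
        return int_input / 100000.0
    return (math.floor(int_input / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a vector such as 'CVSS:3.1/AV:N/AC:L/...'."""
    # Skip the 'CVSS:3.1' prefix and map each metric name to its value.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    scope_changed = metrics["S"] == "C"

    iss = 1 - (
        (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    )
    if scope_changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = (
        8.22
        * AV[metrics["AV"]]
        * AC[metrics["AC"]]
        * PR[scope_changed][metrics["PR"]]
        * UI[metrics["UI"]]
    )

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope_changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))
```

The arithmetic is mechanical; the judgment lies in choosing each metric value, which is why an understanding of the attack prerequisites and a second reviewer both matter.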
Vulnerable Methods
SCA agent scans create a call graph to see how data and control flow through an application. If data flows through part of an open-source library that Veracode’s security research team has identified as vulnerable, SCA will indicate to developers that their application has a vulnerable method, which opens their application to exploits. Detecting vulnerable methods is important because it gives developers a clear picture of what needs to be fixed first. With over 56% of CVEs having a CVSS rating of high or critical,[3] using severity alone to prioritize is insufficient.
It’s worth noting that some SCA vendors claim to detect “reachability,” which involves checking whether first-party code reaches the vulnerable library, but Veracode’s vulnerable method analysis can pinpoint whether first-party code calls the actual vulnerable method inside the library.
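The difference between library-level reachability and method-level detection can be shown with a small call graph search. This Python sketch is a simplified illustration, not Veracode’s scanner: the graph is hand-written, and the method names and the `reachable_paths` helper are hypothetical.

```python
from collections import deque


def reachable_paths(call_graph, entry_points):
    """Breadth-first search returning {method: call path from an entry point}."""
    paths = {entry: [entry] for entry in entry_points}
    queue = deque(entry_points)
    while queue:
        method = queue.popleft()
        for callee in call_graph.get(method, ()):
            if callee not in paths:
                paths[callee] = paths[method] + [callee]
                queue.append(callee)
    return paths


# Hypothetical call graph: "app.*" is first-party code, "lib.*" is a library.
call_graph = {
    "app.main": ["app.render", "lib.Parser.parse"],
    "app.render": ["lib.Template.render"],
    "lib.Parser.parse": ["lib.Parser.read_token"],
    "lib.Template.render": [],
    "lib.Parser.unsafe_eval": [],  # exists in the library but is never called
}

paths = reachable_paths(call_graph, ["app.main"])

# Library-level check: does first-party code reach the library at all?
library_reached = any(method.startswith("lib.") for method in paths)


def vulnerable_calls(vulnerable_methods):
    """Method-level check: which vulnerable methods are actually called?"""
    return {m: p for m, p in paths.items() if m in vulnerable_methods}


print(library_reached)
print(vulnerable_calls({"lib.Parser.unsafe_eval"}))
print(vulnerable_calls({"lib.Parser.read_token"}))
```

In this example the library is reached either way, so a library-level check would flag the application in both cases. The method-level check only reports a finding when the vulnerable method is on an actual call path, and it returns that path so a developer can see where the call originates.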
License Risks
Unlike the NVD, Veracode’s vulnerability database contains more than just vulnerabilities; it also tracks licenses that are attached to third-party, open-source components and the risks associated with those licenses. These risks are identified during an SCA scan and can help organizations avoid issues related to copyleft licenses.
References
[1] Internet Archive, https://web.archive.org/web/20240212021622/https:/nvd.nist.gov/general/nvd-dashboard
[2] NVD Dashboard, https://nvd.nist.gov/general/nvd-dashboard
[3] CVE Details, https://www.cvedetails.com
Special Thanks
Special thanks to Peter Monaghan, Veracode Product Manager, for valuable contributions to this piece.