There are at least six types of open-source library vulnerabilities that we should all be concerned about. Before describing them, it is worth reiterating that simply linking to a vulnerable library in your project doesn’t mean your application has a vulnerability. You only have a vulnerability if you use the vulnerable methods of the vulnerable library in a vulnerable manner. This is important. To know this for sure, you need to look at the call graph of your application and check whether there is a call chain from your code to the vulnerable method, wherever that method sits in the dependency graph.
Listing vulnerable libraries is easy but actually determining a vulnerability is hard.
For every vulnerable library in use, we see only a very small percentage (single digits) of the projects that depend on it actually using the vulnerable methods, and we have a strong hunch that across all developers this is likely to be a tiny percentage (something like 0.1%). Another way to read that: by just listing vulnerable libraries, there is 'maybe' a 99.9% false positive rate. I will follow up on that hunch with real stats soon.
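The call-graph check described above can be sketched as a breadth-first search. This is a minimal illustration rather than a real analyzer: it assumes the call graph has already been extracted into a dictionary, and every method name in it (`app.main`, `lib.unsafe_deserialize` and so on) is hypothetical.

```python
from collections import deque


def find_call_chain(call_graph, entry_points, vulnerable_methods):
    """Return one call chain from an entry point to a vulnerable method,
    or None if no vulnerable method is reachable."""
    targets = set(vulnerable_methods)
    visited = set(entry_points)
    queue = deque([entry] for entry in entry_points)  # each item is a path
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current in targets:
            return path
        for callee in call_graph.get(current, ()):
            if callee not in visited:
                visited.add(callee)
                queue.append(path + [callee])
    return None


# Hypothetical application that links the library but never calls the bad method.
linked_only = {
    "app.main": ["app.parse_request", "app.render"],
    "app.parse_request": ["lib.safe_decode"],
    "app.render": ["lib.format"],
    "lib.unsafe_deserialize": ["lib.exec"],
}

# Same application, except app.render now calls the vulnerable method.
actually_used = dict(linked_only)
actually_used["app.render"] = ["lib.format", "lib.unsafe_deserialize"]

vulnerable = ["lib.unsafe_deserialize"]
unused_chain = find_call_chain(linked_only, ["app.main"], vulnerable)
used_chain = find_call_chain(actually_used, ["app.main"], vulnerable)
```

In the first graph the library is present but no chain is found, which is the "false positive" case; in the second a chain is returned and can be shown to the developer as evidence.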
The six types of open-source library vulnerabilities are:
- Disclosed - a vulnerability about which information is available in public places such as change-logs, commit-logs, issue-trackers, mailing-lists and vulnerability databases (including CVEs in the NVD). (1)(2)
- Inherited - a new vulnerability that results from a library depending on another library with a disclosed vulnerability and reaching the vulnerable code via both its dependency graph and its call graph (both conditions are needed). (3)
- Embedded - a new vulnerability that results from a library embedding code from a library with a disclosed vulnerability (usually as a result of cut-and-paste). (4)
- Similar - a new vulnerability that is the same as or similar to a previously disclosed vulnerability but is found in a different library.
- Reintroduced - a disclosed vulnerability that was fixed in a previous release of the library but has been reintroduced in a later version of the same library. (5)
- Zero Days - new issues that have not yet been publicly disclosed but are known to someone and are likely being used by the bad guys in the wild.
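The "both conditions needed" rule for the Inherited type can be sketched as two reachability checks, one over the dependency graph and one over the call graph. This is only an illustration under assumptions: both graphs are taken to be pre-computed dictionaries, and the library and method names (`template-engine`, `parser-lib`, `parser-lib.eval_unsafe`) are made up for the example.

```python
def reachable(graph, starts, targets):
    """Depth-first search: is any target reachable from any start node?"""
    targets = set(targets)
    stack = list(starts)
    seen = set(stack)
    while stack:
        node = stack.pop()
        if node in targets:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def inherits_vulnerability(dep_graph, call_graph, library, library_methods,
                           vulnerable_library, vulnerable_methods):
    """True only if the library depends (directly or transitively) on the
    vulnerable library AND one of its methods can reach a vulnerable method."""
    depends = reachable(dep_graph, [library], [vulnerable_library])
    calls = reachable(call_graph, library_methods, vulnerable_methods)
    return depends and calls


# Hypothetical graphs: library -> dependencies, method -> callees.
dep_graph = {
    "web-framework": ["template-engine"],
    "template-engine": ["parser-lib"],
}
call_graph = {
    "template-engine.render": ["template-engine.compile"],
    "template-engine.compile": ["parser-lib.parse"],
}

# Dependency exists, but the vulnerable method is never called.
not_inherited = inherits_vulnerability(
    dep_graph, call_graph, "template-engine", ["template-engine.render"],
    "parser-lib", ["parser-lib.eval_unsafe"])

# Dependency exists and the vulnerable method is on a call chain.
inherited = inherits_vulnerability(
    dep_graph, call_graph, "template-engine", ["template-engine.render"],
    "parser-lib", ["parser-lib.parse"])
```

Keeping the two checks separate makes it easy to report which condition failed, which is the difference between "lists a vulnerable library" and "has inherited the vulnerability".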
We estimate that the total gene pool of open-source library vulnerabilities is likely made up of something like Type 1: 5-10% (and CVEs are < 5% of this 5-10%), Type 2: 20-30%, Type 3: 20-30%, Type 4: 2-3% and Type 5: 50%.
References
- (1) CVEs make up a small percentage of the disclosed vulnerabilities.
- (2) Tracking disclosed issues requires you to be good at scavenging on the Internet (or to have a team using Natural Language Processing to track these things at scale).
- (3) This means the full dependency graph, i.e. the direct and transitive dependencies, like Apache Commons Collections.
- (4) Like these issues we disclosed in Handlebars.
- (5) Like this issue we disclosed in Spring Social.