As a Data Scientist at SourceClear I get to analyze lots of interesting vulnerability data as well as anonymized project data. New customers often ask us what "normal" looks like when it comes to vulnerabilities in their projects, so I thought I'd take a look and share a few insights.
How many projects have vulnerabilities, and how many do they usually have?
I looked at projects analyzed with SourceClear and broke them out by language. Unsurprisingly, most projects have a handful of vulnerabilities. 80% of JavaScript projects have vulnerabilities, with an average of 7 vulnerabilities per project, while almost 60% of Java projects still have vulnerabilities, despite Java's comparatively more robust history with security tools.
What's up with all these dependencies?
When we analyze your projects, we first build a full dependency graph to see what libraries are in use. Those libraries you specify are called 'direct' dependencies, but that's not everything. Your dependencies have dependencies, called 'transitive' dependencies. Your package manager resolves this whole graph until you've got dozens, sometimes hundreds of libraries inside your projects.
How do these direct vs. transitive libraries break down by language? Java projects look pretty tame with a smaller number of dependencies. JavaScript projects are another story. For every JavaScript dependency you add, you end up with about 8 others coming along for the ride, with an average of 350 dependencies total.
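To make the direct vs. transitive distinction concrete, here is a minimal sketch of how a resolved dependency graph can be split into the two groups. It is not how SourceClear or any particular package manager does it. The graph, the project name `my-app`, and the function `split_dependencies` are all made up for illustration, and the edges are not real package metadata.

```python
from collections import deque


def split_dependencies(graph, project):
    """Return (direct, transitive) sets of library names for `project`.

    `graph` maps a name to the list of libraries it declares.
    A library that is both declared by the project and pulled in by
    another library is counted as direct.
    """
    direct = set(graph.get(project, []))
    seen = set(direct)
    queue = deque(direct)
    # Breadth-first walk over everything reachable from the direct set.
    while queue:
        lib = queue.popleft()
        for dep in graph.get(lib, []):
            if dep not in seen and dep != project:
                seen.add(dep)
                queue.append(dep)
    return direct, seen - direct


# Hypothetical graph, for illustration only.
graph = {
    "my-app": ["express", "mocha"],
    "express": ["debug", "inherits"],
    "mocha": ["debug", "minimatch"],
    "debug": ["ms"],
    "minimatch": ["inherits"],
}

direct, transitive = split_dependencies(graph, "my-app")
print(sorted(direct))      # ['express', 'mocha']
print(sorted(transitive))  # ['debug', 'inherits', 'minimatch', 'ms']
```

Even in this toy graph, two declared libraries bring four more along with them.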
What are the most popular dependencies?
With every app so full of dependencies, let's take a look to see which ones are the most popular, both direct and transitive.
Top Java Dependencies
Direct | Transitive |
---|---|
guava (29%) | slf4j (46%) |
avro (29%) | jackson (45%) |
log4j (26%) | jackson datamapper (43%) |
Top JavaScript Dependencies
Direct | Transitive |
---|---|
mocha (29%) | inherits (86%) |
express (29%) | minimatch (80%) |
eslint (20%) | ms (79%) |
OK, what does this mean? Well, according to our data, 86% of JavaScript applications rely on the inherits library, for example, mostly as a transitive dependency. The top transitive libraries each appear in at least 70% of projects: inherits, ms, minimatch, mkdirp, minimist, isarray, core-util-is. That means at least 70% of JavaScript projects rely on each of these libraries, possibly without even knowing it.
Any one of these libraries could become the next left-pad: a single vulnerability in one of them would affect the majority of JavaScript projects.
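If you want to run the same kind of popularity count over your own projects, the idea can be sketched roughly like this. It assumes you already have the full resolved dependency list for each project. The project names, the sample lists, and the function `library_prevalence` are hypothetical, and the numbers it prints are not our data.

```python
from collections import Counter


def library_prevalence(projects):
    """Map each library to the fraction of projects that include it.

    `projects` maps a project name to its resolved dependency list
    (direct and transitive together).
    """
    if not projects:
        return {}
    counts = Counter()
    for libs in projects.values():
        # Use a set so a library listed twice is counted once per project.
        counts.update(set(libs))
    total = len(projects)
    return {lib: n / total for lib, n in counts.items()}


# Made-up sample input, for illustration only.
projects = {
    "app-a": ["inherits", "ms", "express"],
    "app-b": ["inherits", "minimatch"],
    "app-c": ["inherits", "ms"],
    "app-d": ["mocha", "ms"],
}

prevalence = library_prevalence(projects)
for lib, share in sorted(prevalence.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{lib}: {share:.0%}")
```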
Where do vulnerabilities typically come from?
In every language but Python, oddly enough, most vulnerabilities are introduced through transitive dependencies.
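For a single project, this kind of attribution comes down to a couple of set operations. The sketch below assumes you know which libraries you declared, the full resolved set, and which libraries have known vulnerabilities. All library names and the "vulnerable" list are placeholders, not real advisories, and `vulnerabilities_by_origin` is a hypothetical helper.

```python
def vulnerabilities_by_origin(direct, resolved, vulnerable):
    """Split vulnerable libraries into direct and transitive groups."""
    direct = set(direct)
    # Anything resolved but not declared by the project is transitive.
    transitive = set(resolved) - direct
    vulnerable = set(vulnerable)
    return {
        "direct": sorted(vulnerable & direct),
        "transitive": sorted(vulnerable & transitive),
    }


# Placeholder names, for illustration only.
result = vulnerabilities_by_origin(
    direct=["web-framework", "test-runner"],
    resolved=["web-framework", "test-runner", "logger", "time-parser", "glob-matcher"],
    vulnerable=["time-parser", "glob-matcher", "web-framework"],
)
print(result)
# {'direct': ['web-framework'], 'transitive': ['glob-matcher', 'time-parser']}
```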
Thanks
Drop us a line if there's something else you'd like us to dig into. Of course you can analyze your own projects with SourceClear too, to see what dependencies lurk in your projects, and which of them may be vulnerable.