By Raula Gaikovina Kula, Daniel M. German, Ali Ouni, Takashi Ishio, and Katsuro Inoue.
In Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE). 2017.
This is the third and final paper in this mini-series on dependencies, taken from recent ESEC/FSE events. The study looks solely at Java projects using Maven, and investigates how security advisories published through the Common Vulnerabilities and Exposures (CVE) project affect library migration. The authors analysed 4,659 projects, performed 8 case studies, and surveyed developers of projects with outdated (and vulnerable) dependencies.
The starting assumption is that developers don’t want dependencies with known security vulnerabilities:
We conjecture that for developers, the awareness of the security advisory is more important than the migration effort needed to migrate the vulnerable dependency.
But this reasonable-sounding assumption doesn’t seem to be backed up by the evidence:
In 2014, Heartbleed, Poodle, Shellshock, — all high profile library vulnerabilities were found to have affected a significant portion of the software industry. In that same year, Sonatype determined that over 6% of the download requests from the Maven Central repository were for component versions that included known vulnerabilities. The company reported that in review of over 1,500 applications, each of them had an average of 24 severe or critical flaws inherited from their components.
So given that many developers don’t appear to consider these issues severe enough to merit upgrading their dependencies (or are unaware of the issues entirely), three research questions are formulated:
- To what extent are developers updating their library dependencies?
- What is the response to important awareness mechanisms such as a new release announcement and a security advisory on library updates?
- Why are developers non responsive to a security advisory?
Library usage
To track library migrations, the authors first define what exactly a migration is. They use L(name, version) to refer to a library and S(name, version) to refer to a system. When a system S(a,b) migrates to a library L(x,y), it creates a dependency between them.
This model can track how frequently developers migrate their dependencies, and how many migrations occur in a single system version update. The authors also define “library usage” (LU) as the number of systems depending on a library at a given point in time.
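The model above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the `DependencyRecord` type, its `released` field, and the rule "a system counts towards usage through its most recent release on or before the time point" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Library:
    """L(name, version) in the paper's notation."""
    name: str
    version: str


@dataclass(frozen=True)
class System:
    """S(name, version) in the paper's notation."""
    name: str
    version: str


@dataclass(frozen=True)
class DependencyRecord:
    """Hypothetical record: a system version depends on a library version."""
    system: System
    library: Library
    released: date  # assumed: release date of the system version


def library_usage(records, library, at):
    """Count systems whose latest release on or before `at` depends on `library`."""
    latest = {}  # system name -> (release date, set of libraries it depends on)
    for r in records:
        if r.released > at:
            continue  # this system version did not exist yet
        current = latest.get(r.system.name)
        if current is None or r.released > current[0]:
            latest[r.system.name] = (r.released, {r.library})
        elif r.released == current[0]:
            current[1].add(r.library)
    return sum(1 for _, libs in latest.values() if library in libs)
```

Evaluating `library_usage` at a series of dates gives the kind of usage-over-time data that a library migration plot is built from.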
The authors selected 4,659 Java projects (filtered down from 10,523) from GitHub which (a) have more than 100 commits; (b) have a commit between January and November of 2015; (c) are not duplicates (determined by project name); and (d) use Maven. As a project may contain multiple systems, the authors then extracted 48,495 systems from the 4,659 projects, and found 852,322 total migrations.
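The four selection criteria can be read as a simple filter. The sketch below is only an interpretation: the `Project` fields are hypothetical, the date window is assumed to run from 1 January to 30 November 2015 inclusive, and keeping the first project seen for each name is an assumed way of removing duplicates.

```python
from dataclasses import dataclass
from datetime import date

# Assumed reading of "between January and November of 2015"
WINDOW_START = date(2015, 1, 1)
WINDOW_END = date(2015, 11, 30)


@dataclass
class Project:
    """Hypothetical summary of a GitHub project."""
    name: str
    commit_count: int
    commit_dates: list  # list of datetime.date
    uses_maven: bool


def select_projects(projects):
    """Apply criteria (a) to (d), keeping the first project seen per name."""
    seen_names = set()
    selected = []
    for p in projects:
        if p.commit_count <= 100:  # (a) more than 100 commits
            continue
        if not any(WINDOW_START <= d <= WINDOW_END for d in p.commit_dates):
            continue  # (b) at least one commit inside the window
        if p.name in seen_names:  # (c) duplicates determined by project name
            continue
        if not p.uses_maven:  # (d) must use Maven
            continue
        seen_names.add(p.name)
        selected.append(p)
    return selected
```

The order in which the checks run affects which of two same-named projects survives, so a real pipeline would need to pin that down.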
For example, the paper shows a library migration plot (LMP) for the Apache Commons beanutils library, for which CVE-2014-0114 was published in April 2014 (marked by a dashed black line). Don’t read too much into the curves:
The LMP shows LU changes in the library (y-axis) with respect to time (x-axis). It is important to note that the LMP curve itself should not be taken at face value, as the smoothing algorithm is generated by a predictive model and it is not a true reflection of all data points.
The authors use these library migration plots to judge how developers respond to announcements of new releases and CVEs. The authors pick eight libraries in particular to examine, based on library usage trends. For each, they looked at the online documentation and version numbering to judge the effort required to perform the migration. The releases selected are:
- google-guava (16.0.1, 17.0, and 18.0)
- junit (3.8.1, 4.10, 4.11)
- log4j (1.2.15, 1.2.16, 1.2.17)
- commons-beanutils (1.9.1, 1.9.2)
- commons-fileupload (1.2.2, 1.3, 1.3.1)
- commons-httpclient (3.1, 4.2.2)
- httpcomponents (4.2.2, 4.2.3, 4.2.5)
- commons-compress (1.4, 1.4.1)
Finally, the authors sent a short email survey to developers of projects that had not responded to a CVE, asking whether they were aware of the vulnerability and why they had not updated. They received a total of 16 responses.
Library migration in practice
First, the authors look at migration from the perspective of systems.
We discover that systems tend to have a lot of library dependencies, but do not perform many dependency updates (“DUs”) when they make a new release. We also see that there is just about no correlation between the number of library dependencies and the number of dependency updates.
This result confirms the hypothesis that the number of library dependencies in a system does not influence the frequency of updates.
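To make "no correlation" concrete, the check can be pictured as computing a correlation coefficient over pairs of (number of dependencies, number of dependency updates), one pair per system. The summary here does not say which statistic the authors used, so the Pearson coefficient below is an assumption chosen for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

A value near 0 would match the observation that systems with many dependencies do not update more (or less) often; values near 1 or -1 would indicate a strong linear relationship.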
The authors then look at this from a library perspective, and find that library versions tend to slowly reach peak usage, which then steadily declines as developers migrate away. However, many systems remain on outdated dependencies. For L(log4j, 1.2.15), for example, 98% of the systems using it had not migrated away at the time of the study.
To answer (RQ1): (i) although system heavily depend on libraries, most systems rarely update their libraries and (ii) systems are less likely migrate their library dependencies, with 81.5% of systems remaining with a popular older version.
Developer responsiveness to awareness mechanisms
The authors examine library usage trends to judge how developers respond to awareness mechanisms such as new releases and CVEs.
For google-guava, usage shifts quickly to each new release. The authors speculate that migrating between google-guava versions is fairly easy, which explains the quick change:
We find that the reasons for consistent migration trends are mainly related to the estimated migration effort required to complete the migration process. Through inspection of the online documentation, we find that migration from L(NR1, 16.0.1) to L(NR1, 17.0) contains 10 changed packages. Similarly, migration from L(NR1, 17.0) to L(NR1, 18.0) also contained 7 changed packages. Yet, all three library versions require the same Java 5 environment which indicates no significant changes to the overall architectural design of the library. From the documentation, we deduce that popular use of L(NR1, 18.0) is due to the prolonged period between the next release of L(NR1, 19.0), which is more that a year after the release of L(NR1, 18.0) in December 10, 2015
For junit, migration is more challenging, which may contribute to the prolonged usage of older versions.
Similar to the consistent migration to a new release, we find that the reason for a non response to a migration opportunity is related to the estimated migration effort. For instance, as shown in Figure 8(b), the newer Junit version 4 series libraries requires a change of platform to Java 5 or higher (L(NR2, 4.10) and L(NR2, 4.11)), inferring significant changes to the architectural design of the library. Intuitively, we see that even though L(NR2, 3.8.1) is older, it still maintains its maximum library usage (i.e., current LU and peak LU=342).
As new releases are made, library usage begins to gradually trend down. This seems reasonable: having newer dependencies is nice, but not essential.
For vulnerabilities, however, we would like to see a much sharper decline. The commons-beanutils library shows this pattern. A vulnerability is announced, a release is made shortly afterwards, and the new version rapidly gains in popularity. Developers even appear to begin migrating away from the vulnerable version before the new one is released, in this case.
Unfortunately, this is not always the case. A vulnerability was announced in the commons-httpclient library, but its usage kept growing. The authors speculate that this is because the migration effort was too high: there was no new release of commons-httpclient; instead, the library was deprecated in favour of the new httpcomponents library:
The estimated migration effort and the lack of a viable replacement dependency are some of the possible reasons why affected maintainers show no response to the security advisory. This is shown in the case of the Httpcomponents library, which is the successor and replacement for commons-httpclient library. As documented, Httpcomponents is a major upgrade with many architectural design modifications compared to the older commons-httpclient dependency versions.
Developer feedback on vulnerable dependencies
Finally, we get the results of the survey. Of the 16 responses, 11 (69%) were unaware that there was a vulnerability at all! However, in some cases, vulnerable dependencies are not exposed in a way which introduces a security hole at all. One developer noted:
It’s only a test scoped dependency which means that it’s not a transitive dependency for users of XXX so there is no harm done. XXX has no external compile scoped dependencies thus there is no real need to update dependencies.
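The distinction this developer draws comes from Maven dependency scopes: a dependency with no `<scope>` element defaults to `compile`, while `test`-scoped dependencies are not passed on transitively to consumers. A rough way to see which declared dependencies fall into which scope is to group them when reading a `pom.xml`. The sketch below uses only Python's standard library and is deliberately limited: it reads the top-level `<dependencies>` section and ignores parent POMs, profiles, and `<dependencyManagement>`. The function name is made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard Maven POM files
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def dependencies_by_scope(pom_xml):
    """Group a POM's directly declared dependencies by scope.

    Returns a dict mapping scope -> list of (groupId, artifactId, version).
    The version is None when the POM does not state it directly.
    """
    root = ET.fromstring(pom_xml)
    grouped = {}
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        group_id = (dep.findtext("m:groupId", "", POM_NS) or "").strip()
        artifact_id = (dep.findtext("m:artifactId", "", POM_NS) or "").strip()
        version = dep.findtext("m:version", None, POM_NS)
        # Maven treats a missing <scope> as "compile"
        scope = (dep.findtext("m:scope", "compile", POM_NS) or "compile").strip()
        entry = (group_id, artifact_id, version.strip() if version else None)
        grouped.setdefault(scope, []).append(entry)
    return grouped
```

A vulnerable library that only ever appears under `test` is a weaker argument for urgent migration than one under `compile`, which is the point the developer is making.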
Some developers seem to view upgrading dependencies as a luxury which they can’t afford:
I subscribed to the CVE RSS recently and I don’t check it regularly, so even if I might have heard of the current vulnerability, I simply forgot to address it. We also had some emergencies recently (developing features for our customers), that makes the security issues less prio than releasing the ordered features :-/ … Anyway, our security approach is far from perfect, I am aware of it, and I’m willing to improve this, but sometimes it is difficult to explain our customers that it is a main point to consider in the development process.
Unfortunately, customers and users are often unsympathetic to things without an immediate impact. If a security hole isn’t causing problems now, even if it might in the future, then they want new features rather than better security.
Dependencies are hard!
Library usage is not only common, but encouraged as good practice. Yet 81.5% of the systems the authors studied keep outdated dependencies. Even when there is a published security issue, developers often do not migrate. Updating dependencies is considered something nice to do in your spare time, but not really a focus. This is not a great situation.
The study provides motivation for our community develop strategies to improve a developer personal perception of third-party updates, especially in cases when effort must be allocated to mitigate a severe vulnerability risk. Visual aids such as the Library Migration Plots (LMP) provide a rich visual analysis, which proves to be a useful awareness and motivation for developers to identify dependency migration opportunities. We envision this work as a contribution toward developing strategies and support tools that aid the management of third-party dependencies.