Do Developers Update Their Library Dependencies?

Tags esec, fse, paper summary, research
Target Audience Computer science people.

By Raula Gaikovina Kula, Daniel M. German, Ali Ouni, Takashi Ishio, and Kat­suro Inoue.
In Joint Meeting of the European Soft­ware En­gin­eering Con­fer­ence and the ACM SIG­SOFT Sym­posium on the Found­a­tions of Soft­ware En­gin­eering (ESEC/F­SE). 2017.
Paper / Con­fer­ence

This is the third and final paper in this mini-series on dependencies taken from recent ESEC/FSE events. This study looks solely at Java projects using Maven, and investigates how the Common Vulnerabilities and Exposures (CVE) project affects library migration. The authors looked at 4,659 projects, performed 8 case studies, and surveyed developers of projects with outdated (and vulnerable) dependencies.

The starting as­sump­tion is that de­velopers don’t want de­pend­en­cies with known se­curity vul­ner­ab­il­it­ies:

We con­jec­ture that for de­velopers, the aware­ness of the se­curity ad­visory is more im­portant than the mi­gra­tion ef­fort needed to mi­grate the vul­ner­able de­pend­ency.

But this reas­on­able-­sounding as­sump­tion doesn’t seem to be backed up by the evid­ence:

In 2014, Heart­bleed, Poodle, Shell­shock, — all high pro­file lib­rary vul­ner­ab­il­ities were found to have af­fected a sig­ni­ficant por­tion of the soft­ware in­dustry. In that same year, Son­a­type de­term­ined that over 6% of the down­load re­quests from the Maven Central re­pos­itory were for com­ponent ver­sions that in­cluded known vul­ner­ab­il­it­ies. The com­pany re­ported that in re­view of over 1,500 ap­plic­a­tions, each of them had an av­erage of 24 severe or crit­ical flaws in­her­ited from their com­pon­ents.

So given that many de­velopers don’t ap­pear to con­sider these is­sues severe enough to merit up­grading their de­pend­en­cies (or are un­aware of the is­sues en­tirely), three re­search ques­tions are for­mu­lated:

Library usage

To track library migrations, the authors first define what exactly a migration is. They use L(name, version) to refer to a library and S(name, version) to refer to a system. When a system S(a,b) migrates to a library L(x,y), it creates a dependency between them.

Library migration between systems and libraries

This model can track how fre­quently de­velopers mi­grate their de­pend­en­cies, and how many mi­gra­tions oc­curred in a single system ver­sion up­date. The au­thors also define “lib­rary us­age” as the number of sys­tems de­pending on a lib­rary at a given point in time.
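To make the model a little more concrete, here is a minimal Python sketch. The class names `Library` and `SystemRelease` and the helpers `dependency_updates` and `library_usage` are hypothetical, not taken from the paper. The sketch assumes that a system counts towards a library's usage if its latest release on or before the given date depends on that exact library version, and it only treats a version change of an already-used library as an update.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Library:
    """L(name, version) in the paper's notation."""
    name: str
    version: str


@dataclass(frozen=True)
class SystemRelease:
    """S(name, version), plus a release date and dependencies (assumed fields)."""
    name: str
    version: str
    released: date
    dependencies: frozenset


def dependency_updates(old, new):
    """Pairs (from, to) for libraries whose version changed between two releases."""
    old_by_name = {lib.name: lib for lib in old.dependencies}
    pairs = [
        (old_by_name[lib.name], lib)
        for lib in new.dependencies
        if lib.name in old_by_name and old_by_name[lib.name] != lib
    ]
    return sorted(pairs, key=lambda pair: pair[0].name)


def library_usage(releases, library, when):
    """Number of systems whose latest release on or before `when` uses `library`."""
    latest = {}
    for release in releases:
        if release.released > when:
            continue
        current = latest.get(release.name)
        if current is None or release.released > current.released:
            latest[release.name] = release
    return sum(1 for release in latest.values() if library in release.dependencies)


# Made-up example data: two systems, one of which migrates to 1.9.2.
b191 = Library("beanutils", "1.9.1")
b192 = Library("beanutils", "1.9.2")
releases = [
    SystemRelease("app", "1.0", date(2014, 1, 1), frozenset({b191})),
    SystemRelease("app", "1.1", date(2014, 6, 1), frozenset({b192})),
    SystemRelease("tool", "2.0", date(2014, 2, 1), frozenset({b191})),
]
print(library_usage(releases, b191, date(2014, 3, 1)))
print(dependency_updates(releases[0], releases[1]))
```

Sampling `library_usage` at regular dates for one library version gives the kind of time series that the plots in the paper are drawn from.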

The au­thors se­lected 4,659 Java pro­jects (filtered down from 10,523) from GitHub which (a) have more than 100 com­mits; (b) have a commit between January and November of 2015; (c) are not du­plic­ates (de­termined by pro­ject name); and (d) use Maven. As a pro­ject may con­tain mul­tiple sys­tems, the au­thors then ex­tracted 48,495 sys­tems from the 4,659 pro­jects, and found 852,322 total mi­gra­tions.
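The four selection criteria can be sketched as a simple filter. This is only an illustration under assumptions: the `Project` record and its fields are hypothetical, duplicates are assumed to be detected by exact name match with the first occurrence kept, and "between January and November of 2015" is assumed to mean 1 January to 30 November inclusive.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Project:
    """Hypothetical record of a GitHub project; not the authors' data format."""
    name: str
    commit_dates: list = field(default_factory=list)
    uses_maven: bool = False


def select_projects(projects, start=date(2015, 1, 1), end=date(2015, 11, 30)):
    """Apply criteria (a) to (d) and return the surviving projects in input order."""
    seen_names = set()
    selected = []
    for project in projects:
        if len(project.commit_dates) <= 100:        # (a) more than 100 commits
            continue
        if not any(start <= d <= end for d in project.commit_dates):  # (b)
            continue
        if project.name in seen_names:              # (c) duplicate project name
            continue
        if not project.uses_maven:                  # (d) must use Maven
            continue
        seen_names.add(project.name)
        selected.append(project)
    return selected


# Made-up candidates to exercise each criterion.
active = [date(2015, 3, 1)] * 101
candidates = [
    Project("alpha", active, uses_maven=True),
    Project("alpha", active, uses_maven=True),                    # duplicate name
    Project("beta", [date(2015, 3, 1)] * 50, uses_maven=True),    # too few commits
    Project("gamma", [date(2013, 5, 1)] * 200, uses_maven=True),  # inactive in 2015
    Project("delta", active, uses_maven=False),                   # not Maven
]
print([p.name for p in select_projects(candidates)])
```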

Library migration for L(beanutils,1.9.1) and L(beanutils,1.9.2)

For example, here we see a library migration plot (LMP) for the Apache Commons beanutils library, for which CVE-2014-0114 was published in April 2014 (the dashed black line). The plot tracks library usage (LU) over time. Don’t read too much into the curves:

The LMP shows LU changes in the lib­rary (y-axis) with re­spect to time (x-ax­is). It is im­portant to note that the LMP curve it­self should not be taken at face value, as the smoothing al­gorithm is gen­er­ated by a pre­dictive model and it is not a true re­flec­tion of all data points.

The au­thors use these lib­rary mi­gra­tion plots to judge how de­velopers re­spond to an­nounce­ments of new re­leases and CVEs. The au­thors pick eight lib­raries in par­tic­ular to ex­am­ine, based on lib­rary usage trends. For each, they looked at the on­line doc­u­ment­a­tion and ver­sion num­bering to judge the ef­fort re­quired to per­form the mi­gra­tion. The re­leases se­lected are:

Finally, the authors sent a short email survey to developers of projects that were non-responsive to a CVE, asking whether they were aware of the vulnerability and why they hadn’t updated. They received a total of 16 responses.

Library migration in practice

First, the authors look at this from a systems perspective.

The dependencies and updates of systems

We dis­cover that sys­tems tend to have a lot of lib­rary de­pend­en­cies, but do not per­form many de­pend­ency up­dates (“­DUs”) when they make a new re­lease. We also see that there is just about no cor­rel­a­tion between the number of lib­rary de­pend­en­cies and the number of de­pend­ency up­dates.

This result con­firms the hy­po­thesis that the number of lib­rary de­pend­en­cies in a system does not in­flu­ence the fre­quency of up­dates.
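To illustrate what "no correlation" means here, the following sketch computes a Pearson correlation coefficient between the number of dependencies and the number of dependency updates per system. This summary does not say which statistic the authors used, so Pearson is an assumption, and the sample numbers are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


# Invented per-system counts: dependencies vs. dependency updates.
dependency_counts = [1, 2, 3, 4]
update_counts = [1, -1, -1, 1]  # centred values, chosen so the result is zero
print(pearson(dependency_counts, update_counts))
```

A coefficient near zero, as in this toy data, means that knowing how many dependencies a system has tells you nothing about how often it updates them.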

The authors then look at this from a library perspective, and find that library versions tend to slowly reach peak usage, which then steadily declines as developers migrate away. However, many systems remain on outdated dependencies, such as L(log4j, 1.2.15): 98% of the systems using it had not migrated away at the time of the study.
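One way to read such a usage trend is to compare a library version's peak usage with its current usage. The sketch below is an assumption about how that comparison could be made, not the authors' method; the function name `usage_summary` and the sample data are hypothetical.

```python
from datetime import date


def usage_summary(samples):
    """Summarise (date, library_usage) samples for one library version."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)                         # oldest sample first
    peak_date, peak = max(ordered, key=lambda s: s[1])
    current = ordered[-1][1]                          # most recent sample
    retained = current / peak if peak else 0.0        # share of peak still in use
    return {"peak": peak, "peak_date": peak_date,
            "current": current, "retained": retained}


# Invented samples: usage rises to a peak, then slowly declines.
samples = [
    (date(2013, 1, 1), 40),
    (date(2014, 1, 1), 100),
    (date(2015, 1, 1), 90),
]
print(usage_summary(samples))
```

A `retained` value close to 1.0 long after newer versions exist suggests that systems are staying on the old version rather than migrating.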

To an­swer (R­Q1): (i) al­though system heavily de­pend on lib­rar­ies, most sys­tems rarely up­date their lib­raries and (ii) sys­tems are less likely mi­grate their lib­rary de­pend­en­cies, with 81.5% of sys­tems re­maining with a pop­ular older ver­sion.

Developer responsiveness to awareness mechanisms

The authors examine library usage trends to judge how developers respond to awareness mechanisms such as new releases and CVEs.

Library usage trends for consecutive releases of google-guava

The au­thors spec­u­late that, in this case, mi­grating between google-guava ver­sions is fairly easy, which in­flu­ences the quick change:

We find that the reasons for con­sistent mi­gra­tion trends are mainly re­lated to the es­tim­ated mi­gra­tion ef­fort re­quired to com­plete the mi­gra­tion pro­cess. Through in­spec­tion of the on­line doc­u­ment­a­tion, we find that mi­gra­tion from L(NR1, 16.0.1) to L(NR1, 17.0) con­tains 10 changed pack­ages. Sim­il­arly, mi­gra­tion from L(NR1, 17.0) to L(NR1, 18.0) also con­tained 7 changed pack­ages. Yet, all three lib­rary ver­sions re­quire the same Java 5 en­vir­on­ment which in­dic­ates no sig­ni­ficant changes to the overall ar­chi­tec­tural design of the lib­rary. From the doc­u­ment­a­tion, we de­duce that pop­ular use of L(NR1, 18.0) is due to the pro­longed period between the next re­lease of L(NR1, 19.0), which is more that a year after the re­lease of L(NR1, 18.0) in December 10, 2015

Library usage trends for consecutive releases of junit

For ju­nit, mi­gra­tion is more chal­len­ging, which may con­tribute to the pro­longed usage of older ver­sions.

Sim­ilar to the con­sistent mi­gra­tion to a new re­lease, we find that the reason for a non re­sponse to a mi­gra­tion op­por­tunity is re­lated to the es­tim­ated mi­gra­tion ef­fort. For in­stance, as shown in Figure 8(b), the newer Junit ver­sion 4 series lib­raries re­quires a change of plat­form to Java 5 or higher (L(NR2, 4.10) and L(NR2, 4.11)), in­fer­ring sig­ni­ficant changes to the ar­chi­tec­tural design of the lib­rary. In­tu­it­ively, we see that even though L(NR2, 3.8.1) is older, it still main­tains its max­imum lib­rary usage (i.e., cur­rent LU and peak LU=342).

As new re­leases are made, lib­rary usage be­gins to gradu­ally trend down. This seems reas­on­able: having newer de­pend­en­cies is nice, but not es­sen­tial.

Library usage trends for consecutive releases of commons-beanutils

For vul­ner­ab­il­it­ies, however, we would like to see a much sharper de­cline. The com­mon­s-­bea­nutils lib­rary shows this pat­tern. A vul­ner­ab­ility is an­nounced, a re­lease is made shortly af­ter­wards, and the new ver­sion rap­idly gains in pop­ular­ity. De­velopers even ap­pear to begin mi­grating away from the vul­ner­able ver­sion be­fore the new one is re­leased, in this case.

Library usage trends for commons-httpclient and httpcomponents

Unfortunately, this is not always the case. Here we see that a vulnerability was announced in the commons-httpclient library, but its usage kept growing. The authors speculate that this is because the migration effort was too high: there was no new release of commons-httpclient; rather, the library was deprecated in favour of the new httpcomponents library:

The es­tim­ated mi­gra­tion ef­fort and the lack of a vi­able re­place­ment de­pend­ency are some of the pos­sible reasons why af­fected main­tainers show no re­sponse to the se­curity ad­vis­ory. This is shown in the case of the Ht­tp­com­pon­ents lib­rary, which is the suc­cessor and re­place­ment for com­mon­s-ht­tp­client lib­rary. As doc­u­mented, Ht­tp­com­pon­ents is a major up­grade with many ar­chi­tec­tural design modi­fic­a­tions com­pared to the older com­mon­s-ht­tp­client de­pend­ency ver­sions.

Developer feedback on vulnerable dependencies

Finally, we get the results of the survey. Of the 16 responses, 11 (69%) were unaware that there was a vulnerability at all! However, in some cases, vulnerable dependencies are not exposed in a way that introduces a security hole. One developer noted:

It’s only a test scoped de­pend­ency which means that it’s not a trans­itive de­pend­ency for users of XXX so there is no harm done. XXX has no ex­ternal com­pile scoped de­pend­en­cies thus there is no real need to up­date de­pend­en­cies.
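The distinction this developer draws comes from Maven's dependency scopes: a dependency with no `<scope>` element defaults to `compile`, while `test`-scoped dependencies are not passed on to downstream users. As a rough sketch, the following groups the dependencies declared in a POM by scope. The function name and the sample POM are made up for illustration, and the code only reads the top-level `<dependencies>` section; it does not resolve properties, parent POMs, or `<dependencyManagement>`.

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven POM files.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def dependencies_by_scope(pom_xml):
    """Group 'group:artifact:version' strings by their declared Maven scope."""
    root = ET.fromstring(pom_xml)
    by_scope = {}
    for dep in root.findall("m:dependencies/m:dependency", NS):
        group = dep.findtext("m:groupId", default="", namespaces=NS)
        artifact = dep.findtext("m:artifactId", default="", namespaces=NS)
        version = dep.findtext("m:version", default="", namespaces=NS)
        # Maven treats a missing <scope> as "compile".
        scope = dep.findtext("m:scope", default="compile", namespaces=NS)
        by_scope.setdefault(scope, []).append(f"{group}:{artifact}:{version}")
    return by_scope


# Hypothetical POM with one compile-scoped and one test-scoped dependency.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>commons-beanutils</groupId>
      <artifactId>commons-beanutils</artifactId>
      <version>1.9.1</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
"""
print(dependencies_by_scope(SAMPLE_POM))
```

In this toy POM, a vulnerable version under `compile` would reach users of the project, whereas one under `test` would only affect the project's own test runs.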

Some de­velopers seem to view up­grading de­pend­en­cies as a luxury which they can’t af­ford:

I sub­scribed to the CVE RSS re­cently and I don’t check it reg­u­larly, so even if I might have heard of the cur­rent vul­ner­ab­il­ity, I simply forgot to ad­dress it. We also had some emer­gen­cies re­cently (devel­oping fea­tures for our cus­tom­ers), that makes the se­curity is­sues less prio than re­leasing the ordered fea­tures :-/ … Any­way, our se­curity ap­proach is far from per­fect, I am aware of it, and I’m willing to im­prove this, but some­times it is dif­fi­cult to ex­plain our cus­tomers that it is a main point to con­sider in the de­vel­op­ment pro­cess.

Un­for­tu­nately, cus­tomers and users are often un­sym­path­etic to things without an im­me­diate im­pact. If a se­curity hole isn’t causing prob­lems now, even if it might in the fu­ture, then they want new fea­tures rather than better se­cur­ity.

Dependencies are hard!

Library usage is not only common, but encouraged as good practice. Yet 81.5% of the systems the authors studied use outdated dependencies. Even when there is a published security issue, developers often do not migrate. Updating dependencies is considered something nice to do in your spare time, but not really a focus. This is not a great situation.

The study provides mo­tiv­a­tion for our com­munity de­velop strategies to im­prove a de­veloper per­sonal per­cep­tion of third-­party up­dates, es­pe­cially in cases when ef­fort must be al­loc­ated to mit­igate a severe vul­ner­ab­ility risk. Visual aids such as the Lib­rary Mi­gra­tion Plots (LMP) provide a rich visual ana­lysis, which proves to be a useful aware­ness and mo­tiv­a­tion for de­velopers to identify de­pend­ency mi­gra­tion op­por­tun­it­ies. We en­vi­sion this work as a con­tri­bu­tion to­ward de­vel­oping strategies and sup­port tools that aid the man­age­ment of third-­party de­pend­en­cies.