barrucadu's memos - Research (Summaries)

Do Developers Update Their Library Dependencies?

2017-12-12T00:00:00Z

By Raula Gaikovina Kula, Daniel M. German, Ali Ouni, Takashi Ishio, and Katsuro Inoue.
In Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE). 2017.
Paper / Conference

The third and final paper in this mini-series on dependencies taken from recent ESEC/FSE events. This study looks solely at Java projects using Maven, and investigates how the Common Vulnerabilities and Exposures (CVE) project affects library migration. The authors looked at 4,659 projects, performed 8 case studies, and surveyed developers of projects with outdated (and vulnerable) dependencies.

The starting assumption is that developers don’t want dependencies with known security vulnerabilities:

We conjecture that for developers, the awareness of the security advisory is more important than the migration effort needed to migrate the vulnerable dependency.

But this reasonable-sounding assumption doesn’t seem to be backed up by the evidence:

In 2014, Heartbleed, Poodle, Shellshock, — all high profile library vulnerabilities were found to have affected a significant portion of the software industry. In that same year, Sonatype determined that over 6% of the download requests from the Maven Central repository were for component versions that included known vulnerabilities. The company reported that in review of over 1,500 applications, each of them had an average of 24 severe or critical flaws inherited from their components.

So given that many developers don’t appear to consider these issues severe enough to merit upgrading their dependencies (or are unaware of the issues entirely), three research questions are formulated:

To what extent are developers updating their library dependencies?
What is the response to important awareness mechanisms such as a new release announcement and a security advisory on library updates?
Why are developers non responsive to a security advisory?

Library usage

To track library migrations, the authors first define what exactly a migration is. They use L(name, version) to refer to a library and S(name, version) to refer to a system. When a system S(a,b) migrates to a library L(x,y), it creates a dependency between them.

This model can track how frequently developers migrate their dependencies, and how many migrations occurred in a single system version update. The authors also define “library usage” as the number of systems depending on a library at a given point in time.
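To make the model concrete, here is a minimal sketch of computing library usage at a point in time. The record shape, the sample data, and the function names are all assumptions made for illustration; the paper's exact bookkeeping may differ (for instance, this sketch does not model a system dropping a library entirely).

```javascript
// Hypothetical records: each one says that a system version depends on a
// library version, and when that system version was released (ISO date).
const dependencies = [
  { system: "app-a", systemVersion: "1.0", library: "log4j", libraryVersion: "1.2.15", released: "2014-01-10" },
  { system: "app-a", systemVersion: "1.1", library: "log4j", libraryVersion: "1.2.17", released: "2014-06-01" },
  { system: "app-b", systemVersion: "2.0", library: "log4j", libraryVersion: "1.2.15", released: "2014-03-05" },
];

// For each system, find the record of its most recent release on or before
// `date` that depends on `library`. ISO dates compare correctly as strings.
function currentVersions(records, library, date) {
  const latest = new Map();
  for (const r of records) {
    if (r.library !== library || r.released > date) continue;
    const seen = latest.get(r.system);
    if (!seen || r.released > seen.released) latest.set(r.system, r);
  }
  return latest;
}

// Library usage: how many systems depend on L(library, version) at `date`.
function libraryUsage(records, library, version, date) {
  let count = 0;
  for (const r of currentVersions(records, library, date).values()) {
    if (r.libraryVersion === version) count += 1;
  }
  return count;
}
```

Evaluating `libraryUsage` at a series of dates gives the raw points that a library migration plot would then smooth into a curve.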

The authors selected 4,659 Java projects (filtered down from 10,523) from GitHub which (a) have more than 100 commits; (b) have a commit between January and November of 2015; (c) are not duplicates (determined by project name); and (d) use Maven. As a project may contain multiple systems, the authors then extracted 48,495 systems from the 4,659 projects, and found 852,322 total migrations.

For example, here we see a library migration plot for the Apache Commons beanutils library, for which CVE-2014-0114 was published in April 2014 (the dashed black line). Don’t read too much into the curves:

The LMP shows LU changes in the library (y-axis) with respect to time (x-axis). It is important to note that the LMP curve itself should not be taken at face value, as the smoothing algorithm is generated by a predictive model and it is not a true reflection of all data points.

The authors use these library migration plots to judge how developers respond to announcements of new releases and CVEs. The authors pick eight libraries in particular to examine, based on library usage trends. For each, they looked at the online documentation and version numbering to judge the effort required to perform the migration. The releases selected are:

google-guava (16.0.1, 17.0, and 18.0)
junit (3.8.1, 4.10, 4.11)
log4j (1.2.15, 1.2.16, 1.2.17)
commons-beanutils (1.9.1, 1.9.2)
commons-fileupload (1.2.2, 1.3, 1.3.1)
commons-httpclient (3.1, 4.2.2)
httpcomponents (4.2.2, 4.2.3, 4.2.5)
commons-compress (1.4, 1.4.1)

Finally, the authors sent a short email survey to developers of projects which are non-responsive to a CVE, asking if they are aware of the vulnerability, and why they haven’t updated. They received a total of 16 responses.

Library migration in practice

Firstly the authors look at this from a systems perspective.

We discover that systems tend to have a lot of library dependencies, but do not perform many dependency updates (“DUs”) when they make a new release. We also see that there is just about no correlation between the number of library dependencies and the number of dependency updates.

This result confirms the hypothesis that the number of library dependencies in a system does not influence the frequency of updates.
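One way to picture a dependency update count is as a comparison between the dependency sets of two consecutive releases. The sketch below assumes each release is a plain object mapping library name to version; that shape, and the choice to count only version changes of libraries present in both releases, are assumptions of this sketch rather than the paper's definition.

```javascript
// Count dependency updates (DUs) between two releases of a system.
// `previous` and `next` map library name -> version (hypothetical shape).
function dependencyUpdates(previous, next) {
  let updates = 0;
  for (const [library, version] of Object.entries(next)) {
    const wasPresent = Object.prototype.hasOwnProperty.call(previous, library);
    // Newly added libraries are not counted here, only changed versions.
    if (wasPresent && previous[library] !== version) updates += 1;
  }
  return updates;
}
```

For example, a release that moves junit from 4.10 to 4.11, leaves log4j alone, and adds guava would count as one update under this definition.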

The authors then look at this from a library perspective, and find that library versions tend to slowly reach peak usage, which then steadily declines as developers migrate away. However, many systems remain with outdated dependencies, such as L(log4j, 1.2.15), with 98% of systems having not migrated away from it at the time of the study.

To answer (RQ1): (i) although system heavily depend on libraries, most systems rarely update their libraries and (ii) systems are less likely migrate their library dependencies, with 81.5% of systems remaining with a popular older version.

Developer responsiveness to awareness mechanisms

The authors examine library usage trends to judge how developers respond to awareness mechanisms such as new releases and CVEs.

The authors speculate that, in this case, migrating between google-guava versions is fairly easy, which influences the quick change:

We find that the reasons for consistent migration trends are mainly related to the estimated migration effort required to complete the migration process. Through inspection of the online documentation, we find that migration from L(NR1, 16.0.1) to L(NR1, 17.0) contains 10 changed packages. Similarly, migration from L(NR1, 17.0) to L(NR1, 18.0) also contained 7 changed packages. Yet, all three library versions require the same Java 5 environment which indicates no significant changes to the overall architectural design of the library. From the documentation, we deduce that popular use of L(NR1, 18.0) is due to the prolonged period between the next release of L(NR1, 19.0), which is more that a year after the release of L(NR1, 18.0) in December 10, 2015

For junit, migration is more challenging, which may contribute to the prolonged usage of older versions.

Similar to the consistent migration to a new release, we find that the reason for a non response to a migration opportunity is related to the estimated migration effort. For instance, as shown in Figure 8(b), the newer Junit version 4 series libraries requires a change of platform to Java 5 or higher (L(NR2, 4.10) and L(NR2, 4.11)), inferring significant changes to the architectural design of the library. Intuitively, we see that even though L(NR2, 3.8.1) is older, it still maintains its maximum library usage (i.e., current LU and peak LU=342).

As new releases are made, library usage begins to gradually trend down. This seems reasonable: having newer dependencies is nice, but not essential.

For vulnerabilities, however, we would like to see a much sharper decline. The commons-beanutils library shows this pattern. A vulnerability is announced, a release is made shortly afterwards, and the new version rapidly gains in popularity. Developers even appear to begin migrating away from the vulnerable version before the new one is released, in this case.

Unfortunately, it’s not always the case. Here we see that a vulnerability was announced in the commons-httpclient library, but its usage kept growing. The authors speculate that this is because the migration effort was too high: there was no new release of commons-httpclient, rather the library was deprecated in favour of the new httpcomponents library:

The estimated migration effort and the lack of a viable replacement dependency are some of the possible reasons why affected maintainers show no response to the security advisory. This is shown in the case of the Httpcomponents library, which is the successor and replacement for commons-httpclient library. As documented, Httpcomponents is a major upgrade with many architectural design modifications compared to the older commons-httpclient dependency versions.

Developer feedback on vulnerable dependencies

Finally, we get the results of the survey. Of the 16 responses, 11 (69%) were unaware that there was a vulnerability at all! However, in some cases, vulnerable dependencies are not exposed in a way which introduces a security hole at all. One developer noted:

It’s only a test scoped dependency which means that it’s not a transitive dependency for users of XXX so there is no harm done. XXX has no external compile scoped dependencies thus there is no real need to update dependencies.

Some developers seem to view upgrading dependencies as a luxury which they can’t afford:

I subscribed to the CVE RSS recently and I don’t check it regularly, so even if I might have heard of the current vulnerability, I simply forgot to address it. We also had some emergencies recently (developing features for our customers), that makes the security issues less prio than releasing the ordered features :-/ … Anyway, our security approach is far from perfect, I am aware of it, and I’m willing to improve this, but sometimes it is difficult to explain our customers that it is a main point to consider in the development process.

Unfortunately, customers and users are often unsympathetic to things without an immediate impact. If a security hole isn’t causing problems now, even if it might in the future, then they want new features rather than better security.

Dependencies are hard!

Library usage is not only common, but encouraged as good practice. Yet 81% of the systems the authors surveyed use outdated dependencies. Even when there is a published security issue, developers often do not migrate. Updating dependencies is considered something nice to do in your spare time, but not really a focus. This is not a great situation.

The study provides motivation for our community develop strategies to improve a developer personal perception of third-party updates, especially in cases when effort must be allocated to mitigate a severe vulnerability risk. Visual aids such as the Library Migration Plots (LMP) provide a rich visual analysis, which proves to be a useful awareness and motivation for developers to identify dependency migration opportunities. We envision this work as a contribution toward developing strategies and support tools that aid the management of third-party dependencies.

Why Do Developers Use Trivial Packages?

2017-12-04T00:00:00Z

By Rabe Abdalkareem, Olivier Nourry, Sultan Wehaibi, Suhaib Mujahid, and Emad Shihab.
In Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE). 2017.
Paper / Conference

We saw last time that developers are often wary of introducing new dependencies unless they’re really worth it, due to the inevitable cost of maintenance. Why then do developers also depend on so-called “trivial packages”? The left-pad fiasco of last year brought to light how extreme this situation really is: a package providing 11 lines of code to left pad a string was pulled from npm, breaking thousands of other packages which, directly or indirectly, depended on it.

This is the question which this survey paper sets out to answer. Firstly we get some quantitative analysis of trivial package use across 230,000 npm packages and 38,000 applications, then a survey of 88 Node.js developers who use trivial packages.

What do we mean by a “trivial package”?

The authors randomly selected 16 npm packages with between 4 and 250 lines of code and sent out a survey, which got 12 responses, asking whether each package was trivial or not, and why. Here’s an example, the is-positive package:

module.exports = function (n) {
  return toString.call(n) === '[object Number]' && n > 0;
};

Based on the survey responses, the authors identified both length and cyclomatic complexity of a package to be contributing factors to its triviality:

Our survey indicates that size and complexity are commonly used measures to determine if a package is trivial. Based on our analysis, packages that have ≤ 35 JavaScript LOC and a McCabe’s cyclomatic complexity ≤ 10 are considered to be trivial.

You can quibble over this definition (I might consider a longer but low-complexity package to be trivial, for instance), but triviality is ultimately a judgement call. No matter what metric the authors pick, there will be some who disagree.
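The quoted definition amounts to a simple predicate over two measurements. As a minimal sketch, assuming the line count and cyclomatic complexity have already been measured by some tool (the `metrics` object and its field names are hypothetical):

```javascript
// Thresholds from the paper's definition of a trivial package.
const MAX_LOC = 35;
const MAX_COMPLEXITY = 10;

// `metrics` is a hypothetical object holding pre-computed measurements:
// { loc: number, cyclomaticComplexity: number }.
function isTrivial(metrics) {
  return metrics.loc <= MAX_LOC && metrics.cyclomaticComplexity <= MAX_COMPLEXITY;
}
```

Both conditions must hold, so a long package with little branching does not count as trivial under this definition, which is exactly the case quibbled over above.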

How prevalent are they?

The authors fetched the latest version of every npm package as of the 5th of May 2016, giving 231,092 packages, after removing 21,904 with no code. They also fetched all Node.js/npm applications on GitHub, giving 38,807 applications, after filtering out 76,814 with fewer than 100 commits or only one developer.

Of the npm packages, an incredible 38,845 (16.8%) are trivial packages. Furthermore, if we look at the proportion of published trivial packages over time, we see that it’s going up! This graph is jagged, up until npm banned unpublishing packages in response to the left-pad incident. I suspect this means that a lot of people used to publish, and then almost immediately remove, trivial packages. Currently, roughly 15% of the packages added each month are trivial packages.

Rather than looking at the entire database of packages, we can also look at the most popular:

npm posts the most depended-upon packages on its website. We measured the number of trivial packages that exist in the top 1,000 most depended-upon packages; we find that 113 of them are trivial packages. This finding shows that trivial packages are not only prevalent and increasing in number, but they are also very popular among developers, making up 11.3% of the 1,000 most depended on npm packages.

When it comes to applications, the authors parsed the source code, looking for import statements, to handle cases where a project’s package.json file (containing metadata for npm to build and run it) specifies a dependency which isn’t used anywhere. This gives, for each application, a set of dependencies which are used:

Finally, we measured the number of packages that are trivial in the set of packages used by the applications. Note that we only consider npm packages since it is the most popular package manager for Node.js packages and other package managers only manage a subset of packages. We find that of the 38,807 applications in our data set, 4,256 (10.9%) directly depend on at least one trivial package.
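The "declared but never imported" check can be approximated in a few lines. This is only a sketch: the summary says the authors parsed the source code, whereas the regular expression here is a rough stand-in that can be fooled by comments and string literals and ignores dynamic requires. The function names are made up for illustration.

```javascript
// Collect package names from require("...") calls and `from "..."` clauses.
function importedPackages(source) {
  const pattern = /\brequire\(\s*['"]([^'"]+)['"]\s*\)|\bfrom\s+['"]([^'"]+)['"]/g;
  const names = new Set();
  for (const match of source.matchAll(pattern)) {
    const specifier = match[1] || match[2];
    // Relative and absolute paths refer to local files, not npm packages.
    if (specifier.startsWith(".") || specifier.startsWith("/")) continue;
    // Keep only the package part: "lodash/map" -> "lodash",
    // "@scope/pkg/file" -> "@scope/pkg".
    const parts = specifier.split("/");
    names.add(specifier.startsWith("@") ? parts.slice(0, 2).join("/") : parts[0]);
  }
  return names;
}

// Dependencies that are both declared in package.json and imported somewhere.
// `packageJson` is the parsed file; `sources` is an array of file contents.
function usedDependencies(packageJson, sources) {
  const declared = Object.keys(packageJson.dependencies || {});
  const imported = new Set();
  for (const source of sources) {
    for (const name of importedPackages(source)) imported.add(name);
  }
  return declared.filter((name) => imported.has(name));
}
```

The resulting list is what would then be checked against the set of trivial packages.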

How do developers feel about them?

Given how popular trivial packages are, we might suspect that developers don’t consider them a problem. This is in sharp contrast to some viewpoints in How to Break an API, where developers were wary of introducing new dependencies. This part of the study was conducted as a survey of 88 developers.

The reasons given are:

Trivial packages provide well implemented and tested code (48 respondents)
Use of trivial packages increases productivity (42 respondents)
Use of trivial packages outsources the maintenance burden for that code to the package authors (8 respondents)
Use of trivial packages helps readability and reduces complexity (8 respondents)
Use of a trivial package, over a large library or framework, improves application performance (3 respondents)

Only 7 respondents said they saw no reason to use trivial packages.

The authors also asked for the drawbacks of using trivial packages. Now we get some viewpoints closer to How to Break an API. The drawbacks given are:

The overhead of monitoring dependencies for updates (49 respondents)
The maintenance burden of breaking changes (16 respondents)
Decreased build performance, due to the overhead of fetching and building more dependencies (14 respondents)
Decreased developer performance, due to needing to read more documentation (11 respondents)
A missed learning opportunity: it’s easier to use a package to solve a problem than to figure it out yourself (8 respondents)
Potential security risks in third-party code (7 respondents)
Licensing issues (3 respondents)

Only 7 respondents said they saw no drawbacks to using trivial packages.

Are they well tested?

Over half of the respondents said that a reason to use trivial packages is that the code is perceived to be well implemented and tested. But is that really the case?

npm requires that developers provide a test script name with the submission of their packages (listed in the package.json file). In fact, 81.2% (31,521 out of 38,845) of the trivial packages in our dataset have some test script name listed. However, since developers can provide any script name under this field, it is difficult to know if a package is actually tested.

So the authors turn to the npms tool to collect metrics about the trivial packages in their dataset:

We examine whether a package is really well tested and implemented from two aspects; first, we check if a package has tests written for it. Second, since in many cases, developers consider packages to be ‘deployment tested’, we also consider the usage of a package as an indicator of it being well tested and implemented. To carefully examine whether a package is really well tested and implemented, we use the npm online search tool (known as npms) to measure various metrics related to how well the packages are tested, used and valued. To provide its ranking of the packages, npms mines and calculates a number of metrics based on development (e.g., tests) and usage (e.g., no. of downloads) data.

They used three npms metrics to evaluate how tested a package is:

“Tests”, a weighted sum of the size of the tests, the coverage percentage, and the build status
“Community interest”, derived from popularity on GitHub
“Download count”, the number of downloads in the last three months

The results are not so promising:

As an initial step, we calculate the number of trivial packages that have a Tests value greater than zero, which means trivial packages that have some of tests. We find that only 45.2% of the trivial packages have tests, i.e., a Tests value > 0.

So much for well tested!
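The 45.2% figure is just the share of packages whose Tests value is above zero. A minimal sketch of that calculation, where `testsScore` is a hypothetical field standing in for the npms "Tests" metric:

```javascript
// `packages` is a hypothetical list of { name, testsScore } objects.
// Returns the fraction (0 to 1) of packages with a Tests value above zero.
function shareWithTests(packages) {
  if (packages.length === 0) return 0;
  const tested = packages.filter((pkg) => pkg.testsScore > 0).length;
  return tested / packages.length;
}
```

Note that a score above zero only says some tests exist; it says nothing about how thorough they are.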

The authors also compare the metrics of trivial packages with nontrivial packages. We see that the distributions are similar, though nontrivial packages have a greater median, which could easily be due to the size and complexity difference. The authors find that the differences are statistically significant, but with small effect size.

How much effort is needed to keep up with new releases?

The most cited drawback for using trivial packages was the extra overhead of needing to keep everything up-to-date.

There are a couple of ways to look at the impact of dependencies. Firstly, the authors compare the number of releases. Trivial packages tend to have fewer releases, so it seems that if you’re going to have a dependency, from a purely maintenance perspective, a trivial dependency is the better option.

The fact that the trivial packages are updated less frequently may be attributed to the fact that trivial packages ‘perform less functionality’, hence they need to be updated less frequently

Next the authors consider how many dependencies (direct and indirect) trivial and nontrivial packages have. Introducing extra dependencies increases the complexity of the dependency chain, so all else being equal, we would prefer to have fewer dependencies.

The authors group packages into four categories by number of dependencies:

0: 56.3% of trivial packages, 34.8% of nontrivial packages
1–10: 27.9% of trivial packages, 30.6% of nontrivial packages
11–20: 4.3% of trivial packages, 7.3% of nontrivial packages
More: 11.5% of trivial packages, 27.3% of nontrivial packages

So developers should beware extra dependencies! Even though the source of a trivial package may be small, it may pull in many additional packages!

Trivial packages have fewer releases and developers are less likely to be version locked than non-trivial packages. That said, developers should be careful when using trivial packages, since in some cases, trivial packages can have numerous dependencies. In fact, we find that 43.7% of trivial packages have at least one dependency and 11.5% of trivial packages have more than 20 dependencies.
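The four groups used above are straightforward to reproduce from a list of dependency counts. In this sketch, `counts` is a hypothetical array holding the number of direct and indirect dependencies of each package; the bucket labels mirror the groups in this summary.

```javascript
// Map a dependency count to one of the four groups.
function dependencyBucket(count) {
  if (count === 0) return "0";
  if (count <= 10) return "1-10";
  if (count <= 20) return "11-20";
  return "More";
}

// Fraction of packages falling into each group.
function bucketShares(counts) {
  const shares = { "0": 0, "1-10": 0, "11-20": 0, "More": 0 };
  for (const count of counts) shares[dependencyBucket(count)] += 1;
  for (const label of Object.keys(shares)) {
    shares[label] = counts.length === 0 ? 0 : shares[label] / counts.length;
  }
  return shares;
}
```

The share of packages with at least one dependency is one minus the "0" share, which is how the 43.7% figure follows from the 56.3% above.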

The bottom line

The final sentence of the paper is short, snappy, and neatly summarises all of what came before:

Hence, developers should be careful about which trivial packages they use.

It probably goes without saying, but I would apply this warning to all packages, trivial and nontrivial.

How to Break an API

2017-11-30T00:00:00Z

By Christopher Bogart, Christian Kästner, James Herbsleb, and Ferdian Thung.
In Foundations of Software Engineering (FSE). 2016.
Paper / Conference / Project

I’ve recently discovered the world of empirical studies of software engineering practices, and like what I see. The few papers I’ve read seem to confirm the conventional wisdom of what “everybody knows”, but it’s nice to see these thoughts backed up by data.

This study looks at three different ecosystems with different approaches to API breakage: the very stable Eclipse Marketplace, the consistent snapshot approach of CRAN, and the semantic versioning approach of npm. An ecosystem is more than a collection of packages, it’s also a group of people, with cultural norms about stability and change.

How, when, and by whom changes are performed in an ecosystem with interdependent packages is subject to (often implicit) negotiation among diverse participants within the ecosystem. Each participant has their own priorities, habits and rhythms, often guided by community-specific values and policies, or even enforced or encouraged by tools. Ecosystems differ in, for example, to what degree they require consistency among packages, how they handle versioning, and whether there are central gatekeepers. Policies and tools are in part designed explicitly, but in part emerge from ad-hoc decisions or from values shared by community members. As a result, community practices may assign burdens of work in ways that create unanticipated conflicts or bottlenecks.

The paper looks at the issue of API breakage from the perspective of both library authors (those doing the breaking) and library users (those who need to modify their code). The results come from a case study of 28 open source developers across the three ecosystems. This doesn’t seem like a lot, but that’s inevitable for survey papers.

Firstly we get an overview of the policies of each ecosystem. They’re very different:

A core value of the Eclipse community is backward compatibility. This value is evident in many policies, such as “API Prime Directive: When evolving the Component API from release to release, do not break existing Clients”.

CRAN pursues snapshot consistency in which the newest version of every package should be compatible with the newest version of every other package in the repository. Older versions are “archived”: available in the repository, but harder to install. […] A core value of the R/CRAN community is to make it easy for end users to install and update packages.

A core value of the Node.js/npm community is to make it easy and fast for developers to publish and use packages. In addition, the community is open to rapid change. […] The focus on convenience for developers (instead of end users) was apparent in our interviews.

Stability. Snapshot consistency. Ease of development. Nobody will use a library that breaks its API every week, but there is clearly a sliding scale of how much breakage is tolerated.

This paper was interesting to me because I’m most familiar with the Hackage and Stackage models, and it didn’t take long for me to see parallels between the Haskell world and other ecosystems. Hackage is more like npm, with the PVP in Haskell serving the role of semver in npm; and Stackage is more like CRAN. The project website has some analysis of Hackage and Stackage, which I think lends credence to this:

Stackage stands out as particularly valuing of compatibility; this is not too surprising since it was formed over as an alternative to Hackage with the specific goal to identify mutually compatible versions of packages to use together.

The reasons for library authors to consider a breaking API change mostly line up with what I would have expected:

Technical debt
Efficiency
Bugs

Funnily enough, fixing bugs isn’t always a good thing for the users:

Throughout our interviews, we heard many examples of how bug fixes effectively broke downstream packages, and the difficulty of knowing in advance which fixes would cause such problems. For example, R7 told us about reimplementing a standard string processing function, and finding that it broke the code of some downstream users that depended on bugs that his tests had not caught. R9 commented on the opportunity cost of not fixing a bug in deference to downstream users’ workarounds for it: “If the [downstream package] is implemented on the workaround for your bug, and then your fix actually breaks the workaround, then you sort of have to have a fallback… [pause] It gets nasty.”

This puts me in mind of Microsoft, who are famous for never breaking backwards compatibility and just introducing new APIs when they have a better way of doing something. I wouldn’t want to maintain their behemoth of a codebase!

Library authors don’t like to break things for their users, but for CRAN package authors this is perhaps a greater concern than usual:

Two interviewees (E1 and R4) specifically mentioned concern for downstream users’ scientific research (R4: “We’re improving the method, but results might change, so that’s also worrying — it makes it hard to do reproducible research”).

But some library authors don’t care so much:

Only a few developers were not particularly worried about breaking changes. Some (E6, N1, N5) had strong ties to their users and felt they could help them individually (N5: “We try to avoid breaking their code — but it’s easy to update their code”). Interviewee N6 expressed an “out of sight, out of mind” attitude: “Unfortunately, if someone suffers and then silently does not know how to reach me or contact me or something, yeah that’s bad but that suffering person is sort of [the tree] in the woods that falls and doesn’t make a sound.”

It’s perhaps worth mentioning at this point that the “N” people are npm users. The attitude of N6 would be fairly typical of Hackage users too, I feel.

Now the paper crosses over to the other side, and looks at library users and how they react to dependency changes. It’s the same people as in the first survey, so these are library users who are also library authors. I wonder if a survey of people who are primarily application authors would be different here. There are three approaches to learning about new library releases:

Actively monitoring dependencies. Most people don’t do this.
Having a general social awareness of the field, such as by following people on Twitter.
Reactively waiting for notifications. Most people do this.

A common strategy for handling the constant barrage of library updates is to be more careful about what you depend on.

Interviewee E5 represents a common view: “I only depend on things that are really worthwhile. Because basically everything that you depend on is going to give you pain every so often. And that’s inevitable.”

Developers use a number of factors to decide if a dependency is worth it:

How much they trust the authors
How actively developed it is
The size of its user base
What the authors’ historic approach to breakage has been

The paper now mentions as surprising something which I completely expected:

Interestingly, there was almost no mention of traditional encapsulation strategies to isolate the impact of changes to upstream modules, contra to our expectations and typical software-engineering teaching. Only N6 mentioned developing an abstraction layer between his package and an upstream dependency

I don’t think I’ve seen a project introduce a layer of abstraction between a dependency and its use, except in cases where one of multiple dependencies will be used (like using one out of several database libraries, but providing a consistent interface). Maybe this would be a good idea sometimes, but I feel like in most situations it’s just adding extra complexity and maintenance burden for little benefit.

The paper wraps up with some discussion of the tension between policies, values, and practice:

For example there is a tension in Eclipse between the policy and practice of semantic versioning. Eclipse has a long-standing versioning policy similar to semantic versioning and the platform’s stability is reflected in the fact that many packages have not changed their major version number in over 10 years. However, even for the few cases of breaking changes that are clearly documented in the release notes, such as removing deprecated functions, major versions are often not increased, because, as E8 told us, updating a major version number can ripple version updates to downstream packages, and can entail significant work for the downstream projects.

This is something I struggle with as a library user in Haskell: if I change the version bounds on one of my dependencies, how exactly does that translate into a version change for me? Sometimes it’s not so clear.
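For comparison, the basic semantic versioning rule for a package's own API changes is mechanical. The sketch below assumes plain `major.minor.patch` strings with no pre-release tags, and the change labels are names chosen for this example; it does not answer the harder question of how a change in dependency bounds should be classified.

```javascript
// Compute the next version under the basic semantic versioning rule.
// `change` is "breaking", "feature", or anything else (treated as a fix).
function nextVersion(version, change) {
  const [major, minor, patch] = version.split(".").map(Number);
  if (change === "breaking") return `${major + 1}.0.0`;
  if (change === "feature") return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`;
}
```

The difficult part in practice is deciding which label a given change deserves, which is where the policy tensions described above come in.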

So, to conclude:

How to break an API: In Eclipse, you don’t. In R/CRAN, you reach out to affected downstream developers. In Node.js/npm, you increase the major version number.