How to Break an API

By Chris­topher Bogart, Chris­tian Kästner, James Herbsleb, and Fer­dian Thung.
In Found­a­tions of Soft­ware En­gin­eering (F­SE). 2016.

I’ve re­cently dis­covered the world of em­pir­ical studies of soft­ware en­gin­eering prac­tices, and like what I see. The few pa­pers I’ve read seem to con­firm the con­ven­tional wisdom of what “every­body knows”, but it’s nice to see these thoughts backed up by data.

This study looks at three different ecosystems with different approaches to API breakage: the very stable Eclipse Marketplace, the consistent snapshot approach of CRAN, and the semantic versioning approach of npm. An ecosystem is more than a collection of packages; it’s also a group of people with cultural norms about stability and change.

How, when, and by whom changes are per­formed in an eco­system with in­ter­de­pendent pack­ages is sub­ject to (often im­pli­cit) ne­go­ti­ation among di­verse par­ti­cipants within the eco­sys­tem. Each par­ti­cipant has their own pri­or­it­ies, habits and rhythms, often guided by com­munity-spe­cific values and policies, or even en­forced or en­cour­aged by tools. Eco­sys­tems differ in, for ex­ample, to what de­gree they re­quire con­sist­ency among pack­ages, how they handle ver­sion­ing, and whether there are central gate­keep­ers. Policies and tools are in part de­signed ex­pli­citly, but in part emerge from ad-hoc de­cisions or from values shared by com­munity mem­bers. As a res­ult, com­munity prac­tices may as­sign bur­dens of work in ways that create unanti­cip­ated con­flicts or bot­tle­necks.

The paper looks at the issue of API breakage from the perspective of both library authors (those doing the breaking) and library users (those who need to modify their code). The results come from a case study based on interviews with 28 open source developers across the three ecosystems. This doesn’t seem like a lot, but a small sample is inevitable for interview-based studies.

First, we get an overview of the policies of each ecosystem. They’re very different:

A core value of the Ec­lipse com­munity is back­ward com­pat­ib­il­ity. This value is evident in many policies, such as “API Prime Dir­ect­ive: When evolving the Com­ponent API from re­lease to re­lease, do not break ex­isting Cli­ents”.

CRAN pur­sues snap­shot con­sist­ency in which the newest ver­sion of every package should be com­pat­ible with the newest ver­sion of every other package in the re­pos­it­ory. Older ver­sions are “archived”: avail­able in the re­pos­it­ory, but harder to in­stall. […] A core value of the R/CRAN com­munity is to make it easy for end users to in­stall and up­date pack­ages.

A core value of the Node.js/npm com­munity is to make it easy and fast for de­velopers to pub­lish and use pack­ages. In ad­di­tion, the com­munity is open to rapid change. […] The focus on con­veni­ence for de­velopers (in­stead of end users) was ap­parent in our in­ter­views.

Sta­bil­ity. Snap­shot con­sist­ency. Ease of de­vel­op­ment. Nobody will use a lib­rary that breaks its API every week, but there is clearly a sliding scale of how much breakage is tol­er­ated.

This paper was interesting to me because I’m most familiar with the Hackage and Stackage models, and it didn’t take long for me to see parallels between the Haskell world and other ecosystems. Hackage is more like npm, with Haskell’s Package Versioning Policy (PVP) serving the role that semver plays in npm; and Stackage is more like CRAN. The project website has some analysis of Hackage and Stackage, which I think lends credence to this:

Stackage stands out as par­tic­u­larly valuing of com­pat­ib­il­ity; this is not too sur­prising since it was formed over as an al­tern­ative to Hackage with the spe­cific goal to identify mu­tu­ally com­pat­ible ver­sions of pack­ages to use to­gether.

The reasons for lib­rary au­thors to con­sider a breaking API change mostly line up with what I would have ex­pec­ted:

  • Tech­nical debt
  • Ef­fi­ciency
  • Bugs

Fun­nily enough, fixing bugs isn’t al­ways a good thing for the users:

Throughout our in­ter­views, we heard many ex­amples of how bug fixes ef­fect­ively broke down­stream pack­ages, and the dif­fi­culty of knowing in ad­vance which fixes would cause such prob­lems. For ex­ample, R7 told us about re­im­ple­menting a standard string pro­cessing func­tion, and finding that it broke the code of some down­stream users that de­pended on bugs that his tests had not caught. R9 com­mented on the op­por­tunity cost of not fixing a bug in de­fer­ence to down­stream users’ work­arounds for it: “If the [down­stream pack­age] is im­ple­mented on the work­around for your bug, and then your fix ac­tu­ally breaks the work­around, then you sort of have to have a fall­back… [pause] It gets nasty.”
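
To make the workaround problem concrete, here is a minimal sketch. It is not taken from the paper: the functions `title_case_v1`, `title_case_v2` and `downstream_heading` are hypothetical, and the "bug" is an invented trailing space. It only shows how a downstream workaround can turn a correct upstream fix into a breaking change.

```python
def title_case_v1(text: str) -> str:
    """Hypothetical upstream v1: buggy, leaves a trailing space on the result."""
    return " ".join(word.capitalize() for word in text.split()) + " "


def title_case_v2(text: str) -> str:
    """Hypothetical upstream v2: the trailing-space bug is fixed."""
    return " ".join(word.capitalize() for word in text.split())


def downstream_heading(title_case, text: str) -> str:
    """Hypothetical downstream code with a workaround: chop the last
    character, assuming it is always the stray space from the upstream bug."""
    return title_case(text)[:-1]


# With the buggy upstream, the workaround produces the intended heading.
print(repr(downstream_heading(title_case_v1, "hello world")))  # 'Hello World'

# With the fixed upstream, the same workaround now eats a real character.
print(repr(downstream_heading(title_case_v2, "hello world")))  # 'Hello Worl'
```

Nothing in the upstream test suite would catch this, since the fixed function is behaving correctly; the breakage only exists in code the upstream author never sees.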

This puts me in mind of Microsoft, who are famous for going to great lengths to avoid breaking backwards compatibility, and for just introducing new APIs when they have a better way of doing something. I wouldn’t want to maintain their behemoth of a codebase!

Lib­rary au­thors don’t like to break things for their users, but for CRAN package au­thors this is per­haps a greater con­cern than usual:

Two in­ter­viewees (E1 and R4) spe­cific­ally men­tioned con­cern for down­stream users’ sci­entific re­search (R4: “We’re im­proving the method, but res­ults might change, so that’s also wor­rying — it makes it hard to do re­pro­du­cible re­search”).

But some lib­rary au­thors don’t care so much:

Only a few de­velopers were not par­tic­u­larly wor­ried about breaking changes. Some (E6, N1, N5) had strong ties to their users and felt they could help them in­di­vidu­ally (N5: “We try to avoid breaking their code — but it’s easy to up­date their code”). In­ter­viewee N6 ex­pressed an “out of sight, out of mind” at­ti­tude: “Un­for­tu­nately, if someone suf­fers and then si­lently does not know how to reach me or con­tact me or something, yeah that’s bad but that suf­fering person is sort of [the tree] in the woods that falls and doesn’t make a sound.”

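It’s perhaps worth mentioning at this point that the letter in each interviewee’s label identifies their ecosystem: the “N” people are npm developers, “E” is Eclipse, and “R” is R/CRAN. The attitude of N6 would be fairly typical of Hackage users too, I feel.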

Now the paper crosses over to the other side, and looks at library users and how they react to dependency changes. It’s the same people as in the first part, so these are library users who are also library authors. I wonder if the answers from people who are primarily application authors would be different here. There are three approaches to learning about new library releases:

  • Act­ively mon­it­oring de­pend­en­cies. Most people don’t do this.
  • Having a gen­eral so­cial aware­ness of the field, such as by fol­lowing people on Twit­ter.
  • Re­act­ively waiting for no­ti­fic­a­tions. Most people do this.

A common strategy for handling the constant barrage of library updates is to be more careful about what you depend on.

In­ter­viewee E5 rep­res­ents a common view: “I only de­pend on things that are really worth­while. Be­cause ba­sic­ally everything that you de­pend on is going to give you pain every so of­ten. And that’s in­ev­it­able.”

De­velopers use a number of factors to de­cide if a de­pend­ency is worth it:

  • How much they trust the au­thors
  • How act­ively de­veloped it is
  • The size of its user base
  • What the au­thors’ his­toric ap­proach to breakage has been

The paper now men­tions as sur­prising some­thing which I com­pletely ex­pec­ted:

In­ter­est­ingly, there was al­most no men­tion of tra­di­tional en­cap­su­la­tion strategies to isolate the im­pact of changes to up­stream mod­ules, contra to our ex­pect­a­tions and typ­ical soft­ware-en­gin­eering teach­ing. Only N6 men­tioned de­vel­oping an ab­strac­tion layer between his package and an up­stream de­pend­ency

I don’t think I’ve seen a pro­ject in­tro­duce a layer of ab­strac­tion between a de­pend­ency and its use, ex­cept in cases where one of mul­tiple de­pend­en­cies will be used (like using one out of sev­eral data­base lib­rar­ies, but providing a con­sistent in­ter­face). Maybe this would be a good idea some­times, but I feel like in most situ­ations it’s just adding extra com­plexity and main­ten­ance burden for little be­ne­fit.
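
For reference, this is roughly what such an abstraction layer looks like. It is a minimal sketch with hypothetical names (`Serializer`, `JsonSerializer`, `save_settings`), using the standard-library `json` module as a stand-in for an upstream dependency. It is not what N6 described, just the general encapsulation pattern.

```python
import json
from typing import Any, Protocol


class Serializer(Protocol):
    """The narrow interface the rest of the package is allowed to use."""

    def dumps(self, value: Any) -> str: ...
    def loads(self, text: str) -> Any: ...


class JsonSerializer:
    """Adapter around the standard-library json module, standing in for
    an upstream dependency. Only this class touches the upstream API."""

    def dumps(self, value: Any) -> str:
        return json.dumps(value, sort_keys=True)

    def loads(self, text: str) -> Any:
        return json.loads(text)


def save_settings(serializer: Serializer, settings: dict) -> str:
    """Package code depends on Serializer, not on json directly."""
    return serializer.dumps(settings)


print(save_settings(JsonSerializer(), {"b": 1, "a": 2}))  # {"a": 2, "b": 1}
```

If the upstream API changes, only the adapter class needs updating. The cost is the one complained about above: an extra interface to design and maintain, which only pays off if the dependency really does change or get swapped.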

The paper wraps up with some dis­cus­sion of the ten­sion between policies, val­ues, and prac­tice:

For ex­ample there is a ten­sion in Ec­lipse between the policy and prac­tice of se­mantic ver­sion­ing. Ec­lipse has a long-standing ver­sioning policy sim­ilar to se­mantic ver­sioning and the plat­form’s sta­bility is re­flected in the fact that many pack­ages have not changed their major ver­sion number in over 10 years. However, even for the few cases of breaking changes that are clearly doc­u­mented in the re­lease notes, such as re­moving de­prec­ated func­tions, major ver­sions are often not in­creased, be­cause, as E8 told us, up­dating a major ver­sion number can ripple ver­sion up­dates to down­stream pack­ages, and can en­tail sig­ni­ficant work for the down­stream pro­jects.

This is some­thing I struggle with as a lib­rary user in Haskell: if I change the ver­sion bounds on one of my de­pend­en­cies, how ex­actly does that trans­late into a ver­sion change for me? Some­times it’s not so clear.
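
The basic rule that semantic versioning prescribes is simple to state, which is part of why the gap between policy and practice is interesting. A minimal sketch, assuming plain `MAJOR.MINOR.PATCH` version strings with a major version of at least 1, and a hypothetical `next_version` helper with made-up change labels:

```python
def next_version(current: str, change: str) -> str:
    """Return the next version under the basic semantic versioning rule.

    `change` is a hypothetical label for the kind of change being released:
    "breaking", "feature" (backwards compatible), or "fix" (backwards compatible).
    """
    major, minor, patch = (int(part) for part in current.split("."))
    if change == "breaking":
        # Incompatible API change: bump major, reset minor and patch.
        return f"{major + 1}.0.0"
    if change == "feature":
        # Backwards-compatible addition: bump minor, reset patch.
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        # Backwards-compatible bug fix: bump patch only.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")


print(next_version("3.4.1", "breaking"))  # 4.0.0
print(next_version("3.4.1", "feature"))   # 3.5.0
print(next_version("3.4.1", "fix"))       # 3.4.2
```

The hard part is not the arithmetic but deciding which label applies. A changed dependency bound, or a bug fix that breaks someone’s workaround, doesn’t fit neatly into any of the three.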

So, to con­clude:

How to break an API: In Ec­lipse, you don’t. In R/CRAN, you reach out to af­fected down­stream de­velopers. In Node.js/npm, you in­crease the major ver­sion num­ber.
