<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>barrucadu&#39;s memos - Research</title>
  <link href="https://memo.barrucadu.co.uk/taxon/research.xml" rel="self" />
  <link href="https://memo.barrucadu.co.uk/" />
  <id>https://memo.barrucadu.co.uk/taxon/research.xml</id>
  <author>
    <name>Michael Walker</name>
    <email>mike@barrucadu.co.uk</email>
  </author>
  
  <updated>2021-05-30T00:00:00Z</updated>
  
  
  <entry>
    <title>It&#39;s not a no-op to unmask an interruptible operation (and dejafu detects this)</title>
    <link href="https://memo.barrucadu.co.uk/restore-interruptible.html" />
    <id>https://memo.barrucadu.co.uk/restore-interruptible.html</id>
    <published>2021-05-30T00:00:00Z</published>
    <updated>2021-05-30T00:00:00Z</updated>
    <summary type="html">
      <![CDATA[
<p>User effectfully on reddit wrote an article <a href="https://github.com/effectfully-ou/sketches/tree/master/restore-interruptible">It’s not a no-op to unmask an interruptible operation</a> (<a href="https://old.reddit.com/r/haskell/comments/nntfui/its_not_a_noop_to_unmask_an_interruptible/">reddit discussion</a>) about a small gotcha with interruptible operations and asynchronous exceptions.</p>
<p>The gist of it is that this snippet of code:</p>
<pre class="haskell"><code>mask $ \restore -&gt; do
  putMVar var x
  ...</code></pre>
<p>behaves differently to this snippet of code:</p>
<pre class="haskell"><code>mask $ \restore -&gt; do
  restore $ putMVar var x
  ...</code></pre>
<p>in the presence of asynchronous exceptions. The post goes on to explain what the different behaviours are and why they crop up; but thinking about concurrency is too much like effort, so let’s turn to <a href="http://hackage.haskell.org/package/dejafu">dejafu</a>!</p>
<h2 id="no-restore-around-the-put">No restore around the put</h2>
<p>In this test case, I want to see</p>
<ol type="1">
<li>if the <code>putMVar var x</code> is interrupted by an asynchronous exception; and</li>
<li>if the <code>...</code> bit of code gets executed</li>
</ol>
<p>So the actual test case is a bit more complex than just the snippet above. We’re going to need three threads:</p>
<pre class="haskell"><code>thread1 = mask $ \restore -&gt; catch
  (putMVar var &quot;hello world&quot; &gt;&gt; putMVar success True)
  (\(_ :: SomeException) -&gt; putMVar success False)

thread2 = putMVar var &quot;interrupted!&quot;

thread3 = killThread thread1</code></pre>
<p>Putting it together into an actual test case, we get:</p>
<pre class="haskell"><code>{-# LANGUAGE ScopedTypeVariables #-}

import Control.Concurrent.Classy
import Control.Exception (SomeException)

example1 :: MonadConc m =&gt; m (String, Bool)
example1 = do
  var &lt;- newEmptyMVar
  success &lt;- newEmptyMVar
  interruptMe &lt;- newEmptyMVar

  tid &lt;- fork $ mask $ \_ -&gt; do
    putMVar interruptMe ()
    catch
      (putMVar var &quot;hello world&quot; &gt;&gt; putMVar success True)
      (\(_ :: SomeException) -&gt; putMVar success False)

  -- wait for the thread to be inside the `mask`, then fork a thread
  -- to race on the `putMVar` and also throw an async exception.
  takeMVar interruptMe
  _ &lt;- fork $ putMVar var &quot;interrupted!&quot;
  killThread tid

  (,) &lt;$&gt; readMVar var &lt;*&gt; readMVar success</code></pre>
<p>There’s a little extra ceremony involved in making sure that the race happens <em>after</em> the <code>mask</code>—we need a new <code>interruptMe</code> <code>MVar</code>—but other than that it’s fairly straightforward.</p>
<p>dejafu finds two behaviours for this example, and gives abbreviated execution traces:</p>
<pre><code>&gt; autocheck example1
[pass] Successful
[fail] Deterministic
    (&quot;hello world&quot;,True) S0-----S1--------S0------

    (&quot;interrupted!&quot;,False) S0-----S1---P0---S2--S1-S0---S1---S0--
False</code></pre>
<h2 id="do-restore-around-the-put">Do restore around the put</h2>
<p>Here’s our new test case:</p>
<pre class="haskell"><code>{-# LANGUAGE ScopedTypeVariables #-}

import Control.Concurrent.Classy
import Control.Exception (SomeException)

example2 :: MonadConc m =&gt; m (String, Bool)
example2 = do
  interruptMe &lt;- newEmptyMVar
  var &lt;- newEmptyMVar
  success &lt;- newEmptyMVar

  tid &lt;- fork $ mask $ \restore -&gt; do
    putMVar interruptMe ()
    catch
      (restore (putMVar var &quot;hello world&quot;) &gt;&gt; putMVar success True)
      (\(_ :: SomeException) -&gt; putMVar success False)

  -- wait for the thread to be inside the `mask`, then fork a thread
  -- to race on the `putMVar` and also throw an async exception.
  takeMVar interruptMe
  _ &lt;- fork $ putMVar var &quot;interrupted!&quot;
  killThread tid

  (,) &lt;$&gt; readMVar var &lt;*&gt; readMVar success</code></pre>
<p>Lo and behold, dejafu finds a <em>third</em> behaviour:</p>
<pre><code>&gt; autocheck example2
[pass] Successful
[fail] Deterministic
    (&quot;hello world&quot;,True) S0-----S1-----------S0------

    (&quot;hello world&quot;,False) S0-----S1-----P0-----S1---S0--

    (&quot;interrupted!&quot;,False) S0-----S1----P0----S1---S2--S0---
False</code></pre>
<p>So it seems that we can now end up in the situation where the <code>putMVar var "hello world"</code> does happen, but <em>after</em> writing to the <code>MVar</code> the asynchronous exception is delivered and so we hit the <code>putMVar success False</code> case.</p>
<p>Weird, right?</p>
<h2 id="whats-the-difference">What’s the difference?</h2>
<p>We can get the actual execution trace for the new case with a lower-level function in dejafu, <code>runSCT</code>. Digging through it, we can find the pre-emption of thread 1 (the first thread forked) by thread 0 (the main thread):</p>
<pre class="haskell"><code>(SwitchTo main, [(1, WillResetMasking True MaskedInterruptible)], TakeMVar 1 [])</code></pre>
<p>This says that we switched to the main thread, and it performed a <code>takeMVar</code> operation. And furthermore, that thread 1 <em>will next</em> reset the masking state back to <code>MaskedInterruptible</code>.</p>
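<p>For anyone who wants to dig through the traces themselves, here is a minimal sketch of a helper built on <code>runSCT</code>. The name <code>dumpTraces</code> is made up for illustration, and the module layout is assumed to be that of dejafu 2.x, so check it against the version you have installed:</p>
<pre class="haskell"><code>import Control.Monad (forM_)
import Test.DejaFu.Conc (ConcIO)
import Test.DejaFu.SCT (runSCT)
import Test.DejaFu.Settings (defaultMemType, defaultWay)

-- Hypothetical helper: explore the schedules of a program and print
-- every result followed by its full execution trace.
dumpTraces :: Show a =&gt; ConcIO a -&gt; IO ()
dumpTraces program = do
  results &lt;- runSCT defaultWay defaultMemType program
  forM_ results $ \(outcome, trace) -&gt; do
    -- `outcome` is either a failure condition or the program's result
    print outcome
    -- each trace entry is (scheduling decision, lookahead of the
    -- other runnable threads, action performed)
    mapM_ print trace
    putStrLn &quot;&quot;</code></pre>
<p>Calling <code>dumpTraces example2</code> should print entries in the same shape as the one shown above, though the output is long and the entry of interest has to be found by hand.</p>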
<p>Now the issue becomes clear. The problematic snippet:</p>
<pre class="haskell"><code>mask $ \restore -&gt; do
  restore $ putMVar var x
  ...</code></pre>
<p>actually performs these steps:</p>
<ol type="1">
<li>Change the masking state to <code>MaskedInterruptible</code></li>
<li>Change the masking state to <code>Unmasked</code></li>
<li>Do <code>putMVar var x</code></li>
<li>Reset the masking state back to <code>MaskedInterruptible</code></li>
<li>Do <code>...</code></li>
</ol>
<p>The issue is that completing the <code>putMVar var x</code> call and resetting the masking state are <em>two</em> separate operations, not one atomic step. So there is a window between steps 3 and 4 in which the thread is still unmasked, and an exception can be delivered there.</p>
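<p>The masking state changes in that list can be observed directly. The following is a small sketch using plain <code>IO</code> and <code>getMaskingState</code> from <code>Control.Exception</code> in <code>base</code>, rather than dejafu’s <code>MonadConc</code>; it assumes it is started from an unmasked thread, as <code>main</code> normally is:</p>
<pre class="haskell"><code>import Control.Exception (getMaskingState, mask)

main :: IO ()
main = mask $ \restore -&gt; do
  -- step 1: inside `mask`, async exceptions are masked (interruptibly)
  getMaskingState &gt;&gt;= print
  -- steps 2 to 4: `restore` switches to the enclosing state (unmasked
  -- here), runs the action, then switches back as a separate step
  restore (getMaskingState &gt;&gt;= print)
  -- step 5: the rest of the block runs masked again
  getMaskingState &gt;&gt;= print</code></pre>
<p>Under that assumption this prints <code>MaskedInterruptible</code>, <code>Unmasked</code>, <code>MaskedInterruptible</code>. The switch back after the restored action is the point at which a pending exception can still arrive.</p>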
<p>And that’s the issue explained in <a href="https://github.com/effectfully-ou/sketches/tree/master/restore-interruptible">It’s not a no-op to unmask an interruptible operation</a>, replicated with dejafu.</p>

      ]]>
    </summary>
  </entry>
  
  <entry>
    <title>Interesting Research</title>
    <link href="https://memo.barrucadu.co.uk/interesting-research.html" />
    <id>https://memo.barrucadu.co.uk/interesting-research.html</id>
    <published>2017-11-20T00:00:00Z</published>
    <updated>2020-06-24T00:00:00Z</updated>
    <summary type="html">
      <![CDATA[
<aside class="highlight">
This is out of date and in need of review.
</aside>
<h2 id="sources">Sources</h2>
<h3 id="groups">Groups</h3>
<ul>
<li><a href="http://www.sigact.org">ACM Special Interest Group on Algorithms and Computation Theory</a> (ACM SIGACT)</li>
<li><a href="http://www.sigplan.org">ACM Special Interest Group on Programming Languages</a> (ACM SIGPLAN)</li>
<li><a href="http://www.sigsoft.org">ACM Special Interest Group on Software Engineering</a> (ACM SIGSOFT)</li>
<li><a href="http://www.cs.ox.ac.uk/ralf.hinze/WG2.8">IFIP Working Group 2.8</a></li>
<li><a href="https://www.microsoft.com/en-us/research/group/research-in-software-engineering-rise/">Research in Software Engineering</a> (RiSE)</li>
</ul>
<h3 id="journals">Journals</h3>
<ul>
<li><a href="https://www.cambridge.org/core/journals/journal-of-functional-programming">Journal of Functional Programming</a> (JFP)</li>
<li><a href="http://topc.acm.org">Transactions on Parallel Computing</a> (TOPC)</li>
</ul>
<h3 id="events">Events</h3>
<ul>
<li><a href="http://splashcon.org">Conference on Systems, Programming, Languages and Applications: Software for Humanity</a> (SPLASH)</li>
<li><a href="http://conf.researchr.org/series/pldi">Conference on Programming Language Design and Implementation</a> (PLDI)</li>
<li><a href="https://www.dagstuhl.de">Dagstuhl</a></li>
<li><a href="http://ase-conferences.org">International Conference on Automated Software Engineering</a> (ASE)</li>
<li><a href="http://www.icfpconference.org">International Conference on Functional Programming</a> (ICFP)
<ul>
<li><a href="http://cufp.org">Commercial Users of Functional Programming</a> (CUFP)</li>
<li><a href="https://wiki.haskell.org/HaskellImplementorsWorkshop">Haskell Implementors Workshop</a> (HIW)</li>
<li><a href="https://www.haskell.org/haskell-symposium">Haskell Symposium / Symposium on Haskell</a> (Haskell)</li>
</ul></li>
<li><a href="https://runtime-verification.github.io">International Conference on Runtime Verification</a> (RV)</li>
<li><a href="http://www.icse-conferences.org">International Conference on Software Engineering</a> (ICSE)</li>
<li><a href="https://sites.google.com/site/ictssmain/home">International Conference on Testing Software and Systems</a> (ICTSS)</li>
<li><a href="https://tacas.info/">International Conference on Tools and Algorithms for the Construction and Analysis of Systems</a> (TACAS)</li>
<li><a href="http://www.disc-conference.org/wp">International Symposium on Distributed Computing</a> (DISC)</li>
<li><a href="http://conf.researchr.org/series/ismm">International Symposium on Memory Management</a> (ISMM)</li>
<li><a href="http://conf.researchr.org/series/issta">International Symposium on Software Testing and Analysis</a> (ISSTA)</li>
<li><a href="http://www.esec-fse.org">Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering</a> (ESEC/FSE)</li>
<li><a href="http://conf.researchr.org/series/PPoPP">Principles and Practice of Parallel Programming</a> (PPoPP)</li>
<li><a href="http://www.podc.org">Symposium on Principles of Distributed Computing</a> (PODC)</li>
<li><a href="http://conf.researchr.org/series/POPL">Symposium on Principles of Programming Languages</a> (POPL)</li>
</ul>
<h3 id="mailing-lists">Mailing lists</h3>
<ul>
<li><a href="https://mail.haskell.org/cgi-bin/mailman/listinfo/haskell">Haskell</a></li>
<li><a href="https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=SREPLS">SREPLS</a></li>
</ul>
<h3 id="blogs">Blogs</h3>
<ul>
<li><a href="https://blog.acolyer.org/">The Morning Paper</a></li>
</ul>
<h2 id="topics">Topics</h2>
<h3 id="systematic-concurrency-testing-sct">Systematic concurrency testing (SCT)</h3>
<p>Deterministic testing for concurrent programs, by controlling the scheduling decisions made to intelligently explore the state space. Can be complete or incomplete. Draws from model checking and program verification.</p>
<h4 id="people">People</h4>
<ul>
<li><strong><a href="http://multicore.doc.ic.ac.uk/people/ally-donaldson/">Alastair F. Donaldson</a></strong><a href="interesting-research.html#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a></li>
<li><strong><a href="http://www-bcf.usc.edu/~wang626">Chao Wang</a></strong></li>
<li><strong><a href="https://parasol.tamu.edu/~jeff/">Jeff Huang</a></strong></li>
<li><strong><a href="https://scholar.google.co.uk/citations?hl=en&amp;user=ijCSV_wAAAAJ&amp;view_op=list_works&amp;sortby=pubdate">Konstantinos Sagonas</a></strong></li>
<li><strong><a href="https://people.eecs.berkeley.edu/~ksen">Koushik Sen</a></strong></li>
<li><strong><a href="https://www.microsoft.com/en-us/research/people/madanm">Madanlal (Madan) Musuvathi</a></strong></li>
<li><strong><a href="http://research.microsoft.com/en-us/um/people/pg">Patrice Godefroid</a></strong></li>
<li><strong><a href="https://www.microsoft.com/en-us/research/people/sburckha">Sebastian Burckhardt</a></strong></li>
<li><strong><a href="https://www.microsoft.com/en-us/research/people/qadeer">Shaz Qadeer</a></strong></li>
<li><a href="http://www.doc.ic.ac.uk/~abetts/">Adam Betts</a></li>
<li><a href="https://www.microsoft.com/en-us/research/people/akashl">Akash Lal</a></li>
<li><a href="http://www.soundandcomplete.org/">Azalea Raad</a></li>
<li><a href="http://user.it.uu.se/~bengt/">Bengt Jonsson</a></li>
<li><a href="https://dl.acm.org/profile/87958953157">Burcu Kulahcioglu Ozkan</a></li>
<li><a href="http://research.microsoft.com/en-us/um/people/chengh">Cheng Huang</a></li>
<li><a href="http://multicore.doc.ic.ac.uk/people/christopher-lidbury/">Christopher Lidbury</a></li>
<li><a href="https://www-users.cs.york.ac.uk/colin">Colin Runciman</a></li>
<li><a href="https://users.soe.ucsc.edu/~cormac">Cormac Flanagan</a></li>
<li><a href="https://people.eecs.berkeley.edu/~necula/">George Necula</a></li>
<li><a href="https://www.burn.im/">Jacob Burnim</a></li>
<li><a href="http://www.ketema.eu">Jeroen Ketema</a></li>
<li>John Erickson</li>
<li><a href="https://dl.acm.org/profile/81381599031">Katherine E. Coons</a></li>
<li><a href="http://www.cs.utexas.edu/users/mckinley">Kathryn S. McKinley</a></li>
<li>Magnus Lång</li>
<li>Mahmoud Abdelrasoul</li>
<li><a href="https://markus-kusano.github.io">Markus Kusano</a></li>
<li><a href="http://web.mit.edu/rmccutch/www">Matt McCutchen</a></li>
<li><a href="http://michael-emmi.github.io">Michael Emmi</a></li>
<li>Michalis Kokologiannakis</li>
<li><a href="https://www.cis.upenn.edu/~milom/">Milo M. K. Martin</a></li>
<li><a href="http://www.it.uu.se/katalog/mohat117">Mohamed Faouzi Atig</a></li>
<li><a href="https://sites.google.com/site/nalingzhang/vt">Naling Zhang</a></li>
<li><a href="http://pdeligia.github.io">Pantazis Deligiannis</a></li>
<li><a href="http://user.it.uu.se/~parosh/">Parosh Aziz Abdulla</a></li>
<li><a href="http://www.doc.ic.ac.uk/~pt1110/">Paul Thomson</a></li>
<li><a href="http://www.cs.princeton.edu/~kothari">Pravesh Kothari</a></li>
<li><a href="https://sites.google.com/view/rashmi/home">Rashmi Mudduluru</a></li>
<li><a href="https://people.mpi-sws.org/~rupak/">Rupak Majumdar</a></li>
<li><a href="https://www.cs.rutgers.edu/~santosh.nagarakatte">Santosh Nagarakatte</a></li>
<li><a href="https://www.microsoft.com/en-us/research/people/shuochen">Shuo Chen</a></li>
<li>Simin Oraee</li>
<li>Stavros Aronis</li>
<li><a href="https://sites.google.com/site/tayfunelmas/">Tayfun Elmas</a></li>
<li><a href="https://phongngo.github.io/">Tuan Phong Ngo</a></li>
<li><a href="https://dblp.uni-trier.de/pers/v/Vafeiadis:Viktor.html">Viktor Vafeiadis</a></li>
<li><a href="https://www.microsoft.com/en-us/research/people/schulte">Wolfram Schulte</a></li>
<li><a href="http://www.zvonimir.info">Zvonimir Rakamaric</a></li>
</ul>
<h4 id="papers">Papers</h4>
<ul>
<li><p><strong>Partial-Order Methods for the Verification of Concurrent Systems: An Approach to the State-Explosion Problem</strong> (<a href="http://research.microsoft.com/en-us/um/people/pg/public_psfiles/thesis.ps">ps</a>)<br> Patrice Godefroid.<br> PhD thesis, 1996.</p></li>
<li><p><strong>Dynamic partial-order reduction for model checking software</strong> (<a href="https://users.soe.ucsc.edu/~cormac/papers/popl05.pdf">pdf</a>)<br> Cormac Flanagan and Patrice Godefroid.<br> In <em>Symposium on Principles of Programming Languages</em> (POPL). 2005.</p></li>
<li><p><strong>Iterative Context Bounding for Systematic Testing of Multithreaded Programs</strong> (<a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/chess-pldi07-iterativecontextbounding.pdf">pdf</a>)<br> Madanlal Musuvathi and Shaz Qadeer.<br> In <em>Conference on Programming Language Design and Implementation</em> (PLDI). 2007.</p></li>
<li><p><strong>Effective Random Testing of Concurrent Programs</strong> (<a href="https://people.eecs.berkeley.edu/~ksen/papers/fuzzpar.pdf">pdf</a>)<br> Koushik Sen.<br> In <em>International Conference on Automated Software Engineering</em> (ASE). 2007.</p></li>
<li><p><strong>Fair Stateless Model Checking</strong> (<a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/pldi08-FairStatelessModelChecking.pdf">pdf</a>)<br> Madanlal Musuvathi and Shaz Qadeer.<br> In <em>Conference on Programming Language Design and Implementation</em> (PLDI). 2008.</p></li>
<li><p><strong>Race Directed Random Testing of Concurrent Programs</strong> (<a href="https://www.cs.columbia.edu/~junfeng/10fa-e6998/papers/racefuzz.pdf">pdf</a>)<br> Koushik Sen.<br> In <em>Conference on Programming Language Design and Implementation</em> (PLDI). 2008.</p></li>
<li><p><strong>A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs</strong> (<a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/paper-83.pdf">pdf</a>)<br> Sebastian Burckhardt, Pravesh Kothari, Madanlal Musuvathi, and Santosh Nagarakatte.<br> In <em>International Conference on Architectural Support for Programming Languages and Operating Systems</em> (ASPLOS). 2010.</p></li>
<li><p><strong>Delay-bounded Scheduling</strong> (<a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/msr-tr-2010-123.pdf">pdf</a>)<br> Michael Emmi, Shaz Qadeer, and Zvonimir Rakamaric.<br> In <em>Symposium on Principles of Programming Languages</em> (POPL). 2011.</p></li>
<li><p><strong>Multicore Acceleration of Priority-Based Schedulers for Concurrency Bug Detection</strong> (<a href="https://dl.acm.org/doi/10.1145/2345156.2254128">pdf</a>)<br> Santosh Nagarakatte, Sebastian Burckhardt, Milo M. K. Martin, and Madanlal Musuvathi.<br> In <em>ACM SIGPLAN Notices</em>. June 2012.</p></li>
<li><p><strong>Bounded Partial-order Reduction</strong> (<a href="http://www.cs.utexas.edu/users/mckinley/papers/bpor-oopsla-2013.pdf">pdf</a>)<br> Katherine E. Coons, Madan Musuvathi, and Kathryn S. McKinley.<br> In <em>International Conference on Object Oriented Programming Systems, Languages &amp; Applications</em> (OOPSLA). 2013.</p></li>
<li><p><strong>CONCURRIT: A Domain Specific Language for Reproducing Concurrency Bugs</strong> (<a href="https://people.eecs.berkeley.edu/~ksen/papers/concurrit.pdf">pdf</a>)<br> Tayfun Elmas, Jacob Burnim, George Necula, Koushik Sen.<br> In <em>Conference on Programming Language Design and Implementation</em> (PLDI). 2013.</p></li>
<li><p><strong>Concurrency Testing Using Schedule Bounding: an Empirical Study</strong> (<a href="http://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2014/PPoPP.pdf">pdf</a>)<br> Paul Thomson, Alastair F. Donaldson, and Adam Betts.<br> In <em>Symposium on Principles and Practice of Parallel Programming</em> (PPoPP). 2014.</p></li>
<li><p><strong>Dynamic Partial Order Reduction for Relaxed Memory Models</strong> (<a href="http://www-bcf.usc.edu/~wang626/pubDOC/ZhangKW15.pdf">pdf</a>)<br> Naling Zhang, Markus Kusano, and Chao Wang.<br> In <em>Conference on Programming Language Design and Implementation</em> (PLDI). 2015.</p></li>
<li><p><strong>Asynchronous Programming, Analysis and Testing with State Machines</strong> (<a href="http://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2015/PLDI_PSharp.pdf">pdf</a>)<br> Pantazis Deligiannis, Alastair F. Donaldson, Jeroen Ketema, Akash Lal, and Paul Thomson.<br> In <em>Conference on Programming Language Design and Implementation</em> (PLDI). 2015.</p></li>
<li><p><strong>Stateless Model Checking Concurrent Programs with Maximal Causality Reduction</strong> (<a href="https://parasol.tamu.edu/~jeff/academic/mcr.pdf">pdf</a>)<br> Jeff Huang.<br> In <em>Conference on Programming Language Design and Implementation</em> (PLDI). 2015.</p></li>
<li><p><strong>Concurrency Testing Using Controlled Schedulers: An Empirical Study</strong> (<a href="http://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2016/TOPC.pdf">pdf</a>)<br> Paul Thomson, Alastair F. Donaldson, and Adam Betts.<br> In <em>Transactions on Parallel Computing</em> (TOPC). 2016.</p></li>
<li><p><strong>Uncovering Bugs in Distributed Storage Systems during Testing (not in Production!)</strong> (<a href="http://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2016/FAST.pdf">pdf</a>)<br> Pantazis Deligiannis, Matt McCutchen, Paul Thomson, Shuo Chen, Alastair F. Donaldson, John Erickson, Cheng Huang, Akash Lal, Rashmi Mudduluru, Shaz Qadeer, and Wolfram Schulte.<br> In <em>Conference on File and Storage Technologies</em> (FAST). 2016.</p></li>
<li><p><strong>Promoting Secondary Orders of Event Pairs in Randomized Scheduling using a Randomized Stride</strong> (<a href="http://www4.ncsu.edu/~maabdelf/randomizedstride/paper.pdf">pdf</a>)<br> Mahmoud Abdelrasoul.<br> In <em>International Conference on Automated Software Engineering</em> (ASE). 2017.</p></li>
<li><p><strong>Optimal Dynamic Partial Order Reduction with Observers</strong> (<a href="https://www.oapen.org/download?type=document&amp;docid=1002306">pdf</a>)<br> Stavros Aronis, Bengt Jonsson, Magnus Lång, and Konstantinos Sagonas.<br> In <em>International Conference on Tools and Algorithms for the Construction and Analysis of Systems</em> (TACAS). 2018.</p></li>
<li><p><strong>Optimal Stateless Model Checking under the Release-Acquire Semantics</strong> (<a href="http://user.it.uu.se/~bengt/Papers/Full/oopsla18.pdf">pdf</a>)<br> Parosh Aziz Abdulla, Mohamed Faouzi Atig, Bengt Jonsson, Tuan Phong Ngo.<br> In <em>International Conference on Object Oriented Programming Systems, Languages &amp; Applications</em> (OOPSLA). 2018.</p></li>
<li><p><strong>Effective lock handling in stateless model checking</strong> (<a href="https://dl.acm.org/doi/10.1145/3360599">pdf</a>)<br> Michalis Kokologiannakis, Azalea Raad, and Viktor Vafeiadis.<br> In <em>International Conference on Object Oriented Programming Systems, Languages &amp; Applications</em> (OOPSLA). 2019.</p></li>
<li><p><strong>Sparse record and replay with controlled scheduling</strong> (<a href="https://dl.acm.org/doi/10.1145/3314221.3314635">pdf</a>)<br> Christopher Lidbury and Alastair F. Donaldson.<br> In <em>Conference on Programming Language Design and Implementation</em> (PLDI). 2019.</p></li>
<li><p><strong>Trace aware random testing for distributed systems</strong> (<a href="https://dl.acm.org/doi/10.1145/3360606">pdf</a>)<br> Burcu Kulahcioglu Ozkan, Rupak Majumdar, and Simin Oraee.<br> In <em>International Conference on Object Oriented Programming Systems, Languages &amp; Applications</em> (OOPSLA). 2019.</p></li>
</ul>
<h4 id="my-papers">My papers</h4>
<ul>
<li><strong>Déjà Fu: A Concurrency Testing Library for Haskell</strong> (<a href="https://www.barrucadu.co.uk/publications/dejafu-hs15.pdf">pdf</a>)<br> Michael Walker and Colin Runciman.<br> In <em>Symposium on Haskell</em> (Haskell). 2015.</li>
</ul>
<h4 id="venues">Venues</h4>
<ul>
<li>Conference on Automated Software Engineering (ASE)</li>
<li>Conference on File and Storage Technologies (FAST)</li>
<li>Conference on Programming Language Design and Implementation (PLDI)</li>
<li>International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)</li>
<li>International Conference on Automated Software Engineering (ASE)</li>
<li>International Conference on Object Oriented Programming Systems, Languages &amp; Applications (OOPSLA, now SPLASH)</li>
<li>Symposium on Principles and Practice of Parallel Programming (PPoPP)</li>
<li>Symposium on Principles of Programming Languages (POPL)</li>
<li>Transactions on Parallel Computing (TOPC)</li>
<li>International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS)</li>
</ul>
<h3 id="test-case-generation">Test case generation</h3>
<p>Test cases are hard to write by hand, so rather than do that, have a tool attempt to discover interesting ones. By reading the output, a programmer can (a) add good tests to the test suite; and (b) spot potential issues when expected tests don’t show up, or when unexpected ones do.</p>
<h4 id="people-1">People</h4>
<ul>
<li><strong><a href="http://www.cs.cmu.edu/~agroce">Alex Groce</a></strong></li>
<li><strong><a href="https://www.drmaciver.com/">David R. MacIver</a></strong></li>
<li><strong><a href="http://www.cs.utah.edu/~regehr">John Regehr</a></strong></li>
<li><strong><a href="http://www.cse.chalmers.se/~koen/">Koen Claessen</a></strong></li>
<li><strong><a href="http://www.cse.chalmers.se/~nicsma/">Nicholas Smallbone</a></strong></li>
<li><strong><a href="https://matela.com.br/">Rudy Braquehais</a></strong></li>
<li><a href="http://www.imperial.ac.uk/people/alastair.donaldson">Alastair F. Donaldson</a></li>
<li><a href="https://andreamattavelli.github.io/">Andrea Mattavelli</a></li>
<li><a href="https://dblp.org/pers/c/Christi:Arpit.html">Arpit Christi</a></li>
<li><a href="http://mir.cs.illinois.edu/~awshi2/">August Shi</a></li>
<li><a href="http://pages.cs.wisc.edu/~aws/">Aws Albarghouthi</a></li>
<li><a href="https://www.cis.upenn.edu/~bcpierce/">Benjamin C. Pierce</a></li>
<li><a href="http://pages.cs.wisc.edu/~cjsmith/">Calvin Smith</a></li>
<li><a href="https://www.carolemieux.com/">Caroline Lemieux</a></li>
<li>Chaoqiang Zhang</li>
<li><a href="https://www-users.cs.york.ac.uk/colin/">Colin Runciman</a></li>
<li><a href="https://dblp.uni-trier.de/pers/m/Marinov:Darko.html">Darko Marinov</a></li>
<li><a href="http://www.cs.utah.edu/~eeide">Eric Eide</a></li>
<li>Gabriel Ferns</li>
<li><a href="https://dl.acm.org/profile/81100394655">Giovanni Denaro</a></li>
<li>Javier Paris</li>
<li><a href="http://staffwww.dcs.shef.ac.uk/people/J.Derrick/">John Derrick</a></li>
<li><a href="http://www.cse.chalmers.se/~rjmh/">John Hughes</a></li>
<li>Josie Holmes</li>
<li><a href="http://staffwww.dcs.shef.ac.uk/people/K.Bogdanov/">Kirill Bogdanov</a></li>
<li><a href="https://people.eecs.berkeley.edu/~ksen">Koushik Sen</a></li>
<li><a href="https://lemonidas.github.io/">Leonidas Lampropoulos</a></li>
<li><a href="https://personal.utdallas.edu/~lxz144130/">Lingming Zhang</a></li>
<li><a href="https://www.inf.usi.ch/faculty/pezze/">Mauro Pezze</a></li>
<li><a href="https://scholar.google.se/citations?user=KGd-EW8AAAAJ&amp;hl=en">Maximilian Algehed</a></li>
<li><a href="http://www.cs.umd.edu/~mwh/">Michael Hicks</a></li>
<li><a href="https://sites.google.com/site/mikepapadakis/home">Mike Papadakis</a></li>
<li><a href="http://www.cse.chalmers.se/~jomoa/">Moa Johansson</a></li>
<li><a href="http://alipourm.github.io">Mohammad Amin Alipour</a></li>
<li><a href="http://www2.le.ac.uk/departments/informatics/people/neil-walkinshaw">Neil Walkinshaw</a></li>
<li><a href="https://sites.google.com/site/pietrobraione/">Pietro Braione</a></li>
<li><a href="https://rahul.gopinath.org">Rahul Gopinath</a></li>
<li><a href="https://rohan.padhye.org/">Rohan Padhye</a></li>
<li><a href="http://www.cs.utah.edu/~chenyang">Yang Chen</a></li>
<li><a href="https://wwwfr.uni.lu/snt/people/yves_le_traon">Yves Le Traon</a></li>
</ul>
<h4 id="papers-1">Papers</h4>
<ul>
<li><p><strong>Increasing Functional Coverage by Inductive Testing: A Case Study</strong> (<a href="https://hal.archives-ouvertes.fr/file/index/docid/1055254/filename/document.pdf">pdf</a>)<br> Neil Walkinshaw, Kirill Bogdanov, John Derrick, and Javier Paris.<br> In <em>Conference on Testing Software and Systems</em> (ICTSS). 2010.</p></li>
<li><p><strong>QuickSpec: Guessing Formal Specifications Using Testing</strong> (<a href="http://publications.lib.chalmers.se/records/fulltext/local_125255.pdf">pdf</a>)<br> Koen Claessen, Nicholas Smallbone, John Hughes.<br> In <em>Conference on Tests and Proofs</em> (TAP). 2010.</p></li>
<li><p><strong>Swarm Testing</strong> (<a href="http://www.cs.cmu.edu/~agroce/issta12.pdf">pdf</a>)<br> Alex Groce, Chaoqiang Zhang, Eric Eide, Yang Chen, and John Regehr.<br> In <em>International Symposium on Software Testing and Analysis</em> (ISSTA). 2012.</p></li>
<li><p><strong>FitSpec: refining property sets for functional testing</strong> (<a href="https://matela.com.br/papers/fitspec.pdf">pdf</a>)<br> Rudy Braquehais and Colin Runciman.<br> In <em>Symposium on Haskell</em> (Haskell). 2016.</p></li>
<li><p><strong>Generating Focused Random Tests Using Directed Swarm Testing</strong> (<a href="http://www.cs.cmu.edu/~agroce/issta16.pdf">pdf</a>)<br> Mohammad Amin Alipour, Alex Groce, Rahul Gopinath, and Arpit Christi.<br> In <em>International Symposium on Software Testing and Analysis</em> (ISSTA). 2016.</p></li>
<li><p><strong>Quick Specifications for the Busy Programmer</strong> (<a href="http://www.cse.chalmers.se/~jomoa/papers/quickspec2016.pdf">pdf</a>)<br> Nicholas Smallbone, Moa Johansson, Koen Claessen, Maximilian Algehed.<br> In <em>Journal of Functional Programming</em> (JFP). 2017.</p></li>
<li><p><strong>Discovering Relational Specifications</strong> (<a href="http://pages.cs.wisc.edu/~aws/papers/fse17.pdf">pdf</a>)<br> Calvin Smith, Gabriel Ferns, and Aws Albarghouthi.<br> In <em>Foundations of Software Engineering</em> (FSE). 2017.</p></li>
<li><p><strong>Speculate: Discovering Conditional Equations and Inequalities about Black-Box Functions by Reasoning from Test Results</strong> (<a href="https://matela.com.br/papers/speculate.pdf">pdf</a>)<br> Rudy Braquehais and Colin Runciman.<br> In <em>Symposium on Haskell</em> (Haskell). 2017.</p></li>
<li><p><strong>Coverage guided, property based testing</strong> (<a href="https://dl.acm.org/doi/10.1145/3360607">pdf</a>)<br> Leonidas Lampropoulos, Michael Hicks, and Benjamin C. Pierce.<br> In <em>International Conference on Object Oriented Programming Systems, Languages &amp; Applications</em> (OOPSLA). 2019.</p></li>
<li><p><strong>An Extensible, Regular-Expression-Based Tool for Multi-Language Mutant Generation</strong> (<a href="https://agroce.github.io/icse18t.pdf">pdf</a>)<br> Alex Groce, Josie Holmes, Darko Marinov, August Shi, and Lingming Zhang.<br> In <em>International Conference on Software Engineering</em> (Tool Demonstrations) (ICSE). 2018.</p></li>
<li><p><strong>SUSHI: A Test Generator for Programs with Complex Structured Inputs</strong> (<a href="https://andreamattavelli.github.io/papers/18-icse-demo.pdf">pdf</a>)<br> Pietro Braione, Giovanni Denaro, Andrea Mattavelli, Mauro Pezze.<br> In <em>International Conference on Software Engineering</em> (Tool Demonstrations) (ICSE). 2018.</p></li>
<li><p><strong>Semantic Fuzzing with Zest</strong> (<a href="https://dl.acm.org/doi/10.1145/3293882.3330576">pdf</a>)<br> Rohan Padhye, Caroline Lemieux, Koushik Sen, Mike Papadakis, and Yves Le Traon.<br> In <em>International Symposium on Software Testing and Analysis</em> (ISSTA). 2019.</p></li>
<li><p><strong>Test-Case Reduction via Test-Case Generation: Insights From the Hypothesis Reducer</strong> (<a href="https://drmaciver.github.io/papers/reduction-via-generation-preview.pdf">pdf</a>)<br> David R. MacIver and Alastair F. Donaldson.<br> In <em>European Conference on Object-Oriented Programming</em> (ECOOP). 2020.</p></li>
</ul>
<h4 id="my-papers-1">My papers</h4>
<ul>
<li><strong>Cheap Remarks about Concurrent Programs</strong> (<a href="https://www.barrucadu.co.uk/publications/coco-flops18.pdf">pdf</a>)<br> Michael Walker and Colin Runciman.<br> In <em>Functional and Logic Programming Symposium</em> (FLOPS). 2018.</li>
</ul>
<h4 id="venues-1">Venues</h4>
<ul>
<li>European Conference on Object-Oriented Programming (ECOOP)</li>
<li>Haskell Symposium (Haskell)</li>
<li>International Conference on Object Oriented Programming Systems, Languages &amp; Applications (OOPSLA, now SPLASH)</li>
<li>International Conference on Software Engineering (ICSE)</li>
<li>International Conference on Testing Software and Systems (ICTSS)</li>
<li>International Conference on Tests and Proofs (TAP)</li>
<li>International Symposium on Software Testing and Analysis (ISSTA)</li>
<li>Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)</li>
<li>Journal of Functional Programming (JFP)</li>
</ul>
<h3 id="test-case-reduction">Test case reduction</h3>
<p>Once we have produced (either hand-written by a programmer or generated with a tool) a test case which exhibits some fault, we want to throw away all the incidental complexity, and find the simplest test case which exhibits the same bug.</p>
<p>This overlaps heavily with <strong>test case generation</strong>, but differs in motivation.</p>
<h4 id="people-2">People</h4>
<ul>
<li><strong><a href="http://www.cs.cmu.edu/~agroce">Alex Groce</a></strong></li>
<li><strong><a href="http://www.cs.utah.edu/~regehr">John Regehr</a></strong></li>
<li><a href="http://www.inf.u-szeged.hu/~akiss/">Ákos Kiss</a></li>
<li><a href="https://andreas-zeller.info/">Andreas Zeller</a></li>
<li>Chaoqiang Zhang</li>
<li><a href="https://cs.uwaterloo.ca/people-profiles/chengnian-sun">Chengnian Sun</a></li>
<li><a href="http://www.gmw6.com/">Ghassan Misherghi</a></li>
<li><a href="https://dl.acm.org/profile/99659215946">Jibesh Patra</a></li>
<li>Josie Holmes</li>
<li>Kevin Kellar</li>
<li><a href="http://software-lab.org/people/Michael_Pradel.html">Michael Pradel</a></li>
<li><a href="https://www.aminalipour.com/">Mohammed Amin Alipour</a></li>
<li><a href="https://helloqirun.github.io/">Qirun Zhang</a></li>
<li><a href="https://www.arschkrebs.de/">Ralf Hildebrandt</a></li>
<li>Renáta Hodován</li>
<li><a href="http://zhang-sai.github.io/">Sai Zhang</a></li>
<li>Satia Herfert</li>
<li><a href="http://gutianxiao.com/">Tianxiao Gu</a></li>
<li>Yang Chen</li>
<li><a href="https://www.scs.gatech.edu/people/yuanbo-li">Yuanbo Li</a></li>
<li><a href="https://people.inf.ethz.ch/suz/">Zhendong Su</a></li>
</ul>
<h4 id="papers-2">Papers</h4>
<ul>
<li><p><strong>Simplifying failure-inducing input</strong> (<a href="https://dl.acm.org/doi/10.1145/347324.348938">pdf</a>)<br> Ralf Hildebrandt and Andreas Zeller.<br> In <em>International Symposium on Software Testing and Analysis</em> (ISSTA). 2000.</p></li>
<li><p><strong>HDD: Hierarchical Delta Debugging</strong> (<a href="https://dl.acm.org/doi/10.1145/1134285.1134307">pdf</a>)<br> Ghassan Misherghi and Zhendong Su.<br> In <em>International Conference on Software Engineering</em> (ICSE). 2006.</p></li>
<li><p><strong>Practical Semantic Test Simplification</strong> (<a href="https://zhang-sai.github.io/pdf/zhang-icse13-nier.pdf">pdf</a>)<br> Sai Zhang.<br> In <em>International Conference on Software Engineering</em> (ICSE). 2013.</p></li>
<li><p><strong>Cause Reduction for Quick Testing</strong> (<a href="http://www.cs.utah.edu/~regehr/papers/icst14.pdf">pdf</a>)<br> Alex Groce, Mohammed Amin Alipour, Chaoqiang Zhang, Yang Chen, and John Regehr.<br> In <em>International Conference on Software Testing, Verification and Validation</em> (ICST). 2014.</p></li>
<li><p><strong>Practical Improvements to the Minimizing Delta Debugging Algorithm</strong> (<a href="https://www.scitepress.org/Link.aspx?doi=10.5220%2f0005988602410248">pdf</a>)<br> Renáta Hodován and Ákos Kiss.<br> In <em>International Joint Conference on Software Technologies</em> (ICSOFT). 2016.</p></li>
<li><p><strong>One Test to Rule Them All</strong> (<a href="https://www.cefns.nau.edu/~adg326/issta17.pdf">pdf</a>)<br> Alex Groce, Josie Holmes, and Kevin Kellar.<br> In <em>International Symposium on Software Testing and Analysis</em> (ISSTA). 2017.</p></li>
<li><p><strong>Automatically reducing tree-structured test inputs</strong> (<a href="https://dl.acm.org/doi/10.5555/3155562.3155669">pdf</a>)<br> Satia Herfert, Jibesh Patra, and Michael Pradel.<br> In <em>International Conference on Automated Software Engineering</em> (ASE). 2017.</p></li>
<li><p><strong>Perses: syntax-guided program reduction</strong> (<a href="https://dl.acm.org/doi/10.1145/3180155.3180236">pdf</a>)<br> Chengnian Sun, Yuanbo Li, Qirun Zhang, Tianxiao Gu, and Zhendong Su.<br> In <em>International Conference on Software Engineering</em> (ICSE). 2018.</p></li>
</ul>
<h4 id="venues-2">Venues</h4>
<ul>
<li>International Conference on Automated Software Engineering (ASE)</li>
<li>International Conference on Software Engineering (ICSE)</li>
<li>International Conference on Software Testing, Verification and Validation (ICST)</li>
<li>International Joint Conference on Software Technologies (ICSOFT)</li>
<li>International Symposium on Software Testing and Analysis (ISSTA)</li>
</ul>
<h2 id="how-to-review-this-memo">How to review this memo</h2>
<ol type="1">
<li><p>Check when the last change was. Hopefully I was kind enough at the time to note down the latest edition of all the conferences.</p></li>
<li><p>Look through new conference proceedings. Pick out papers which have interesting titles.</p></li>
<li><p>Examine the list of papers<a href="interesting-research.html#fn2" class="footnote-ref" id="fnref2" role="doc-noteref"><sup>2</sup></a>. Throw away those which don’t fit, record those which do.</p></li>
<li><p>Look through people’s websites. Again pick out papers with interesting titles.</p></li>
<li><p>If super keen, search for some keywords on Google Scholar, or look up reverse citations of papers in these lists.</p></li>
<li><p>Look through references of papers newly added for more candidate papers, people, and conferences. Prune by titles, then examine, then add. Repeat until I get bored.</p></li>
<li><p>Decide if anyone needs to be bolded (or unbolded).</p></li>
</ol>
<section id="footnotes" class="footnotes footnotes-end-of-document" role="doc-endnotes">
<hr />
<ol>
<li id="fn1"><p><strong>Bold people</strong> are those who, at the time of writing, I considered to be key figures in the field to keep an eye on.<a href="interesting-research.html#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn2"><p>There are a lot of papers, and it would take an inordinate amount of time for me to carefully read every single one which looks interesting. But usually it’s good enough to just read enough to get the gist, and then read individual papers more closely as time, interest, and need permit. <br><br> For this initial examination, I do something like the first two passes in <a href="https://web.stanford.edu/class/ee384m/Handouts/HowtoReadPaper.pdf">How to Read a Paper</a>. My second pass is usually a skim reading, unless the paper really grabs me; I save a proper second pass for when I read the paper more fully.<a href="interesting-research.html#fnref2" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>

      ]]>
    </summary>
  </entry>
  
  <entry>
    <title>dejafu-2.0.0.0</title>
    <link href="https://memo.barrucadu.co.uk/dejafu-2.0.0.0.html" />
    <id>https://memo.barrucadu.co.uk/dejafu-2.0.0.0.html</id>
    <published>2019-02-12T00:00:00Z</published>
    <updated>2019-02-12T00:00:00Z</updated>
    <summary type="html">
      <![CDATA[
<p>(this message has also been sent to <a href="https://www.reddit.com/r/haskell/comments/aq09u5/ann_dejafu2000_a_library_for_unittesting/">/r/haskell</a> and <a href="https://mail.haskell.org/pipermail/haskell-cafe/2019-February/130694.html">haskell-cafe</a>)</p>
<hr />
<p>I’m pleased to announce a new super-major release of <a href="http://hackage.haskell.org/package/dejafu">dejafu</a>, a library for testing concurrent Haskell programs.</p>
<p>While there are breaking changes, common use-cases shouldn’t be affected too significantly (or not at all). There is a brief guide to the changes, and how to migrate if necessary, <a href="https://dejafu.readthedocs.io/en/latest/migration_1x_2x.html">on the website</a>.</p>
<h2 id="whats-dejafu">What’s dejafu?</h2>
<p>dejafu is a unit-testing library for concurrent Haskell programs. Tests are deterministic, and work by systematically exploring the possible schedules of your concurrency-using test case, allowing you to confidently check your threaded code.</p>
<p><a href="http://hackage.haskell.org/package/hunit-dejafu">HUnit</a> and <a href="http://hackage.haskell.org/package/tasty-dejafu">Tasty</a> bindings are available.</p>
<p>dejafu requires your test case to be written against the <code>MonadConc</code> typeclass from the <a href="http://hackage.haskell.org/package/concurrency">concurrency</a> package. This is a necessity: dejafu cannot peek inside your <code>IO</code> or <code>STM</code> actions, so it needs to be able to plug in an alternative implementation of the concurrency primitives for testing. There is some guidance for how to switch from <code>IO</code> code to <code>MonadConc</code> code <a href="https://dejafu.readthedocs.io/en/latest/typeclass.html">on the website</a>.</p>
<p>If you really need <code>IO</code>, you can use <code>MonadIO</code> - but make sure it’s deterministic enough to not invalidate your tests!</p>
<p>Here’s a small example reproducing a deadlock found in an earlier version of the <a href="http://hackage.haskell.org/package/auto-update">auto-update</a> library:</p>
<pre><code>&gt; :{
autocheck $ do
  auto &lt;- mkAutoUpdate defaultUpdateSettings
  auto
:}
[fail] Successful
    [deadlock] S0--------S1-----------S0-
[fail] Deterministic
    [deadlock] S0--------S1-----------S0-

    () S0--------S1--------p0--</code></pre>
<p>dejafu finds the deadlock, and gives a simplified execution trace for each distinct result. More in-depth traces showing exactly what each thread did are also available. This is using a version of auto-update modified to use the <code>MonadConc</code> typeclass. The source is in the <a href="https://github.com/barrucadu/dejafu/blob/master/dejafu-tests/lib/Examples/AutoUpdate.hs">dejafu testsuite</a>.</p>
<h2 id="whats-new">What’s new?</h2>
<p>The highlights for this release are setup actions, teardown actions, and invariants:</p>
<ul>
<li><p><strong>Setup actions</strong> are for things which are not really a part of your test case, but which are needed for it (for example, setting up a test distributed system). As dejafu can run a single test case many times, repeating this work can be a significant overhead. By defining this as a setup action, dejafu can “snapshot” the state at the end of the action, and efficiently reload it in subsequent executions of the same test.</p></li>
<li><p><strong>Teardown actions</strong> are for things you want to run after your test case completes, in all cases, even if the test deadlocks (for example). As dejafu controls the concurrent execution of the test case, inspecting shared state is possible even if the test case fails to complete.</p></li>
<li><p><strong>Invariants</strong> are effect-free atomically-checked conditions over shared state which must always hold. If an invariant throws an exception, the test case is aborted, and any teardown action run.</p></li>
</ul>
<p>Here is an example of a setup action with an invariant:</p>
<pre><code>&gt; :{
autocheck $
  let setup = do
        var &lt;- newEmptyMVar
        registerInvariant $ do
          value &lt;- inspectMVar var
          when (value == Just 1) $
            throwM Overflow
        pure var
  in withSetup setup $ \var -&gt; do
       fork $ putMVar var 0
       fork $ putMVar var 1
       tryReadMVar var
:}
[fail] Successful
    [invariant failure] S0--P2-
[fail] Deterministic
    [invariant failure] S0--P2-

    Nothing S0----

    Just 0 S0--P1--S0--</code></pre>
<p>In the <code>[invariant failure]</code> case, thread 2 is scheduled, writing the forbidden value “1” to the MVar, which terminates the test.</p>
<p>Here is an example of a setup action with a teardown action:</p>
<pre><code>&gt; :{
autocheck $
  let setup = newMVar ()
      teardown var (Right _) = show &lt;$&gt; tryReadMVar var
      teardown _   (Left  e) = pure (show e)
  in withSetupAndTeardown setup teardown $ \var -&gt; do
       fork $ takeMVar var
       takeMVar var
:}
[pass] Successful
[fail] Deterministic
    &quot;Nothing&quot; S0---

    &quot;Deadlock&quot; S0-P1--S0-</code></pre>
<p>The teardown action can perform arbitrary concurrency effects, including inspecting any mutable state returned by the setup action.</p>
<p>Setup and teardown actions were previously available in a slightly different form as the <code>dontCheck</code> and <code>subconcurrency</code> functions, which have been removed (see the migration guide if you used these).</p>

      ]]>
    </summary>
  </entry>
  
  <entry>
    <title>Simplifying Execution Traces</title>
    <link href="https://memo.barrucadu.co.uk/simplifying-execution-traces.html" />
    <id>https://memo.barrucadu.co.uk/simplifying-execution-traces.html</id>
    <published>2018-03-08T00:00:00Z</published>
    <updated>2018-03-08T00:00:00Z</updated>
    <summary type="html">
      <![CDATA[
<p>It’s well known that randomly generated test failures are a poor debugging aid. That’s why every non-toy randomised property testing library (like <a href="http://hackage.haskell.org/package/hedgehog">Hedgehog</a> or <a href="https://github.com/HypothesisWorks/hypothesis-python">Hypothesis</a> or <a href="http://hackage.haskell.org/package/QuickCheck">QuickCheck</a>) puts a considerable amount of effort into shrinking failures. It’s a non-trivial problem, but it’s absolutely essential.</p>
<p>It’s also something that dejafu does not do.</p>
<h2 id="running-example">Running example</h2>
<p>I’m going to use the “stores are transitively visible” litmus test as a running example. Here it is:</p>
<pre class="haskell"><code>import qualified Control.Monad.Conc.Class as C
import           Test.DejaFu.Internal
import           Test.DejaFu.SCT
import           Test.DejaFu.SCT.Internal.DPOR
import           Test.DejaFu.Types
import           Test.DejaFu.Utils

storesAreTransitivelyVisible :: C.MonadConc m =&gt; m (Int, Int, Int)
storesAreTransitivelyVisible = do
  x &lt;- C.newCRef 0
  y &lt;- C.newCRef 0
  j1 &lt;- C.spawn (C.writeCRef x 1)
  j2 &lt;- C.spawn (do r1 &lt;- C.readCRef x; C.writeCRef x 1; pure r1)
  j3 &lt;- C.spawn (do r2 &lt;- C.readCRef y; r3 &lt;- C.readCRef x; pure (r2,r3))
  (\() r1 (r2,r3) -&gt; (r1,r2,r3)) &lt;$&gt; C.readMVar j1 &lt;*&gt; C.readMVar j2 &lt;*&gt; C.readMVar j3</code></pre>
<p>I picked this one because it’s kind of arbitrarily complex. It’s a small test, but it’s for the relaxed memory implementation, so there’s a lot going on. It’s a fairly dense test.</p>
<p>I’m now going to define a metric of trace complexity which I’ll justify in a moment:</p>
<pre class="haskell"><code>complexity :: Trace -&gt; (Int, Int, Int, Int)
complexity = foldr go (0,0,0,0) where
  go (SwitchTo _, _, CommitCRef _ _) (w, x, y, z) = (w+1, x+1, y,   z)
  go (Start    _, _, CommitCRef _ _) (w, x, y, z) = (w+1, x,   y+1, z)
  go (Continue,   _, CommitCRef _ _) (w, x, y, z) = (w+1, x,   y,   z+1)
  go (SwitchTo _, _, _)              (w, x, y, z) = (w,   x+1, y,   z)
  go (Start    _, _, _)              (w, x, y, z) = (w,   x,   y+1, z)
  go (Continue,   _, _)              (w, x, y, z) = (w,   x,   y,   z+1)</code></pre>
<p>Using the <code>183-shrinking</code> branch, we can now get the first trace for every distinct result, along with its complexity:</p>
<pre class="haskell"><code>results :: Way -&gt; MemType -&gt; IO ()
results way memtype = do
  let settings = set lequality (Just (==))
               $ fromWayAndMemType way memtype
  res &lt;- runSCTWithSettings settings storesAreTransitivelyVisible
  flip mapM_ res $ \(efa, trace) -&gt;
    putStrLn (show efa ++ &quot;\t&quot; ++ showTrace trace ++ &quot;\t&quot; ++ show (complexity trace))</code></pre>
<p>Here are the results for systematic testing:</p>
<pre><code>λ&gt; results (systematically defaultBounds) SequentialConsistency
Right (1,0,1)   S0------------S1---S0--S2-----S0--S3-----S0--   (0,0,7,24)
Right (0,0,1)   S0------------S2-----S1---S0---S3-----S0--      (0,0,6,24)
Right (0,0,0)   S0------------S2-P3-----S1---S0--S2----S0---    (0,1,6,23)
Right (1,0,0)   S0------------S3-----S1---S0--S2-----S0---      (0,0,6,24)

λ&gt; results (systematically defaultBounds) TotalStoreOrder
Right (1,0,1)   S0------------S1---S0--S2-----S0--S3-----S0--   (0,0,7,24)
Right (0,0,1)   S0------------S1-P2-----S1--S0---S3-----S0--    (0,1,6,23)
Right (0,0,0)   S0------------S1-P2---P3-----S1--S0--S2--S0---  (0,2,6,22)
Right (1,0,0)   S0------------S1-P3-----S1--S0--S2-----S0---    (0,1,6,23)

λ&gt; results (systematically defaultBounds) PartialStoreOrder
Right (1,0,1)   S0------------S1---S0--S2-----S0--S3-----S0--   (0,0,7,24)
Right (0,0,1)   S0------------S1-P2-----S1--S0---S3-----S0--    (0,1,6,23)
Right (0,0,0)   S0------------S1-P2---P3-----S1--S0--S2--S0---  (0,2,6,22)
Right (1,0,0)   S0------------S1-P3-----S1--S0--S2-----S0---    (0,1,6,23)</code></pre>
<p>Pretty messy, right? Here’s the results for <em>random</em> testing:</p>
<pre><code>λ&gt; results (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)   S0-----P1-P0----P2-P1-P0-P3-P1-S2-P3--P0-P3-P0-P3-S2-P0-S2-P0--P2-S0-   (0,15,5,9)
Right (0,0,1)   S0-------P2-P1-P2-P0--P2-P0-P1-P0---S2-P3-P0-P2-S3---P1-S3-S0--         (0,12,5,12)
Right (1,0,0)   S0------------S3-----S1-P2-P1-P0--S2---P1-S0---                         (0,4,5,20)
Right (0,0,0)   S0---------P2-P0--P3-P0-S3--P2-P3-P2--P3-S2-S1--P0----                  (0,9,4,15)

λ&gt; results (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)   S0-----P1--P0-P1-S0-P2--C-S0---P2-P3-P2--S3-P0-P3-P0---S3-P0-P3-S0-     (1,13,6,11)
Right (0,0,1)   S0----P1-P0-----P2--P0--P2-P0-S2--S3-P1-P0---S1-S3----S0--              (0,8,6,16)
Right (0,0,0)   S0--------P2-P0--P3-P2-P0-P3-P2-C-S0-S3---S2--S1-C-S1-P0----            (2,10,6,14)

λ&gt; results (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)   S0-----P1--P0-P1-S0-P2--C-S0---P2-P3-P2--S3-P0-P3-P0---S3-P0-P3-S0-     (1,13,6,11)
Right (0,0,1)   S0----P1-P0-----P2--P0--P2-P0-S2--S3-P1-P0---S1-S3----S0--              (0,8,6,16)
Right (0,0,0)   S0--------P2-P0--P3-P2-P0-P3-P2-C-S0-S3---S2--S1-C-S1-P0----            (2,10,6,14)</code></pre>
<p>Yikes!</p>
<p>The complexity metric I defined counts four things:</p>
<ol type="1">
<li>The number of relaxed-memory commit actions</li>
<li>The number of pre-emptive context switches</li>
<li>The number of non-pre-emptive context switches</li>
<li>The number of continues</li>
</ol>
<p>I would much rather read a long trace where the only context switches are when threads block, than a short one which is rapidly jumping between threads. So, given two equivalent traces, I will always prefer the one with a lexicographically smaller complexity-tuple.</p>
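<p>As a minimal sketch of that preference: Haskell’s <code>Ord</code> instance for tuples already compares lexicographically, so picking the preferred trace is just taking a minimum. The names <code>Complexity</code> and <code>simplest</code> are made up for this illustration and are not part of dejafu; the two traces are the sequentially consistent <code>(0,0,1)</code> results from the systematic and random runs above.</p>
<pre class="haskell"><code>import Data.List (minimumBy)
import Data.Ord  (comparing)

-- (commits, pre-emptive switches, non-pre-emptive switches, continues)
-- Hypothetical alias, only for this sketch.
type Complexity = (Int, Int, Int, Int)

-- Pick the candidate with the lexicographically smallest complexity.
-- Assumes a non-empty list: minimumBy is partial.
simplest :: [(String, Complexity)] -&gt; (String, Complexity)
simplest = minimumBy (comparing snd)

main :: IO ()
main = do
  let systematic = (&quot;S0------------S2-----S1---S0---S3-----S0--&quot;, (0,0,6,24))
      random     = (&quot;S0-------P2-P1-P2-P0--P2-P0-P1-P0---S2-P3-P0-P2-S3---P1-S3-S0--&quot;, (0,12,5,12))
  -- Commits tie at 0, so the pre-emption count decides: 0 &lt; 12.
  print (compare (snd systematic) (snd random))   -- LT
  putStrLn (fst (simplest [random, systematic]))  -- the systematic trace</code></pre>
<p>Note that the systematic trace wins even though it has twice as many continues, because earlier components of the tuple always take priority.</p>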
<h2 id="trace-simplification">Trace simplification</h2>
<p>The key idea underpinning trace simplification is that dejafu can tell when two scheduling decisions can be swapped without changing the behaviour of the program. I talked about this idea in the <a href="hedgehog-dejafu.html">Using Hedgehog to Test Déjà Fu</a> memo. So we can implement transformations which are guaranteed to preserve semantics <em>without needing to verify this by re-running the test case</em>.</p>
<p>Although we don’t need to re-run the test case at all, the <code>183-shrinking</code> branch currently does, but only once at the end after the minimum has been found. This is because it’s easier to generate a simpler sequence of scheduling decisions and use dejafu to produce the corresponding trace than it is to produce a simpler trace directly. This is still strictly better than a typical shrinking algorithm, which would re-run the test case after <em>each</em> shrinking step, rather than only at the end.</p>
<p>Rather than drag this out, here’s what those random traces simplify to:</p>
<pre class="haskell"><code>resultsS :: Way -&gt; MemType -&gt; IO ()
resultsS way memtype = do
  let settings = set lsimplify True
               . set lequality (Just (==))
               $ fromWayAndMemType way memtype
  res &lt;- runSCTWithSettings settings storesAreTransitivelyVisible
  flip mapM_ res $ \(efa, trace) -&gt;
    putStrLn (show efa ++ &quot;\t&quot; ++ showTrace trace ++ &quot;\t&quot; ++ show (complexity trace))</code></pre>
<pre><code>λ&gt; resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)   S0----------P1---S2--P3-----S0---S2---S0---     (0,2,5,22)
Right (0,0,1)   S0----------P2-P1-P2-P1--S0---S2---S3-----S0--- (0,4,5,20)
Right (1,0,0)   S0------------S3-----S1---S0--S2----P0---       (0,1,5,23)
Right (0,0,0)   S0------------S3--P2-----S3---S1--P0----        (0,2,4,22)

λ&gt; resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)   S0----------P1---S2-----S0----S3-----S0--       (0,1,5,23)
Right (0,0,1)   S0----------P1-P2-----S0--S1--S0---S3-----S0--  (0,2,6,22)
Right (0,0,0)   S0----------P2--P3-----S0--S2---S1--P0----      (0,3,4,21)

λ&gt; resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)   S0----------P1---S2-----S0----S3-----S0--       (0,1,5,23)
Right (0,0,1)   S0----------P1-P2-----S0--S1--S0---S3-----S0--  (0,2,6,22)
Right (0,0,0)   S0----------P2--P3-----S0--S2---S1--P0----      (0,3,4,21)</code></pre>
<p>This is much better.</p>
<p>There are two simplification phases: a preparation phase, which puts the trace into a normal form and prunes unnecessary commits; and an iteration phase, which repeats a step function until a fixed point is reached (or the iteration limit is).</p>
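<p>The “repeat until a fixed point or the iteration limit” loop can be sketched generically. This is an illustration only: <code>fixpointWithLimit</code> is a hypothetical helper, not the function used on the <code>183-shrinking</code> branch, and integer halving stands in for a trace transformation.</p>
<pre class="haskell"><code>-- Repeat a step function until the value stops changing or the
-- iteration limit is used up, whichever comes first.
-- Hypothetical helper, not dejafu&#39;s own implementation.
fixpointWithLimit :: Eq a =&gt; Int -&gt; (a -&gt; a) -&gt; a -&gt; a
fixpointWithLimit limit step = go limit where
  go n x
    | n &lt;= 0    = x            -- limit reached: give up and return what we have
    | x&#39; == x   = x            -- fixed point reached
    | otherwise = go (n - 1) x&#39;
    where x&#39; = step x

main :: IO ()
main = do
  -- 40, 20, 10, 5, 2, 1, 0, 0: the fixed point is 0
  print (fixpointWithLimit 100 (`div` 2) (40 :: Int))  -- 0
  -- only three steps are allowed: 40, 20, 10, 5
  print (fixpointWithLimit 3   (`div` 2) (40 :: Int))  -- 5</code></pre>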
<h3 id="preparation">Preparation</h3>
<p>The preparation phase has two steps: first we put the trace into <em>lexicographic normal form</em>, then we prune unnecessary commits.</p>
<p>We put a trace in lexicographic normal form by sorting by thread ID, where only independent actions can be swapped:</p>
<pre class="haskell"><code>lexicoNormalForm :: MemType -&gt; [(ThreadId, ThreadAction)] -&gt; [(ThreadId, ThreadAction)]
lexicoNormalForm memtype = go where
  go trc =
    let trc&#39; = bubble initialDepState trc
    in if trc == trc&#39; then trc else go trc&#39;

  bubble ds (t1@(tid1, ta1):t2@(tid2, ta2):trc)
    | independent ds tid1 ta1 tid2 ta2 &amp;&amp; tid2 &lt; tid1 = bgo ds t2 (t1 : trc)
    | otherwise = bgo ds t1 (t2 : trc)
  bubble _ trc = trc

  bgo ds t@(tid, ta) trc = t : bubble (updateDepState memtype ds tid ta) trc</code></pre>
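<p>To play with the idea without dejafu’s internals, here is a self-contained toy version. Everything in it is an assumption made for illustration: actions are just reads and writes of named variables, the <code>independent</code> relation is a deliberately crude stand-in that needs no dependency state, and <code>normalise</code> is a made-up name.</p>
<pre class="haskell"><code>-- Toy model: an action reads or writes a named variable.
data Action = Read String | Write String deriving (Eq, Show)

type Step = (Int, Action)  -- (thread id, action)

-- Assumed independence relation for the toy model: steps of different
-- threads commute if they touch different variables or are both reads.
independent :: Step -&gt; Step -&gt; Bool
independent (t1, a1) (t2, a2) =
    t1 /= t2 &amp;&amp; (var a1 /= var a2 || (isRead a1 &amp;&amp; isRead a2))
  where
    var (Read  v) = v
    var (Write v) = v
    isRead (Read _) = True
    isRead _        = False

-- One bubble-sort pass: swap adjacent independent steps so that the
-- lower thread ID comes first.
bubble :: [Step] -&gt; [Step]
bubble (s1:s2:rest)
  | independent s1 s2 &amp;&amp; fst s2 &lt; fst s1 = s2 : bubble (s1 : rest)
  | otherwise = s1 : bubble (s2 : rest)
bubble trc = trc

-- Repeat passes until nothing changes.
normalise :: [Step] -&gt; [Step]
normalise trc =
  let trc&#39; = bubble trc
  in if trc&#39; == trc then trc else normalise trc&#39;

main :: IO ()
main = print (normalise [(2, Write &quot;y&quot;), (1, Write &quot;x&quot;), (2, Read &quot;x&quot;), (1, Read &quot;y&quot;)])
-- [(1,Write &quot;x&quot;),(2,Write &quot;y&quot;),(1,Read &quot;y&quot;),(2,Read &quot;x&quot;)]</code></pre>
<p>Thread 1’s steps move earlier where they can, but the write to <code>y</code> still comes before the read of <code>y</code>, and the write to <code>x</code> before the read of <code>x</code>, because dependent steps are never swapped.</p>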
<p>If simplification only put traces into lexicographic normal form, we would get these results:</p>
<pre><code>λ&gt; resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)   S0-----------P1---S2--P0--S2--P0-P3----P0--             (0,5,3,19)
Right (0,0,1)   S0-----------P2-P1-P2-P1-P0--S2--P0-P1-S2-S3----P0--    (0,8,4,16)
Right (1,0,0)   S0------------S3----P1--P0--S1-S2----P0---              (0,3,4,21)
Right (0,0,0)   S0------------S2-P3--P2----S3--P1--P0----               (0,4,3,20)

λ&gt; resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)   S0-------P1---S2--C-S0-----P2--P0--S2-S3----P0--        (1,5,5,19)
Right (0,0,1)   S0-----------P1-P2--P0-S1-P0-P2--P0--S1-S2-S3----P0--   (0,7,5,17)
Right (0,0,0)   S0-----------P2---P3--C-S0-S2--S3--P1-C-S1-P0----       (2,6,5,18)

λ&gt; resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)   S0-------P1---S2--C-S0-----P2--P0--S2-S3----P0--        (1,5,5,19)
Right (0,0,1)   S0-----------P1-P2--P0-S1-P0-P2--P0--S1-S2-S3----P0--   (0,7,5,17)
Right (0,0,0)   S0-----------P2---P3--C-S0-S2--S3--P1-C-S1-P0----       (2,6,5,18)</code></pre>
<p>These are better than they were, but we can do better still.</p>
<p>After putting the trace into lexicographic normal form, we delete any commit actions which are followed by any number of independent actions and then a memory barrier:</p>
<pre class="haskell"><code>dropCommits :: MemType -&gt; [(ThreadId, ThreadAction)] -&gt; [(ThreadId, ThreadAction)]
dropCommits SequentialConsistency = id
dropCommits memtype = go initialDepState where
  go ds (t1@(tid1, ta1@(CommitCRef _ _)):t2@(tid2, ta2):trc)
    | isBarrier (simplifyAction ta2) = go ds (t2:trc)
    | independent ds tid1 ta1 tid2 ta2 = t2 : go (updateDepState memtype ds tid2 ta2) (t1:trc)
  go ds (t@(tid,ta):trc) = t : go (updateDepState memtype ds tid ta) trc
  go _ [] = []</code></pre>
<p>Such commits don’t affect the behaviour of the program at all, as all buffered writes get flushed when the memory barrier happens.</p>
<p>If simplification only did the preparation phase, we would get these results:</p>
<pre><code>λ&gt; resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)   S0-----------P1---S2--P0--S2--P0-P3----P0--             (0,5,3,19)
Right (0,0,1)   S0-----------P2-P1-P2-P1-P0--S2--P0-P1-S2-S3----P0--    (0,8,4,16)
Right (1,0,0)   S0------------S3----P1--P0--S1-S2----P0---              (0,3,4,21)
Right (0,0,0)   S0------------S2-P3--P2----S3--P1--P0----               (0,4,3,20)

λ&gt; resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)   S0-------P1---S2--P0-----P2--P0--S2-S3----P0--          (0,5,4,19)
     ^-- better than just lexicoNormalForm
Right (0,0,1)   S0-----------P1-P2--P0-S1-P0-P2--P0--S1-S2-S3----P0--   (0,7,5,17)
Right (0,0,0)   S0-----------P2---P3--P0-S2--S3--P1--P0----             (0,5,3,19)
     ^-- better than just lexicoNormalForm

λ&gt; resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)   S0-------P1---S2--P0-----P2--P0--S2-S3----P0--          (0,5,4,19)
     ^-- better than just lexicoNormalForm
Right (0,0,1)   S0-----------P1-P2--P0-S1-P0-P2--P0--S1-S2-S3----P0--   (0,7,5,17)
Right (0,0,0)   S0-----------P2---P3--P0-S2--S3--P1--P0----             (0,5,3,19)
     ^-- better than just lexicoNormalForm</code></pre>
<h3 id="iteration">Iteration</h3>
<p>The iteration phase attempts to reduce context switching by pushing actions forwards, or pulling them backwards, through the trace.</p>
<p>If we have the trace <code>[(tid1, act1), (tid2, act2), (tid1, act3)]</code>, where <code>act2</code> and <code>act3</code> are independent, the “pull back” transformation would re-order that to <code>[(tid1, act1), (tid1, act3), (tid2, act2)]</code>.</p>
<p>In contrast, if <code>act1</code> and <code>act2</code> were independent, the “push forward” transformation would re-order that to <code>[(tid2, act2), (tid1, act1), (tid1, act3)]</code>. The two transformations are almost, but not quite opposites.</p>
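<p>The three-step example can be run directly in a toy setting. This sketch only handles that exact three-element pattern, with action labels as strings and the independence relation passed in as a plain function; <code>pullBackOnce</code>, <code>pushForwardOnce</code>, <code>onlyPair</code>, and <code>switches</code> are hypothetical names for this illustration, not dejafu functions.</p>
<pre class="haskell"><code>type Step = (Int, String)  -- (thread id, action label)

-- &quot;Pull back&quot;: move act3 before the other thread&#39;s act2.
pullBackOnce :: (String -&gt; String -&gt; Bool) -&gt; [Step] -&gt; [Step]
pullBackOnce indep [s1@(t1, _), s2@(t2, a2), s3@(t3, a3)]
  | t1 == t3 &amp;&amp; t1 /= t2 &amp;&amp; indep a2 a3 = [s1, s3, s2]
pullBackOnce _ trc = trc

-- &quot;Push forward&quot;: move act1 after the other thread&#39;s act2.
pushForwardOnce :: (String -&gt; String -&gt; Bool) -&gt; [Step] -&gt; [Step]
pushForwardOnce indep [s1@(t1, a1), s2@(t2, a2), s3@(t3, _)]
  | t1 == t3 &amp;&amp; t1 /= t2 &amp;&amp; indep a1 a2 = [s2, s1, s3]
pushForwardOnce _ trc = trc

-- An independence relation in which exactly one pair of labels commutes.
onlyPair :: (String, String) -&gt; String -&gt; String -&gt; Bool
onlyPair (p, q) a b = (a, b) == (p, q) || (a, b) == (q, p)

-- Number of context switches: adjacent steps run by different threads.
switches :: [Step] -&gt; Int
switches trc = length (filter id (zipWith (\a b -&gt; fst a /= fst b) trc (drop 1 trc)))

main :: IO ()
main = do
  let trace = [(1, &quot;act1&quot;), (2, &quot;act2&quot;), (1, &quot;act3&quot;)]
      back  = pullBackOnce    (onlyPair (&quot;act2&quot;, &quot;act3&quot;)) trace
      fwd   = pushForwardOnce (onlyPair (&quot;act1&quot;, &quot;act2&quot;)) trace
  print back  -- [(1,&quot;act1&quot;),(1,&quot;act3&quot;),(2,&quot;act2&quot;)]
  print fwd   -- [(2,&quot;act2&quot;),(1,&quot;act1&quot;),(1,&quot;act3&quot;)]
  print (switches trace, switches back, switches fwd)  -- (2,1,1)</code></pre>
<p>Both transformations remove one context switch here, but they need different pairs of actions to be independent, which is why one can apply where the other cannot.</p>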
<p>Pull-back walks through the trace and, at every context switch, looks forward to see if there is a single action of the original thread it can put before the context switch:</p>
<pre class="haskell"><code>pullBack :: MemType -&gt; [(ThreadId, ThreadAction)] -&gt; [(ThreadId, ThreadAction)]
pullBack memtype = go initialDepState where
  go ds (t1@(tid1, ta1):trc@((tid2, _):_)) =
    let ds&#39; = updateDepState memtype ds tid1 ta1
        trc&#39; = if tid1 /= tid2
               then maybe trc (uncurry (:)) (findAction tid1 ds&#39; trc)
               else trc
    in t1 : go ds&#39; trc&#39;
  go _ trc = trc

  findAction tid0 = fgo where
    fgo ds (t@(tid, ta):trc)
      | tid == tid0 = Just (t, trc)
      | otherwise = case fgo (updateDepState memtype ds tid ta) trc of
          Just (ft@(ftid, fa), trc&#39;)
            | independent ds tid ta ftid fa -&gt; Just (ft, t:trc&#39;)
          _ -&gt; Nothing
    fgo _ _ = Nothing</code></pre>
<p>Push-forward walks through the trace and, at every context switch, looks forward to see if the last action of the original thread can be put at its next execution:</p>
<pre class="haskell"><code>pushForward :: MemType -&gt; [(ThreadId, ThreadAction)] -&gt; [(ThreadId, ThreadAction)]
pushForward memtype = go initialDepState where
  go ds (t1@(tid1, ta1):trc@((tid2, _):_)) =
    let ds&#39; = updateDepState memtype ds tid1 ta1
    in if tid1 /= tid2
       then maybe (t1 : go ds&#39; trc) (go ds) (findAction tid1 ta1 ds trc)
       else t1 : go ds&#39; trc
  go _ trc = trc

  findAction tid0 ta0 = fgo where
    fgo ds (t@(tid, ta):trc)
      | tid == tid0 = Just ((tid0, ta0) : t : trc)
      | independent ds tid0 ta0 tid ta = (t:) &lt;$&gt; fgo (updateDepState memtype ds tid ta) trc
      | otherwise = Nothing
    fgo _ _ = Nothing</code></pre>
<p>The iteration process just repeats <code>pushForward memtype . pullBack memtype</code>.</p>
<p>If it only used <code>pullBack</code>, we would get these results:</p>
<pre><code>λ&gt; resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)   S0-----------P1---S2---P0--S2--S0-P3-----S0--           (0,3,5,21)
Right (0,0,1)   S0-----------P2-P1-P2--P1--S0--S2--S0-P3-----S0--       (0,5,5,19)
Right (1,0,0)   S0------------S3-----S1---S0--S2----P0---               (0,1,5,23)
Right (0,0,0)   S0------------S2-P3---P2----S3--S1--P0----              (0,3,4,21)

λ&gt; resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)   S0-----------P1---S2-----S0---S3-----S0--               (0,1,5,23)
Right (0,0,1)   S0-----------P1-P2-----S0-S1--S0---S3-----S0--          (0,2,6,22)
Right (0,0,0)   S0-----------P2---P3-----S0-S2--S1--P0----              (0,3,4,21)

λ&gt; resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)   S0-----------P1---S2-----S0---S3-----S0--               (0,1,5,23)
Right (0,0,1)   S0-----------P1-P2-----S0-S1--S0---S3-----S0--          (0,2,6,22)
Right (0,0,0)   S0-----------P2---P3-----S0-S2--S1--P0----              (0,3,4,21)</code></pre>
<p>Without exception, iterating <code>pullBack</code> is an improvement over just doing preparation.</p>
<p>If it only used <code>pushForward</code>, we would get these results:</p>
<pre><code>λ&gt; resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)   S0-------P1---S2--P0------S2--P3----P0---               (0,4,3,20)
Right (0,0,1)   S0-------P2-P1-P2-P1-P0------S1-S2---S3----P0---        (0,6,4,18)
Right (1,0,0)   S0------------S3----P1--P0--S1-S2----P0---              (0,3,4,21)
     ^-- no improvement over preparation
Right (0,0,0)   S0------------S3--P2-----S3--P1--P0----                 (0,3,3,21)

λ&gt; resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)   S0----P1---S0---P2----P0-------S2-S3----P0--            (0,4,4,20)
Right (0,0,1)   S0-------P1-P2--P0-----S1-P2--P0---S1-S2-S3----P0--     (0,6,5,18)
Right (0,0,0)   S0----------P2--P3--P0--S2---S3--P1--P0----             (0,5,3,19)
     ^-- no improvement over preparation

λ&gt; resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)   S0----P1---S0---P2----P0-------S2-S3----P0--            (0,4,4,20)
Right (0,0,1)   S0-------P1-P2--P0-----S1-P2--P0---S1-S2-S3----P0--     (0,6,5,18)
Right (0,0,0)   S0----------P2--P3--P0--S2---S3--P1--P0----             (0,5,3,19)
     ^-- no improvement over preparation</code></pre>
<p>With three exceptions, where the traces didn’t change, iterating <code>pushForward</code> is an improvement over just doing preparation.</p>
<p>We’ve already seen the results if we combine them:</p>
<pre><code>λ&gt; resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)   S0----------P1---S2--P3-----S0---S2---S0---     (0,2,5,22)
Right (0,0,1)   S0----------P2-P1-P2-P1--S0---S2---S3-----S0--- (0,4,5,20)
Right (1,0,0)   S0------------S3-----S1---S0--S2----P0---       (0,1,5,23)
     ^-- same as pullBack, which is better than pushForward
Right (0,0,0)   S0------------S3--P2-----S3---S1--P0----        (0,2,4,22)

λ&gt; resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)   S0----------P1---S2-----S0----S3-----S0--       (0,1,5,23)
     ^-- same as pullBack, which is better than pushForward
Right (0,0,1)   S0----------P1-P2-----S0--S1--S0---S3-----S0--  (0,2,6,22)
     ^-- same as pullBack, which is better than pushForward
Right (0,0,0)   S0----------P2--P3-----S0--S2---S1--P0----      (0,3,4,21)

λ&gt; resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)   S0----------P1---S2-----S0----S3-----S0--       (0,1,5,23)
     ^-- same as pullBack, which is better than pushForward
Right (0,0,1)   S0----------P1-P2-----S0--S1--S0---S3-----S0--  (0,2,6,22)
     ^-- same as pullBack, which is better than pushForward
Right (0,0,0)   S0----------P2--P3-----S0--S2---S1--P0----      (0,3,4,21)</code></pre>
<h2 id="next-steps">Next steps</h2>
<p>I think what I have right now is pretty good. It’s definitely a vast improvement over not doing any simplification.</p>
<p><em>But</em>, no random traces get simplified to the corresponding systematic traces, which is a little disappointing. I think that’s because the current passes just try to reduce context switches of any form, whereas really I want to reduce pre-emptive context switches more than non-pre-emptive ones.</p>

      ]]>
    </summary>
  </entry>
  
  <entry>
    <title>Using Hedgehog to Test Déjà Fu</title>
    <link href="https://memo.barrucadu.co.uk/hedgehog-dejafu.html" />
    <id>https://memo.barrucadu.co.uk/hedgehog-dejafu.html</id>
    <published>2018-02-11T00:00:00Z</published>
    <updated>2018-02-11T00:00:00Z</updated>
    <summary type="html">
      <![CDATA[
<p>Déjà Fu is a concurrency testing library, and one thing you definitely <em>don’t</em> want to do when testing concurrent programs is to try every possible interleaving of threads.</p>
<p>Trying every possible interleaving will give you, in general, an exponential blow-up of the executions you need to perform as your test case grows in size. The core testing algorithm we use, a variant of dynamic partial-order reduction (DPOR)<a href="hedgehog-dejafu.html#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a>, attempts to reduce this blow-up. DPOR identifies actions which are <em>dependent</em>, and only tries interleavings which permute dependent actions.</p>
<p>Here are some examples:</p>
<ul>
<li><p>It doesn’t matter which order two threads execute <code>readMVar</code>, for the same <code>MVar</code>. These actions are <em>independent</em>.</p></li>
<li><p>It does matter which order two threads execute <code>putMVar</code>, for the same <code>MVar</code>. These actions are <em>dependent</em>.</p></li>
<li><p>It doesn’t matter which order two threads execute <code>putMVar</code> for different <code>MVar</code>s. These actions are <em>independent</em>.</p></li>
</ul>
<p>Two actions are dependent if the order in which they are performed matters.</p>
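<p>As a rough illustration of those three examples, here is a minimal sketch of a dependency relation over a toy action type. The <code>ToyAction</code> type and the <code>toyDependent</code> function are made up for this example and are not Déjà Fu’s real ones. Treating a read and a put of the same <code>MVar</code> as dependent is also an assumption of the sketch.</p>
<pre class="haskell"><code>-- A toy model (hypothetical, not dejafu&#39;s types): each action names
-- the MVar it touches by an integer id.
data ToyAction
  = ReadMVar Int  -- read the MVar with this id
  | PutMVar Int   -- put into the MVar with this id
  deriving (Eq, Show)

-- | Two actions are dependent if swapping them could change the outcome.
toyDependent :: ToyAction -&gt; ToyAction -&gt; Bool
toyDependent (ReadMVar _) (ReadMVar _) = False
toyDependent (PutMVar v1) (PutMVar v2) = v1 == v2
toyDependent (PutMVar v1) (ReadMVar v2) = v1 == v2  -- assumption of this sketch
toyDependent (ReadMVar v1) (PutMVar v2) = v1 == v2  -- assumption of this sketch

main :: IO ()
main = do
  print (toyDependent (ReadMVar 0) (ReadMVar 0))  -- two reads of one MVar: False
  print (toyDependent (PutMVar 0) (PutMVar 0))    -- two puts to one MVar: True
  print (toyDependent (PutMVar 0) (PutMVar 1))    -- puts to different MVars: False</code></pre>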
<p>So the intuition behind DPOR is that most actions in a concurrent program are <em>independent</em>. DPOR won’t help you much if you have a single piece of shared state which every thread is hitting, but most concurrent programs aren’t like that. The worst case is still a terrible exponential blow-up, but the average case is much better.</p>
<p>The dependency relation is a <em>core</em> part of Déjà Fu today. It has impacts on both performance and correctness. If it says two actions are dependent when they are not, then we may see unnecessary interleavings tried. If it says two actions are not dependent when they really are, then we may miss necessary interleavings.</p>
<p>Being such an important component, it must be well-tested, right? Well, sort of. The Déjà Fu testsuite mostly consists of small concurrent programs together with a list of expected outputs, testing that Déjà Fu finds all the nondeterminism in the program. This does exercise the dependency relation, but only very indirectly.</p>
<h2 id="the-idea">The Idea</h2>
<p>There things would have remained had I not experienced one of those coincidence-driven flashes of insight:</p>
<ul>
<li><p><a href="https://github.com/aherrmann">aherrmann</a> opened an <a href="https://github.com/barrucadu/dejafu/issues/181">issue on GitHub</a> asking how to take an execution trace and replay it.</p></li>
<li><p><a href="https://www.reddit.com/user/agnishom">agnishom</a> posted a <a href="https://www.reddit.com/r/algorithms/comments/7vo0el/checking_equivalence_of_trace_elements/">thread on /r/algorithms</a> asking how to check the equivalence of traces where only some elements commute.</p></li>
</ul>
<p>I had my idea. I can <em>directly</em> test the dependency relation like so:</p>
<ol type="1">
<li>Execute a concurrent program.</li>
<li>Normalise its execution trace in some way.</li>
<li>“Replay” the normalised trace.</li>
<li>Assert that the result is the same.</li>
</ol>
<h2 id="normalising-traces">Normalising Traces</h2>
<p>So, what is a good normal form for a trace? I tried out a few approaches here, but there was one I kept coming back to: we should shuffle around independent actions to keep the program on the main thread for as long as possible.</p>
<p>There are two reasons I think this works well. (1) The traces we get will be easier for a human to read, as the program will stay on its main thread and only execute another thread where necessary. (2) A Haskell program terminates when the main thread terminates, so by executing the main thread as much as possible, we may find that some actions don’t need to be executed at all.</p>
<p>So firstly we need to know when two actions commute. Let’s just use the dependency relation for that:</p>
<pre class="haskell"><code>-- | Check if two actions commute.
independent
  :: DepState
  -&gt; (ThreadId, ThreadAction)
  -&gt; (ThreadId, ThreadAction)
  -&gt; Bool
independent ds (tid1, ta1) (tid2, ta2) = not (dependent ds tid1 ta1 tid2 ta2)</code></pre>
<p>The <code>DepState</code> parameter tracks information about the history of the execution, allowing us to make better decisions. For example, while in general it matters in which order two <code>putMVar</code>s to the same <code>MVar</code> happen, it <em>doesn’t</em> matter if the <code>MVar</code> is already full, as both actions will block without achieving anything.</p>
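<p>To make the full-<code>MVar</code> example concrete, here is a small sketch of a history-sensitive dependency check. <code>ToyDepState</code>, <code>updateToyDepState</code>, and <code>toyDependent</code> are hypothetical stand-ins which only track which <code>MVar</code>s are full; the real <code>DepState</code> tracks more than this.</p>
<pre class="haskell"><code>import qualified Data.Set as Set

-- Hypothetical stand-in for DepState: the ids of MVars known to be full.
newtype ToyDepState = ToyDepState (Set.Set Int)

data ToyAction = PutMVar Int | TakeMVar Int
  deriving (Eq, Show)

-- | Record the effect of an action which has just been executed.
updateToyDepState :: ToyDepState -&gt; ToyAction -&gt; ToyDepState
updateToyDepState (ToyDepState full) (PutMVar v) = ToyDepState (Set.insert v full)
updateToyDepState (ToyDepState full) (TakeMVar v) = ToyDepState (Set.delete v full)

-- | Two puts to the same MVar only conflict while the MVar is empty: if it
-- is full, both block and neither achieves anything.  Every other pair is
-- treated as dependent exactly when it touches the same MVar.
toyDependent :: ToyDepState -&gt; ToyAction -&gt; ToyAction -&gt; Bool
toyDependent (ToyDepState full) (PutMVar v1) (PutMVar v2) =
  v1 == v2 &amp;&amp; not (Set.member v1 full)
toyDependent _ a1 a2 = mvarOf a1 == mvarOf a2
 where
  mvarOf (PutMVar v) = v
  mvarOf (TakeMVar v) = v

main :: IO ()
main = do
  let emptyState = ToyDepState Set.empty
      fullState = updateToyDepState emptyState (PutMVar 0)
  print (toyDependent emptyState (PutMVar 0) (PutMVar 0))  -- True
  print (toyDependent fullState (PutMVar 0) (PutMVar 0))   -- False</code></pre>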
<p>The approach works well in practice, but has been the source of <em>so many</em> off-by-one errors. Even while writing this memo!</p>
<p>So now onto trace normalisation. The easiest way to do it is bubble sort, but with an additional constraint on when we can swap things:</p>
<ol type="1">
<li>For every adjacent pair of items <code>x</code> and <code>y</code> in the trace:
<ol type="1">
<li>If <code>x</code> and <code>y</code> commute and <code>thread_id y &lt; thread_id x</code>:
<ol type="1">
<li>Swap <code>x</code> and <code>y</code>.</li>
</ol></li>
<li>Update the <code>DepState</code> and continue to the next pair.</li>
</ol></li>
<li>Repeat until there are no more changes.</li>
</ol>
<p>And here’s the code:</p>
<pre class="haskell"><code>-- | Rewrite a trace into a canonical form.
normalise
  :: [(ThreadId, ThreadAction)]
  -&gt; [(ThreadId, ThreadAction)]
normalise trc0 = if changed then normalise trc&#39; else trc&#39;
 where
  (changed, trc&#39;) = bubble initialDepState False trc0

  bubble ds flag ((x@(tid1, _)):(y@(tid2, _)):trc)
    | independent ds x y &amp;&amp; tid2 &lt; tid1 = go ds True y (x : trc)
    | otherwise = go ds flag x (y : trc)
  bubble _ flag trc = (flag, trc)

  go ds flag t@(tid, ta) trc =
    second (t :) (bubble (updateDepState ds tid ta) flag trc)</code></pre>
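<p>To play with the bubbling pass without any Déjà Fu internals, here is a self-contained sketch of the same idea. It is a simplification: the <code>DepState</code> is dropped entirely, and <code>toyIndependent</code> is a made-up rule saying that two steps commute if they come from different threads and touch different variables.</p>
<pre class="haskell"><code>import Data.Bifunctor (second)

-- Toy trace entry: (thread id, name of the variable the action touches).
type Step = (Int, Char)

-- | Hypothetical independence rule: different threads, different variables.
toyIndependent :: Step -&gt; Step -&gt; Bool
toyIndependent (tid1, v1) (tid2, v2) = tid1 /= tid2 &amp;&amp; v1 /= v2

-- | Repeat the bubbling pass until a pass makes no swaps.
toyNormalise :: [Step] -&gt; [Step]
toyNormalise trc0 = if changed then toyNormalise trc&#39; else trc&#39;
 where
  (changed, trc&#39;) = bubble False trc0

  -- Swap an adjacent pair if it commutes and the swap moves the
  -- lower-numbered thread earlier; remember whether anything moved.
  bubble flag (x@(tid1, _) : y@(tid2, _) : trc)
    | toyIndependent x y &amp;&amp; tid2 &lt; tid1 = second (y :) (bubble True (x : trc))
    | otherwise = second (x :) (bubble flag (y : trc))
  bubble flag trc = (flag, trc)

main :: IO ()
main = print (toyNormalise [(0,&#39;a&#39;), (1,&#39;b&#39;), (0,&#39;c&#39;), (1,&#39;a&#39;), (0,&#39;a&#39;)])
-- prints [(0,&#39;a&#39;),(0,&#39;c&#39;),(1,&#39;b&#39;),(1,&#39;a&#39;),(0,&#39;a&#39;)]</code></pre>
<p>Only the <code>(0,&#39;c&#39;)</code> step moves: it commutes with <code>(1,&#39;b&#39;)</code>, whereas the final <code>(0,&#39;a&#39;)</code> is stuck behind <code>(1,&#39;a&#39;)</code> because both touch the same variable.</p>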
<h2 id="testing-normalised-traces">Testing Normalised Traces</h2>
<p>Now we need a scheduler which can play a given list of scheduling decisions. This isn’t built in, but we can make one. Schedulers look like this:</p>
<pre class="haskell"><code>-- from Test.DejaFu.Schedule
newtype Scheduler state = Scheduler
  { scheduleThread
    :: Maybe (ThreadId, ThreadAction)
    -&gt; NonEmpty (ThreadId, Lookahead)
    -&gt; state
    -&gt; (Maybe ThreadId, state)
  }</code></pre>
<p>A scheduler is a stateful function, which takes the previously scheduled action and the list of runnable threads, and gives back a thread to execute. We don’t care about those parameters. We just want to play a fixed list of scheduling decisions. And here is how we do that:</p>
<pre class="haskell"><code>-- | Execute a concurrent program by playing a list of scheduling decisions.
play
  :: MemType
  -&gt; [ThreadId]
  -&gt; ConcIO a
  -&gt; IO (Either Failure a, [ThreadId], Trace)
play = runConcurrent (Scheduler sched)
 where
  sched _ _ (t:ts) = (Just t, ts)
  sched _ _ [] = (Nothing, [])</code></pre>
<p>Now all the background is in place, so we can test what we want to test: that an execution, and the play-back of its normalised trace, give the same result. For reasons which will become apparent in the next section, I’m going to parameterise over the normalisation function:</p>
<pre class="haskell"><code>-- | Execute a concurrent program with a random scheduler, normalise its trace,
-- execute the normalised trace, and return both results.
runNorm
  :: ([(ThreadId, ThreadAction)] -&gt; [(ThreadId, ThreadAction)])
  -&gt; Int
  -&gt; MemType
  -&gt; ConcIO a
  -&gt; IO (Either Failure a, [ThreadId], Either Failure a, [ThreadId])
runNorm norm seed memtype conc = do
  let g = mkStdGen seed                                       -- 1
  (efa1, _, trc) &lt;- runConcurrent randomSched memtype g conc
  let                                                         -- 2
    trc&#39; = tail
      ( scanl
        (\(t, _) (d, _, a) -&gt; (tidOf t d, a))
        (initialThread, undefined)
        trc
      )
  let tids1 = map fst trc&#39;
  let tids2 = map fst (norm trc&#39;)                             -- 3
  (efa2, s, _) &lt;- play memtype tids2 conc
  let truncated = take (length tids2 - length s) tids2        -- 4
  pure (efa1, tids1, efa2, truncated)</code></pre>
<p>There’s a lot going on here, so let’s break it down:</p>
<ol type="1">
<li><p>We execute the program with the built-in random scheduler, using the provided seed.</p></li>
<li><p>The trace that <code>runConcurrent</code> gives us is in the form <code>[(Decision, [(ThreadId, Lookahead)], ThreadAction)]</code>, whereas we want a <code>[(ThreadId, ThreadAction)]</code>. So this scan just changes the format. It’s a scan rather than a map because to convert a <code>Decision</code> into a <code>ThreadId</code> potentially requires knowing what the previous thread was.</p></li>
<li><p>We normalise the trace, and run it again.</p></li>
<li><p>If the entire normalised trace wasn’t used up, then it has some unnecessary suffix (because the main thread is now terminating sooner). So we make the normalised trace easier to read by chopping off any such suffix.</p></li>
</ol>
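<p>The format change in step 2 is easier to see in isolation. Here is a sketch using toy copies of the types involved: the <code>Decision</code> constructors and <code>tidOf</code> below are written out by hand to mirror my understanding of the Déjà Fu ones, and <code>resolve</code> is a hypothetical helper which uses explicit recursion rather than a scan, so no <code>undefined</code> seed is needed.</p>
<pre class="haskell"><code>-- Toy copies of the shapes involved; the real types live in dejafu.
type ThreadId = Int

data Decision
  = Start ThreadId     -- begin running a thread which was not running before
  | Continue           -- keep running the previous thread
  | SwitchTo ThreadId  -- pre-empt the previous thread in favour of another
  deriving Show

-- | Resolve a decision to a thread id, given the previously running thread.
tidOf :: ThreadId -&gt; Decision -&gt; ThreadId
tidOf _ (Start t) = t
tidOf prev Continue = prev
tidOf _ (SwitchTo t) = t

-- | Turn (decision, action) pairs into (thread id, action) pairs, threading
-- the previously running thread through the list.
resolve :: ThreadId -&gt; [(Decision, a)] -&gt; [(ThreadId, a)]
resolve = go
 where
  go _ [] = []
  go prev ((d, a) : rest) = let t = tidOf prev d in (t, a) : go t rest

main :: IO ()
main = print (resolve 0
  [ (Start 0, &quot;new&quot;), (Continue, &quot;fork&quot;), (SwitchTo 1, &quot;put&quot;), (Continue, &quot;stop&quot;) ])
-- prints [(0,&quot;new&quot;),(0,&quot;fork&quot;),(1,&quot;put&quot;),(1,&quot;stop&quot;)]</code></pre>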
<p>Finally, we can write a little function to test using the <code>normalise</code> function:</p>
<pre class="haskell"><code>-- | Execute a concurrent program with a random scheduler, normalise its trace,
-- execute the normalised trace, and check that both give the same result.
testNormalise
  :: (Eq a, Show a)
  =&gt; Int
  -&gt; MemType
  -&gt; ConcIO a
  -&gt; IO Bool
testNormalise seed memtype conc = do
  (efa1, tids1, efa2, tids2) &lt;- runNorm normalise seed memtype conc
  unless (efa1 == efa2) $ do
    putStrLn   &quot;Mismatched result!&quot;
    putStrLn $ &quot;      expected: &quot; ++ show efa1
    putStrLn $ &quot;       but got: &quot; ++ show efa2
    putStrLn   &quot;&quot;
    putStrLn $ &quot;rewritten from: &quot; ++ show tids1
    putStrLn $ &quot;            to: &quot; ++ show tids2
  pure (efa1 == efa2)</code></pre>
<p>And does it work? Let’s copy two example programs from the Test.DejaFu docs:</p>
<pre class="haskell"><code>-- from Test.DejaFu
example1
  :: MonadConc m
  =&gt; m String
example1 = do
  var &lt;- newEmptyMVar
  fork (putMVar var &quot;hello&quot;)
  fork (putMVar var &quot;world&quot;)
  readMVar var

example2
  :: MonadConc m
  =&gt; m (Bool, Bool)
example2 = do
  r1 &lt;- newCRef False
  r2 &lt;- newCRef False
  x &lt;- spawn $ writeCRef r1 True &gt;&gt; readCRef r2
  y &lt;- spawn $ writeCRef r2 True &gt;&gt; readCRef r1
  (,) &lt;$&gt; readMVar x &lt;*&gt; readMVar y</code></pre>
<p>And then test them:</p>
<pre><code>&gt; testNormalise 0 TotalStoreOrder example1
True
&gt; testNormalise 0 TotalStoreOrder example2
True</code></pre>
<p>According to my very unscientific method, everything works perfectly!</p>
<h2 id="enter-hedgehog">Enter Hedgehog</h2>
<p>You can probably see where this is going: just supplying <em>one</em> random seed and <em>one</em> memory model is a poor way to test things. Ah, if only we had some sort of tool to generate arbitrary values for us!</p>
<p>But that’s not all: if the dependency relation is correct, then <em>any</em> permutation of independent actions should give the same result, not just the one which <code>normalise</code> implements. So before we introduce <a href="https://hackage.haskell.org/package/hedgehog">Hedgehog</a> and arbitrary values, let’s make something a little more chaotic:</p>
<pre class="haskell"><code>-- | Shuffle independent actions in a trace according to the given list.
shuffle
  :: [Bool]
  -&gt; [(ThreadId, ThreadAction)]
  -&gt; [(ThreadId, ThreadAction)]
shuffle = go initialDepState
 where
  go ds (f:fs) (t1:t2:trc)
    | independent ds t1 t2 &amp;&amp; f = go&#39; ds fs t2 (t1 : trc)
    | otherwise = go&#39; ds fs t1 (t2 : trc)
  go _ _ trc = trc

  go&#39; ds fs t@(tid, ta) trc =
    t : go (updateDepState ds tid ta) fs trc</code></pre>
<p>In <code>normalise</code>, two independent actions will <em>always</em> be re-ordered if it gets us closer to the canonical form. However, in <code>shuffle</code>, two independent actions will either be re-ordered or not, depending on the supplied list of <code>Bool</code>.</p>
<p>This is much better for testing our dependency relation, as we can now get far more re-orderings which <em>all</em> should satisfy the same property: that no matter how the independent actions in a trace are shuffled, we get the same result.</p>
<p>I think it’s about time to bring out Hedgehog:</p>
<pre class="haskell"><code>-- | Execute a concurrent program with a random scheduler, arbitrarily permute
-- the independent actions in the trace, and check that we get the same result
-- out.
hog :: (Eq a, Show a) =&gt; ConcIO a -&gt; IO Bool
hog conc = Hedgehog.check . property $ do
  mem &lt;- forAll Gen.enumBounded                               -- 1
  seed &lt;- forAll $ Gen.int (Range.linear 0 100)
  fs &lt;- forAll $ Gen.list (Range.linear 0 100) Gen.bool

  (efa1, tids1, efa2, tids2) &lt;- liftIO                        -- 2
    $ runNorm (shuffle fs) seed mem conc
  footnote (&quot;            to: &quot; ++ show tids2)                 -- 3
  footnote (&quot;rewritten from: &quot; ++ show tids1)
  efa1 === efa2</code></pre>
<p>Let’s break that down:</p>
<ol type="1">
<li><p>We’re telling Hedgehog that this property should hold for all memory models, all seeds, and all <code>Bool</code>-lists. Unlike most Haskell property-testing libraries, Hedgehog takes generator functions rather than using a typeclass. I think this is nicer.</p></li>
<li><p>We run our program, shuffle its trace with <code>shuffle fs</code> in place of <code>normalise</code>, and get all the results just as before.</p></li>
<li><p>We add some footnotes: messages which Hedgehog will display along with a failure. For some reason these get displayed in reverse order.</p></li>
</ol>
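<p>Separately from <code>hog</code>, the same kind of property can be tried on a toy model with no Déjà Fu dependency at all. This is only a sketch under made-up assumptions: a step is a write of a value to a named variable, <code>toyIndependent</code> says two steps commute if they come from different threads and write different variables, and <code>run</code> is a stand-in interpreter where the last write to each variable wins.</p>
<pre class="haskell"><code>import Hedgehog
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range
import qualified Data.Map as Map

-- Toy step: (thread id, variable written, value written).
type Step = (Int, Char, Int)

-- | Hypothetical independence rule: different threads, different variables.
toyIndependent :: Step -&gt; Step -&gt; Bool
toyIndependent (t1, v1, _) (t2, v2, _) = t1 /= t2 &amp;&amp; v1 /= v2

-- | Swap adjacent independent steps wherever the flag list says to.
toyShuffle :: [Bool] -&gt; [Step] -&gt; [Step]
toyShuffle (f:fs) (s1:s2:trc)
  | toyIndependent s1 s2 &amp;&amp; f = s2 : toyShuffle fs (s1 : trc)
  | otherwise = s1 : toyShuffle fs (s2 : trc)
toyShuffle _ trc = trc

-- | Interpret a trace: the last write to each variable wins.
run :: [Step] -&gt; Map.Map Char Int
run = foldl (\m (_, v, x) -&gt; Map.insert v x m) Map.empty

genStep :: Gen Step
genStep = (,,)
  &lt;$&gt; Gen.int (Range.linear 0 2)
  &lt;*&gt; Gen.element &quot;abc&quot;
  &lt;*&gt; Gen.int (Range.linear 0 9)

-- | However the independent steps are shuffled, the final state is the same.
prop_shuffle :: Property
prop_shuffle = property $ do
  trc &lt;- forAll $ Gen.list (Range.linear 0 20) genStep
  fs &lt;- forAll $ Gen.list (Range.linear 0 20) Gen.bool
  run (toyShuffle fs trc) === run trc

main :: IO Bool
main = check prop_shuffle</code></pre>
<p>Weakening <code>toyIndependent</code>, say by dropping the check that the variables differ, should give Hedgehog a counterexample to shrink, which is a handy way to get a feel for the failure output.</p>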
<p>Alright, let’s see if Hedgehog finds any bugs for us:</p>
<pre><code>&gt; hog example1
  ✗ &lt;interactive&gt; failed after 3 tests and 1 shrink.

       ┏━━ extra.hs ━━━
    82 ┃ hog :: (Eq a, Show a) =&gt; ConcIO a -&gt; IO Bool
    83 ┃ hog conc = Hedgehog.check . property $ do
    84 ┃   mem &lt;- forAll Gen.enumBounded
       ┃   │ SequentialConsistency
    85 ┃   seed &lt;- forAll $ Gen.int (Range.linear 0 100)
       ┃   │ 0
    86 ┃   fs &lt;- forAll $ Gen.list (Range.linear 0 100) Gen.bool
       ┃   │ [ False , True ]
    87 ┃
    88 ┃   (efa1, tids1, efa2, tids2) &lt;- liftIO
    89 ┃     $ runNorm (shuffle fs) seed mem conc
    90 ┃   footnote (&quot;            to: &quot; ++ show tids2)
    91 ┃   footnote (&quot;rewritten from: &quot; ++ show tids1)
    92 ┃   efa1 === efa2
       ┃   ^^^^^^^^^^^^^
       ┃   │ Failed (- lhs =/= + rhs)
       ┃   │ - Right &quot;hello&quot;
       ┃   │ + Left InternalError

    rewritten from: [main,main,1,main,1,2,main,2,main]
                to: [main,1]

    This failure can be reproduced by running:
    &gt; recheck (Size 2) (Seed 1824012233418733250 (-4876494268681827407)) &lt;property&gt;

False</code></pre>
<p>It did! And look at that output! Magical! I must see if I can get Déjà Fu to give annotated source output like that.</p>
<p>Let’s look at <code>example1</code> again:</p>
<pre class="haskell"><code>do
  var &lt;- newEmptyMVar
  fork (putMVar var &quot;hello&quot;)
  fork (putMVar var &quot;world&quot;)
  readMVar var</code></pre>
<p>Oh dear, our rewritten trace is trying to execute thread <code>1</code> immediately after the first action of the main thread. The first action of the main thread is <code>newEmptyMVar</code>: thread <code>1</code> doesn’t exist at that point!</p>
<p>Let’s change our <code>independent</code> function to say that an action is dependent with the fork which creates its thread:</p>
<pre class="haskell"><code>independent ds (tid1, ta1) (tid2, ta2)
  | ta1 == Fork tid2 = False
  | ta2 == Fork tid1 = False
  | otherwise = not (dependent ds tid1 ta1 tid2 ta2)</code></pre>
<p>How about now?</p>
<pre><code>&gt; hog example1
  ✗ &lt;interactive&gt; failed after 13 tests and 2 shrinks.

       ┏━━ extra.hs ━━━
    82 ┃ hog :: (Eq a, Show a) =&gt; ConcIO a -&gt; IO Bool
    83 ┃ hog conc = Hedgehog.check . property $ do
    84 ┃   mem &lt;- forAll Gen.enumBounded
       ┃   │ SequentialConsistency
    85 ┃   seed &lt;- forAll $ Gen.int (Range.linear 0 100)
       ┃   │ 0
    86 ┃   fs &lt;- forAll $ Gen.list (Range.linear 0 100) Gen.bool
       ┃   │ [ True , True ]
    87 ┃
    88 ┃   (efa1, tids1, efa2, tids2) &lt;- liftIO
    89 ┃     $ runNorm (shuffle fs) seed mem conc
    90 ┃   footnote (&quot;            to: &quot; ++ show tids2)
    91 ┃   footnote (&quot;rewritten from: &quot; ++ show tids1)
    92 ┃   efa1 === efa2
       ┃   ^^^^^^^^^^^^^
       ┃   │ Failed (- lhs =/= + rhs)
       ┃   │ - Right &quot;hello&quot;
       ┃   │ + Left InternalError

    rewritten from: [main,main,1,main,1,2,main,2,main]
                to: [main,1]

    This failure can be reproduced by running:
    &gt; recheck (Size 12) (Seed 654387260079025817 (-6686572164463137223)) &lt;property&gt;

False</code></pre>
<p>Well, that failing trace looks exactly like the previous error. But the parameters are different: the first error happened with the list <code>[False, True]</code>, whereas this one requires the list <code>[True, True]</code>. So let’s think about what happens to the trace in this case.</p>
<ol type="1">
<li><p>We start with: <code>[(main, NewEmptyMVar 0), (main, Fork 1), (1, PutMVar 0)]</code>.</p></li>
<li><p>The first two actions are independent, and the flag is <code>True</code>, so we swap them. We now have: <code>[(main, Fork 1), (main, NewEmptyMVar 0), (1, PutMVar 0)]</code>.</p></li>
<li><p>The second and third actions are independent, and the flag is <code>True</code>, so we swap them. We now have: <code>[(main, Fork 1), (1, PutMVar 0), (main, NewEmptyMVar 0)]</code>.</p></li>
</ol>
<p>We can’t actually re-order actions of the same thread, so we should never have swapped the first two. I suppose there’s another problem here, that no action on an <code>MVar</code> commutes with creating that <code>MVar</code>, but we should never be in a situation where that could happen. So we need another case in <code>independent</code>:</p>
<pre class="haskell"><code>independent ds (tid1, ta1) (tid2, ta2)
  | tid1 == tid2 = False
  | ta1 == Fork tid2 = False
  | ta2 == Fork tid1 = False
  | otherwise = not (dependent ds tid1 ta1 tid2 ta2)</code></pre>
<p>Our first example program works fine now:</p>
<pre><code>&gt; hog example1
  ✓ &lt;interactive&gt; passed 100 tests.
True</code></pre>
<p>The second is a little less happy:</p>
<pre><code>&gt; hog example2
  ✗ &lt;interactive&gt; failed after 48 tests and 9 shrinks.

       ┏━━ extra.hs ━━━
    82 ┃ hog :: (Eq a, Show a) =&gt; ConcIO a -&gt; IO Bool
    83 ┃ hog conc = Hedgehog.check . property $ do
    84 ┃   mem &lt;- forAll Gen.enumBounded
       ┃   │ TotalStoreOrder
    85 ┃   seed &lt;- forAll $ Gen.int (Range.linear 0 100)
       ┃   │ 0
    86 ┃   fs &lt;- forAll $ Gen.list (Range.linear 0 100) Gen.bool
       ┃   │ [ False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , True
       ┃   │ ]
    87 ┃
    88 ┃   (efa1, tids1, efa2, tids2) &lt;- liftIO
    89 ┃     $ runNorm (shuffle fs) seed mem conc
    90 ┃   footnote (&quot;            to: &quot; ++ show tids2)
    91 ┃   footnote (&quot;rewritten from: &quot; ++ show tids1)
    92 ┃   efa1 === efa2
       ┃   ^^^^^^^^^^^^^
       ┃   │ Failed (- lhs =/= + rhs)
       ┃   │ - Right ( False , True )
       ┃   │ + Left InternalError

    rewritten from: [main,main,main,main,main,1,-1,1,1,main,1,main,main,main,main,2,-1,2,2,main,main]
                to: [main,main,main,main,main,1,-1,1,1,main,1,main,main,main,2,main,-1]

    This failure can be reproduced by running:
    &gt; recheck (Size 47) (Seed 2159662051602767058 (-7857629802164753123)) &lt;property&gt;

False</code></pre>
<p>This is a little trickier. Here’s my diagnosis:</p>
<ol type="1">
<li><p>It’s an <code>InternalError</code> again, which means we’re trying to execute a thread which isn’t runnable.</p></li>
<li><p>The memory model is <code>TotalStoreOrder</code>, and the thread we’re trying to execute is thread <code>-1</code>, a “fake” thread used in the relaxed memory implementation. So this is a relaxed memory bug.</p></li>
<li><p>The traces only differ in one place: where <code>main, 2, -1</code> is changed to <code>2, main, -1</code>. So the issue is caused by re-ordering <code>main</code> and thread <code>2</code>.</p></li>
<li><p>If the <code>main</code> action is a memory barrier, then thread <code>-1</code> will not exist after it.</p></li>
<li><p>So the <code>main</code> action is probably a memory barrier.</p></li>
</ol>
<p>Let’s push along those lines and add a case for memory barriers to <code>independent</code>:</p>
<pre class="haskell"><code>independent ds (tid1, ta1) (tid2, ta2)
  | tid1 == tid2 = False
  | ta1 == Fork tid2 = False
  | ta2 == Fork tid1 = False
  | otherwise = case (simplifyAction ta1, simplifyAction ta2) of
      (UnsynchronisedWrite _, a) | isBarrier a -&gt; False
      (a, UnsynchronisedWrite _) | isBarrier a -&gt; False
      _ -&gt; not (dependent ds tid1 ta1 tid2 ta2)</code></pre>
<p>Did we get it?</p>
<pre><code>&gt; hog example2
  ✓ &lt;interactive&gt; passed 100 tests.
True</code></pre>
<p>Great!</p>
<h2 id="bugs">Bugs?</h2>
<p>So, we explored the dependency relation with Hedgehog, and found three missing cases:</p>
<ol type="1">
<li><p>Two actions of the same thread are dependent.</p></li>
<li><p>Any action of a thread is dependent with the <code>fork</code> which creates that thread.</p></li>
<li><p>Unsynchronised writes are dependent with memory barriers.</p></li>
</ol>
<p>But are these <em>bugs</em>? I’m not so sure:</p>
<ol type="1">
<li><p>The dependency relation is only ever used to compare different threads.</p></li>
<li><p>This is technically correct, but it’s not interesting or useful.</p></li>
<li><p>This could be a bug. The relaxed memory implementation is pretty hairy and I’ve had a lot of problems with it in the past. Honestly, I just need to rewrite it (or campaign for Haskell to become sequentially consistent<a href="hedgehog-dejafu.html#fn2" class="footnote-ref" id="fnref2" role="doc-noteref"><sup>2</sup></a> and rip it out).</p></li>
</ol>
<p>But even if not bugs, these are definitely <em>confusing</em>. The dependency relation is currently just an internal thing, not exposed to users. However, I’m planning to expose a function to normalise traces, in which case providing an <code>independent</code> function is entirely reasonable.</p>
<p>So even if these changes don’t make it into <code>dependent</code>, they will be handled by <code>independent</code>.</p>
<p><strong>Next steps:</strong> I’m going to get this into the test suite, to get a large number of extra example programs for free. My hacky and cobbled-together testing framework in dejafu-tests is capable of running every test case with a variety of different schedulers, so I just need to add another way it runs everything. I won’t need to touch the actual tests, just the layer of glue which runs them all, which is nice.</p>
<p>The only problem is that this glue is currently based on <a href="https://hackage.haskell.org/package/HUnit">HUnit</a> and <a href="https://hackage.haskell.org/package/test-framework">test-framework</a>, whereas the only integration I can find for Hedgehog is <a href="https://hackage.haskell.org/package/tasty-hedgehog">tasty-hedgehog</a>, so I might need to switch to <a href="https://hackage.haskell.org/package/tasty">tasty</a> first. As usual, the hardest part is getting different libraries to co-operate!</p>
<p>Hopefully I’ll find some bugs! Well, not exactly <em>hopefully</em>, but you know what I mean.</p>
<section id="footnotes" class="footnotes footnotes-end-of-document" role="doc-endnotes">
<hr />
<ol>
<li id="fn1"><p>For all the gory details, see:</p>
<ul>
<li><p><strong>Dynamic partial order reduction for relaxed memory models</strong>, N. Zhang, M. Kusano, and C. Wang (2015)</p></li>
<li><p><strong>Bounded partial-order reduction</strong>, K. Coons, M. Musuvathi, and K. McKinley (2013)</p></li>
<li><p><strong>Refining dependencies improves partial-order verification methods</strong> (extended abstract), P. Godefroid and D. Pirottin (1993)</p></li>
</ul>
<a href="hedgehog-dejafu.html#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></li>
<li id="fn2"><p><strong>SC-Haskell: Sequential Consistency in Languages That Minimize Mutable Shared Heap</strong>, M. Vollmer, R. G. Scott, M. Musuvathi, and R. R. Newton (2017)<a href="hedgehog-dejafu.html#fnref2" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>

      ]]>
    </summary>
  </entry>
  
  <entry>
    <title>Do Developers Update Their Library Dependencies?</title>
    <link href="https://memo.barrucadu.co.uk/do-developers-update-their-library-dependencies.html" />
    <id>https://memo.barrucadu.co.uk/do-developers-update-their-library-dependencies.html</id>
    <published>2017-12-12T00:00:00Z</published>
    <updated>2017-12-12T00:00:00Z</updated>
    <summary type="html">
      <![CDATA[
<p>By <a href="http://sel.ist.osaka-u.ac.jp/people/raula-k/">Raula Gaikovina Kula</a>, <a href="http://turingmachine.org/">Daniel M. German</a>, <a href="http://ouniali.github.io/index.html">Ali Ouni</a>, <a href="http://sel.ist.osaka-u.ac.jp/people/ishio/index.html.en">Takashi Ishio</a>, and <a href="http://sel.ist.osaka-u.ac.jp/people/inoue/">Katsuro Inoue</a>.<br> In <em>Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering</em> (ESEC/FSE). 2017.<br> <a href="https://link.springer.com/article/10.1007/s10664-017-9521-5">Paper</a> / <a href="http://esec-fse17.uni-paderborn.de/">Conference</a></p>
<p>The <a href="how-to-break-an-api.html">third</a> and final <a href="why-do-developers-use-trivial-packages.html">paper</a> in this mini-series on dependencies taken from recent ESEC/FSE events. This study looks solely at Java projects using Maven, and investigates how the <a href="http://cve.mitre.org/">Common Vulnerabilities and Exposures</a> (CVE) project affects library migration. The authors looked at 4,659 projects, performed 8 case studies, and surveyed developers of projects with outdated (and vulnerable) dependencies.</p>
<p>The starting assumption is that developers don’t want dependencies with known security vulnerabilities:</p>
<blockquote>
<p>We conjecture that for developers, the awareness of the security advisory is more important than the migration effort needed to migrate the vulnerable dependency.</p>
</blockquote>
<p>But this reasonable-sounding assumption doesn’t seem to be backed up by the evidence:</p>
<blockquote>
<p>In 2014, <a href="https://nvd.nist.gov/vuln/detail/CVE-2014-0160">Heartbleed</a>, <a href="https://nvd.nist.gov/vuln/detail/CVE-2014-3566">Poodle</a>, <a href="https://nvd.nist.gov/vuln/detail/CVE-2014-6271">Shellshock</a>, — all high profile library vulnerabilities were found to have affected a significant portion of the software industry. In that same year, Sonatype determined that over 6% of the download requests from the Maven Central repository were for component versions that included known vulnerabilities. The company reported that in review of over 1,500 applications, each of them had an average of 24 severe or critical flaws inherited from their components.</p>
</blockquote>
<p>So given that many developers don’t appear to consider these issues severe enough to merit upgrading their dependencies (or are unaware of the issues entirely), three research questions are formulated:</p>
<ul>
<li>To what extent are developers updating their library dependencies?</li>
<li>What is the response to important awareness mechanisms such as a new release announcement and a security advisory on library updates?</li>
<li>Why are developers non responsive to a security advisory?</li>
</ul>
<p><strong>Library usage</strong> To track library migrations, the authors first define what exactly a migration is. They use <em>L(name, version)</em> to refer to a library and <em>S(name, version)</em> to refer to a system. When a system <em>S(a,b)</em> migrates to a library <em>L(x,y)</em>, it creates a dependency between them.</p>
<figure>
<img src="do-developers-update-their-library-dependencies/library_migration.png" alt="Library migration between systems and libraries" />
<figcaption aria-hidden="true">Library migration between systems and libraries</figcaption>
</figure>
<p>This model can track how frequently developers migrate their dependencies, and how many migrations occurred in a single system version update. The authors also define “library usage” as the number of systems depending on a library at a given point in time.</p>
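<p>As a rough sketch of how these two measurements could be computed (this is my own illustration, not the authors’ tooling), suppose a snapshot is a list of systems paired with the libraries they depend on. The system names below are hypothetical, and counting a migration as “the same library name with a different version in the next release” is an assumption of the sketch.</p>
<pre class="haskell"><code>import qualified Data.Map as Map

type Library = (String, String)  -- L(name, version)
type System = (String, String)   -- S(name, version)

-- | Library usage: how many systems depend on each library version in a
-- single snapshot.
libraryUsage :: [(System, [Library])] -&gt; Map.Map Library Int
libraryUsage snapshot =
  Map.fromListWith (+) [ (lib, 1) | (_, libs) &lt;- snapshot, lib &lt;- libs ]

-- | Dependency updates between two releases of one system: libraries which
-- appear in both releases with different versions.
migrations :: [Library] -&gt; [Library] -&gt; [(Library, Library)]
migrations old new =
  [ ((name, v1), (name, v2))
  | (name, v1) &lt;- old
  , Just v2 &lt;- [lookup name new]
  , v1 /= v2
  ]

main :: IO ()
main = do
  let snapshot =
        [ ((&quot;app&quot;, &quot;1.0&quot;), [(&quot;beanutils&quot;, &quot;1.9.1&quot;), (&quot;junit&quot;, &quot;4.11&quot;)])
        , ((&quot;svc&quot;, &quot;2.3&quot;), [(&quot;beanutils&quot;, &quot;1.9.1&quot;)])
        ]
  print (Map.toList (libraryUsage snapshot))
  -- prints [((&quot;beanutils&quot;,&quot;1.9.1&quot;),2),((&quot;junit&quot;,&quot;4.11&quot;),1)]
  print (migrations [(&quot;beanutils&quot;, &quot;1.9.1&quot;), (&quot;junit&quot;, &quot;4.11&quot;)]
                    [(&quot;beanutils&quot;, &quot;1.9.2&quot;), (&quot;junit&quot;, &quot;4.11&quot;)])
  -- prints [((&quot;beanutils&quot;,&quot;1.9.1&quot;),(&quot;beanutils&quot;,&quot;1.9.2&quot;))]</code></pre>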
<p>The authors selected 4,659 Java projects (filtered down from 10,523) from GitHub which (a) have more than 100 commits; (b) have a commit between January and November of 2015; (c) are not duplicates (determined by project name); and (d) use Maven. As a project may contain multiple systems, the authors then extracted 48,495 systems from the 4,659 projects, and found 852,322 total migrations.</p>
<figure>
<img src="do-developers-update-their-library-dependencies/beanutils_migration.png" alt="Library migration for L(beanutils,1.9.1) and L(beanutils,1.9.2)" />
<figcaption aria-hidden="true">Library migration for L(beanutils,1.9.1) and L(beanutils,1.9.2)</figcaption>
</figure>
<p>For example, here we see a library migration plot for the Apache Commons <a href="http://commons.apache.org/proper/commons-beanutils/">beanutils</a> library, for which <a href="https://nvd.nist.gov/vuln/detail/CVE-2014-0114">CVE-2014-0114</a> was published in April 2014 (the dashed black line). Don’t read too much into the curves:</p>
<blockquote>
<p>The LMP shows LU changes in the library (y-axis) with respect to time (x-axis). It is important to note that the LMP curve itself should not be taken at face value, as the smoothing algorithm is generated by a predictive model and it is not a true reflection of all data points.</p>
</blockquote>
<p>The authors use these library migration plots to judge how developers respond to announcements of new releases and CVEs. The authors picked eight libraries in particular to examine, based on library usage trends. For each, they looked at the online documentation and version numbering to judge the effort required to perform the migration. The releases selected are:</p>
<ul>
<li><a href="https://github.com/google/guava">google-guava</a> (16.0.1, 17.0, and 18.0)</li>
<li><a href="http://junit.org">junit</a> (3.8.1, 4.10, 4.11)</li>
<li><a href="https://logging.apache.org/log4j/">log4j</a> (1.2.15, 1.2.16, 1.2.17)</li>
<li><a href="http://commons.apache.org/proper/commons-beanutils/">commons-beanutils</a> (1.9.1, 1.9.2)</li>
<li><a href="http://commons.apache.org/proper/commons-fileupload/">commons-fileupload</a> (1.2.2, 1.3, 1.3.1)</li>
<li><a href="http://hc.apache.org/httpclient-3.x/">commons-httpclient</a> (3.1, 4.2.2)</li>
<li><a href="https://hc.apache.org/">httpcomponents</a> (4.2.2, 4.2.3, 4.2.5)</li>
<li><a href="https://commons.apache.org/proper/commons-compress/">commons-compress</a> (1.4, 1.4.1)</li>
</ul>
<p>Finally, the authors sent a short email survey to developers of projects which were non-responsive to a CVE, asking if they were aware of the vulnerability, and why they hadn’t updated. They received a total of 16 responses.</p>
<p><strong>Library migration in practice</strong> Firstly the authors look at this from a systems perspective.</p>
<figure>
<img src="do-developers-update-their-library-dependencies/systems_libraries_upgrades.png" alt="The dependencies and updates of systems" />
<figcaption aria-hidden="true">The dependencies and updates of systems</figcaption>
</figure>
<p>We discover that systems tend to have a lot of library dependencies, but do not perform many dependency updates (“DUs”) when they make a new release. We also see that there is just about no correlation between the number of library dependencies and the number of dependency updates.</p>
<blockquote>
<p>This result confirms the hypothesis that the number of library dependencies in a system does not influence the frequency of updates.</p>
</blockquote>
<p>The authors then look at this from a library perspective, and find that library versions tend to slowly reach peak usage, which then steadily declines as developers migrate away. However, many systems remain with outdated dependencies, such as <em>L(log4j, 1.2.15)</em>, with 98% of systems having not migrated away from it at the time of the study.</p>
<blockquote>
<p>To answer (RQ1): (i) although system heavily depend on libraries, most systems rarely update their libraries and (ii) systems are less likely migrate their library dependencies, with 81.5% of systems remaining with a popular older version.</p>
</blockquote>
<p><strong>Developer responsiveness to awareness mechanisms</strong> The authors examine library usage trends to judge how developers respond to awareness mechanisms such as new releases and CVEs.</p>
<figure>
<img src="do-developers-update-their-library-dependencies/google-guava_lu_trends.png" alt="Library usage trends for consecutive releases of google-guava" />
<figcaption aria-hidden="true">Library usage trends for consecutive releases of google-guava</figcaption>
</figure>
<p>The authors speculate that, in this case, migrating between google-guava versions is fairly easy, which influences the quick change:</p>
<blockquote>
<p>We find that the reasons for consistent migration trends are mainly related to the estimated migration effort required to complete the migration process. Through inspection of the online documentation, we find that migration from L(NR1, 16.0.1) to L(NR1, 17.0) contains 10 changed packages. Similarly, migration from L(NR1, 17.0) to L(NR1, 18.0) also contained 7 changed packages. Yet, all three library versions require the same Java 5 environment which indicates no significant changes to the overall architectural design of the library. From the documentation, we deduce that popular use of L(NR1, 18.0) is due to the prolonged period between the next release of L(NR1, 19.0), which is more that a year after the release of L(NR1, 18.0) in December 10, 2015</p>
</blockquote>
<figure>
<img src="do-developers-update-their-library-dependencies/junit_lu_trends.png" alt="Library usage trends for consecutive releases of junit" />
<figcaption aria-hidden="true">Library usage trends for consecutive releases of junit</figcaption>
</figure>
<p>For junit, migration is more challenging, which may contribute to the prolonged usage of older versions.</p>
<blockquote>
<p>Similar to the consistent migration to a new release, we find that the reason for a non response to a migration opportunity is related to the estimated migration effort. For instance, as shown in Figure 8(b), the newer Junit version 4 series libraries requires a change of platform to Java 5 or higher (L(NR2, 4.10) and L(NR2, 4.11)), inferring significant changes to the architectural design of the library. Intuitively, we see that even though L(NR2, 3.8.1) is older, it still maintains its maximum library usage (i.e., current LU and peak LU=342).</p>
</blockquote>
<p>As new releases are made, library usage begins to gradually trend down. This seems reasonable: having newer dependencies is nice, but not essential.</p>
<figure>
<img src="do-developers-update-their-library-dependencies/commons-beanutils_lu_trends.png" alt="Library usage trends for consecutive releases of commons-beanutils" />
<figcaption aria-hidden="true">Library usage trends for consecutive releases of commons-beanutils</figcaption>
</figure>
<p>For vulnerabilities, however, we would like to see a much sharper decline. The commons-beanutils library shows this pattern. A vulnerability is announced, a release is made shortly afterwards, and the new version rapidly gains in popularity. Developers even appear to begin migrating away from the vulnerable version before the new one is released, in this case.</p>
<figure>
<img src="do-developers-update-their-library-dependencies/commons-httpclient_lu_trends.png" alt="Library usage trends for commons-httpclient and httpcomponents" />
<figcaption aria-hidden="true">Library usage trends for commons-httpclient and httpcomponents</figcaption>
</figure>
<p>Unfortunately, this is not always the case. Here we see that a vulnerability was announced in the commons-httpclient library, but its usage kept growing. The authors speculate that this is because the migration effort was too high: there was no new release of commons-httpclient, rather the library was deprecated in favour of the new httpcomponents library:</p>
<blockquote>
<p>The estimated migration effort and the lack of a viable replacement dependency are some of the possible reasons why affected maintainers show no response to the security advisory. This is shown in the case of the Httpcomponents library, which is the successor and replacement for commons-httpclient library. As documented, Httpcomponents is a major upgrade with many architectural design modifications compared to the older commons-httpclient dependency versions.</p>
</blockquote>
<p><strong>Developer feedback on vulnerable dependencies</strong> Finally, we get the results of the survey. Of the 16 responses, 11 (69%) were unaware that there was a vulnerability at all! However, in some cases, vulnerable dependencies are not exposed in a way which introduces a security hole at all. One developer noted:</p>
<blockquote>
<p>It’s only a test scoped dependency which means that it’s not a transitive dependency for users of XXX so there is no harm done. XXX has no external compile scoped dependencies thus there is no real need to update dependencies.</p>
</blockquote>
<p>Some developers seem to view upgrading dependencies as a luxury which they can’t afford:</p>
<blockquote>
<p>I subscribed to the CVE RSS recently and I don’t check it regularly, so even if I might have heard of the current vulnerability, I simply forgot to address it. We also had some emergencies recently (developing features for our customers), that makes the security issues less prio than releasing the ordered features :-/ … Anyway, our security approach is far from perfect, I am aware of it, and I’m willing to improve this, but sometimes it is difficult to explain our customers that it is a main point to consider in the development process.</p>
</blockquote>
<p>Unfortunately, customers and users are often unsympathetic to things without an immediate impact. If a security hole isn’t causing problems now, even if it might in the future, then they want new features rather than better security.</p>
<p><strong>Dependencies are hard!</strong> Library usage is not only common, but encouraged as good practice. Yet 81% of the systems the authors surveyed use outdated dependencies. Even when there is a published security issue, developers often do not migrate. Updating dependencies is considered something nice to do in your spare time, but not really a focus. This is not a great situation.</p>
<blockquote>
<p>The study provides motivation for our community develop strategies to improve a developer personal perception of third-party updates, especially in cases when effort must be allocated to mitigate a severe vulnerability risk. Visual aids such as the Library Migration Plots (LMP) provide a rich visual analysis, which proves to be a useful awareness and motivation for developers to identify dependency migration opportunities. We envision this work as a contribution toward developing strategies and support tools that aid the management of third-party dependencies.</p>
</blockquote>

      ]]>
    </summary>
  </entry>
  
  <entry>
    <title>Why Do Developers Use Trivial Packages?</title>
    <link href="https://memo.barrucadu.co.uk/why-do-developers-use-trivial-packages.html" />
    <id>https://memo.barrucadu.co.uk/why-do-developers-use-trivial-packages.html</id>
    <published>2017-12-04T00:00:00Z</published>
    <updated>2017-12-04T00:00:00Z</updated>
    <summary type="html">
      <![CDATA[
<p>By <a href="http://das.encs.concordia.ca/members/rabe-abdalkareem/">Rabe Abdalkareem</a>, Olivier Nourry, <a href="http://das.encs.concordia.ca/members/sultan-wehaibi/">Sultan Wehaibi</a>, <a href="http://das.encs.concordia.ca/members/suhaib-mujahid/">Suhaib Mujahid</a>, and <a href="http://das.encs.concordia.ca/members/emad-shihab/">Emad Shihab</a>.<br> In <em>Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering</em> (ESEC/FSE). 2017.<br> <a href="https://dl.acm.org/citation.cfm?id=3106267">Paper</a> / <a href="http://esec-fse17.uni-paderborn.de/">Conference</a></p>
<p>We saw <a href="how-to-break-an-api.html">last time</a> that developers are often wary of introducing new dependencies unless they’re really worth it, due to the inevitable cost of maintenance. Why then do developers also depend on so-called “trivial packages”? The <a href="http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm">left-pad</a> <a href="https://www.theregister.co.uk/2016/03/23/npm_left_pad_chaos/">fiasco</a> of <a href="https://medium.com/quid-pro-quo/what-should-we-learn-from-the-left-pad-gate-5a553307a742">last</a> <a href="http://www.haneycodes.net/npm-left-pad-have-we-forgotten-how-to-program/">year</a> brought to light how extreme this situation really is: a package providing 11 lines of code to left pad a string was pulled from npm, breaking thousands of other packages which, directly or indirectly, depended on it.</p>
<p>This is the question which this survey paper sets out to answer. Firstly we get some quantitative analysis of trivial package use across 230,000 npm packages and 38,000 applications, then a survey of 88 Node.js developers about trivial packages.</p>
<p><strong>What do we mean by a “trivial package”?</strong> The authors randomly selected 16 npm packages with between 4 and 250 lines of code and sent out a survey, which got 12 responses, asking whether each package was trivial or not, and why. Here’s an example, the <a href="https://www.npmjs.com/package/is-positive">is-positive</a> package:</p>
<pre class="javascript"><code>module.exports = function (n) {
  return toString.call(n) === &#39;[object Number]&#39; &amp;&amp; n &gt; 0;
};</code></pre>
<p>Based on the survey responses, the authors identified both length and cyclomatic complexity of a package to be contributing factors to its triviality:</p>
<blockquote>
<p>Our survey indicates that size and complexity are commonly used measures to determine if a package is trivial. Based on our analysis, packages that have ≤ 35 JavaScript LOC and a McCabe’s cyclomatic complexity ≤ 10 are considered to be trivial.</p>
</blockquote>
<p>You can quibble over this definition (I might consider a longer but low-complexity package to be trivial, for instance), but triviality is ultimately a judgement call. No matter what metric the authors pick, there will be some who disagree.</p>
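<p>As a rough sketch of that definition (not the authors’ tooling), the threshold check can be written as a predicate. The <code>isTrivial</code> name and the shape of the <code>metrics</code> object are assumptions made for illustration; only the two thresholds come from the paper:</p>
<pre class="javascript"><code>// Thresholds taken from the definition quoted above.
const MAX_LOC = 35;
const MAX_COMPLEXITY = 10;

// `metrics` is a hypothetical object holding a package&#39;s JavaScript lines
// of code and its McCabe cyclomatic complexity, both measured elsewhere.
function isTrivial(metrics) {
  return metrics.loc &lt;= MAX_LOC &amp;&amp; metrics.complexity &lt;= MAX_COMPLEXITY;
}

isTrivial({ loc: 3, complexity: 2 });   // true: short and simple
isTrivial({ loc: 120, complexity: 4 }); // false: simple, but too long</code></pre>
<p>The second call is exactly the longer but low-complexity case: under the paper’s definition it counts as nontrivial, whatever a human reader might think of it.</p>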
<p><strong>How prevalent are they?</strong> The authors fetched the latest version of every npm package as of the 5th of May 2016, giving 231,092 packages, after removing 21,904 with no code. They also fetched all Node.js/npm applications on GitHub, giving 38,807 applications, after filtering out 76,814 with fewer than 100 commits or only one developer.</p>
<figure>
<img src="why-do-developers-use-trivial-packages/percentage_of_trivial_packages.png" alt="Percentage of Published Trivial Packages on npm" />
<figcaption aria-hidden="true">Percentage of Published Trivial Packages on npm</figcaption>
</figure>
<p>Of the npm packages, an incredible 38,845 (16.8%) are trivial packages. Furthermore, if we look at the proportion of published trivial packages over time, we see that it’s going up! This graph is jagged, up until npm banned unpublishing packages in response to the left-pad incident. I suspect this means that a lot of people used to publish, and then almost immediately remove, trivial packages. Currently, roughly 15% of the packages added each month are trivial packages.</p>
<p>Rather than looking at the entire database of packages, we can also look at the most popular:</p>
<blockquote>
<p>npm posts the most depended-upon packages on its website. We measured the number of trivial packages that exist in the top 1,000 most depended-upon packages; we find that 113 of them are trivial packages. This finding shows that trivial packages are not only prevalent and increasing in number, but they are also very popular among developers, making up 11.3% of the 1,000 most depended on npm packages.</p>
</blockquote>
<p>When it comes to applications, the authors parsed the source code, looking for import statements, to handle cases where a project’s package.json file (containing metadata for npm to build and run it) specifies a dependency which isn’t used anywhere. This gives, for each application, a set of dependencies which are used:</p>
<blockquote>
<p>Finally, we measured the number of packages that are trivial in the set of packages used by the applications. Note that we only consider npm packages since it is the most popular package manager for Node.js packages and other package managers only manage a subset of packages. We find that of the 38,807 applications in our data set, 4,256 (10.9%) directly depend on at least one trivial package.</p>
</blockquote>
<p><strong>How do developers feel about them?</strong> Given how popular trivial packages are, we might suspect that developers don’t consider them a problem. This is in sharp contrast to some viewpoints in <a href="how-to-break-an-api.html">How to Break an API</a>, where developers were wary of introducing new dependencies. This part of the study was conducted as a survey of 88 developers.</p>
<p>The reasons given for using trivial packages are:</p>
<ul>
<li>Trivial packages provide well implemented and tested code (48 respondents)</li>
<li>Use of trivial packages increases productivity (42 respondents)</li>
<li>Use of trivial packages outsources the maintenance burden for that code to the package authors (8 respondents)</li>
<li>Use of trivial packages helps readability and reduces complexity (8 respondents)</li>
<li>Use of a trivial package, over a large library or framework, improves application performance (3 respondents)</li>
</ul>
<p>Only 7 respondents said they saw no reason to use trivial packages.</p>
<p>The authors also asked for the drawbacks of using trivial packages. Now we get some viewpoints closer to How to Break an API. The drawbacks given are:</p>
<ul>
<li>The overhead of monitoring dependencies for updates (49 respondents)</li>
<li>The maintenance burden of breaking changes (16 respondents)</li>
<li>Decreased build performance, due to the overhead of fetching and building more dependencies (14 respondents)</li>
<li>Decreased developer performance, due to needing to read more documentation (11 respondents)</li>
<li>A missed learning opportunity: it’s easier to use a package to solve a problem than to figure it out yourself (8 respondents)</li>
<li>Potential security risks in third-party code (7 respondents)</li>
<li>Licensing issues (3 respondents)</li>
</ul>
<p>Only 7 respondents said they saw no drawbacks to using trivial packages.</p>
<p><strong>Are they well tested?</strong> Over half of the respondents said that a reason to use trivial packages is that the code is perceived to be well implemented and tested. But is that really the case?</p>
<blockquote>
<p>npm requires that developers provide a test script name with the submission of their packages (listed in the package.json file). In fact, 81.2% (31,521 out of 38,845) of the trivial packages in our dataset have some test script name listed. However, since developers can provide any script name under this field, it is dificult to know if a package is actually tested.</p>
</blockquote>
<p>So the authors turn to the <a href="https://npms.io/">npms</a> tool to collect metrics about the trivial packages in their dataset:</p>
<blockquote>
<p>We examine whether a package is really well tested and implemented from two aspects; first, we check if a package has tests written for it. Second, since in many cases, developers consider packages to be ‘deployment tested’, we also consider the usage of a package as an indicator of it being well tested and implemented. To carefully examine whether a package is really well tested and implemented, we use the npm online search tool (known as npms) to measure various metrics related to how well the packages are tested, used and valued. To provide its ranking of the packages, npms mines and calculates a number of metrics based on development (e.g., tests) and usage (e.g., no. of downloads) data.</p>
</blockquote>
<p>They used three npms metrics to evaluate how tested a package is:</p>
<ul>
<li>“Tests”, a weighted sum of the size of the tests, the coverage percentage, and the build status</li>
<li>“Community interest”, derived from popularity on GitHub</li>
<li>“Download count”, the number of downloads in the last three months</li>
</ul>
<p>The results are not so promising:</p>
<blockquote>
<p>As an initial step, we calculate the number of trivial packages that have a Tests value greater than zero, which means trivial packages that have some of tests. We find that only 45.2% of the trivial packages have tests, i.e., a Tests value &gt; 0.</p>
</blockquote>
<p>So much for well tested!</p>
<figure>
<img src="why-do-developers-use-trivial-packages/trivial_vs_nontrivial_metrics.png" alt="Distribution of Tests, Community Interest, and Download Count metrics" />
<figcaption aria-hidden="true">Distribution of Tests, Community Interest, and Download Count metrics</figcaption>
</figure>
<p>The authors also compare the metrics of trivial packages with nontrivial packages. We see that the distributions are similar, though nontrivial packages have a greater median, which could easily be due to the size and complexity difference. The authors find that the differences are statistically significant, but with small effect size.</p>
<p><strong>How much effort is needed to keep up with new releases?</strong> The most cited drawback for using trivial packages was the extra overhead of needing to keep everything up-to-date.</p>
<figure>
<img src="why-do-developers-use-trivial-packages/trivial_vs_nontrivial_releases.png" alt="Number of Releases for Trivial Packages Compared to Nontrivial Packages" />
<figcaption aria-hidden="true">Number of Releases for Trivial Packages Compared to Nontrivial Packages</figcaption>
</figure>
<p>There are a couple of ways to look at the impact of dependencies. Firstly, the authors compare the number of releases. Trivial packages tend to have fewer releases, so it seems that if you’re going to have a dependency, from a purely maintenance perspective, a trivial dependency is the better option.</p>
<blockquote>
<p>The fact that the trivial packages are updated less frequently may be attributed to the fact that trivial packages ‘perform less functionality’, hence they need to be updated less frequently</p>
</blockquote>
<figure>
<img src="why-do-developers-use-trivial-packages/trivial_vs_nontrivial_dependencies.png" alt="Distribution of Direct &amp; Indirect Dependencies for Trivial and Nontrivial Packages" />
<figcaption aria-hidden="true">Distribution of Direct &amp; Indirect Dependencies for Trivial and Nontrivial Packages</figcaption>
</figure>
<p>Next the authors consider how many dependencies (direct and indirect) trivial and nontrivial packages have. Introducing extra dependencies increases the complexity of the dependency chain, so all else being equal, we would prefer to have fewer dependencies.</p>
<p>The authors group packages into four categories by number of dependencies:</p>
<ul>
<li>0: 56.3% of trivial packages, 34.8% of nontrivial packages</li>
<li>1–10: 27.9% of trivial packages, 30.6% of nontrivial packages</li>
<li>11–20: 4.3% of trivial packages, 7.3% of nontrivial packages</li>
<li>More: 11.5% of trivial packages, 27.3% of nontrivial packages</li>
</ul>
<p>So developers should beware extra dependencies! Even though the source of a trivial package may be small, it may pull in many additional packages!</p>
<blockquote>
<p>Trivial packages have fewer releases and developers are less likely to be version locked than non-trivial packages. That said, developers should be careful when using trivial packages, since in some cases, trivial packages can have numerous dependencies. In fact, we find that 43.7% of trivial packages have at least one dependency and 11.5% of trivial packages have more than 20 dependencies.</p>
</blockquote>
<p><strong>The bottom line</strong> The final sentence of the paper is short, snappy, and neatly summarises all of what came before:</p>
<blockquote>
<p>Hence, developers should be careful about which trivial packages they use.</p>
</blockquote>
<p>It probably goes without saying, but I would apply this warning to <em>all</em> packages, trivial and nontrivial.</p>

      ]]>
    </summary>
  </entry>
  
  <entry>
    <title>How to Break an API</title>
    <link href="https://memo.barrucadu.co.uk/how-to-break-an-api.html" />
    <id>https://memo.barrucadu.co.uk/how-to-break-an-api.html</id>
    <published>2017-11-30T00:00:00Z</published>
    <updated>2017-11-30T00:00:00Z</updated>
    <summary type="html">
      <![CDATA[
<p>By <a href="http://chris.bogarthome.net/">Christopher Bogart</a>, <a href="https://www.cs.cmu.edu/~ckaestne/">Christian Kästner</a>, <a href="http://herbsleb.org/">James Herbsleb</a>, and <a href="https://sites.google.com/site/ferdianthung/">Ferdian Thung</a>.<br> In <em>Foundations of Software Engineering</em> (FSE). 2016.<br> <a href="https://dl.acm.org/citation.cfm?id=2950325">Paper</a> / <a href="http://www.cs.ucdavis.edu/fse2016/">Conference</a> / <a href="http://breakingapis.org/">Project</a></p>
<p>I’ve recently discovered the world of empirical studies of software engineering practices, and like what I see. The few papers I’ve read seem to confirm the conventional wisdom of what “everybody knows”, but it’s nice to see these thoughts backed up by data.</p>
<p>This study looks at three different ecosystems with different approaches to API breakage: the very stable <a href="https://marketplace.eclipse.org/">Eclipse Marketplace</a>, the consistent snapshot approach of <a href="https://cran.r-project.org/">CRAN</a>, and the semantic versioning approach of <a href="https://www.npmjs.com/">npm</a>. An ecosystem is more than a collection of packages, it’s also a group of people, with cultural norms about stability and change.</p>
<blockquote>
<p>How, when, and by whom changes are performed in an ecosystem with interdependent packages is subject to (often implicit) negotiation among diverse participants within the ecosystem. Each participant has their own priorities, habits and rhythms, often guided by community-specific values and policies, or even enforced or encouraged by tools. Ecosystems differ in, for example, to what degree they require consistency among packages, how they handle versioning, and whether there are central gatekeepers. Policies and tools are in part designed explicitly, but in part emerge from ad-hoc decisions or from values shared by community members. As a result, community practices may assign burdens of work in ways that create unanticipated conflicts or bottlenecks.</p>
</blockquote>
<p>The paper looks at the issue of API breakage from the perspective of both library authors (those doing the breaking) and library users (those who need to modify their code). The results come from a case study of 28 open source developers across the three ecosystems. This doesn’t seem like a lot, but that’s inevitable for survey papers.</p>
<p>Firstly we get an overview of the policies of each ecosystem. They’re very different:</p>
<blockquote>
<p>A core value of the Eclipse community is backward compatibility. This value is evident in many policies, such as “API Prime Directive: When evolving the Component API from release to release, do not break existing Clients”.</p>
</blockquote>
<blockquote>
<p>CRAN pursues snapshot consistency in which the newest version of every package should be compatible with the newest version of every other package in the repository. Older versions are “archived”: available in the repository, but harder to install. […] A core value of the R/CRAN community is to make it easy for end users to install and update packages.</p>
</blockquote>
<blockquote>
<p>A core value of the Node.js/npm community is to make it easy and fast for developers to publish and use packages. In addition, the community is open to rapid change. […] The focus on convenience for developers (instead of end users) was apparent in our interviews.</p>
</blockquote>
<p>Stability. Snapshot consistency. Ease of development. Nobody will use a library that breaks its API every week, but there is clearly a sliding scale of how much breakage is tolerated.</p>
<p>This paper was interesting to me because I’m most familiar with the <a href="https://hackage.haskell.org/">Hackage</a> and <a href="https://www.stackage.org/">Stackage</a> models, and it didn’t take long for me to see parallels between the Haskell world and other ecosystems. Hackage is more like npm, with the <a href="https://pvp.haskell.org/">PVP</a> in Haskell serving the role of <a href="https://semver.org/">semver</a> in npm; and Stackage is more like CRAN. The project website has some analysis of Hackage and Stackage, which I think lends credence to this:</p>
<blockquote>
<p>Stackage stands out as particularly valuing of compatibility; this is not too surprising since it was formed over as an alternative to Hackage with the specific goal to identify mutually compatible versions of packages to use together.</p>
</blockquote>
<p>The reasons for library authors to consider a breaking API change mostly line up with what I would have expected:</p>
<ul>
<li>Technical debt</li>
<li>Efficiency</li>
<li>Bugs</li>
</ul>
<p>Funnily enough, fixing bugs isn’t always a good thing for the users:</p>
<blockquote>
<p>Throughout our interviews, we heard many examples of how bug fixes effectively broke downstream packages, and the difficulty of knowing in advance which fixes would cause such problems. For example, R7 told us about reimplementing a standard string processing function, and finding that it broke the code of some downstream users that depended on bugs that his tests had not caught. R9 commented on the opportunity cost of not fixing a bug in deference to downstream users’ workarounds for it: “If the [downstream package] is implemented on the workaround for your bug, and then your fix actually breaks the workaround, then you sort of have to have a fallback… [pause] It gets nasty.”</p>
</blockquote>
<p>This puts me in mind of Microsoft, who are famous for never breaking backwards compatibility and just introducing new APIs when they have a better way of doing something. I wouldn’t want to maintain their behemoth of a codebase!</p>
<p>Library authors don’t like to break things for their users, but for CRAN package authors this is perhaps a greater concern than usual:</p>
<blockquote>
<p>Two interviewees (E1 and R4) specifically mentioned concern for downstream users’ scientific research (R4: “We’re improving the method, but results might change, so that’s also worrying — it makes it hard to do reproducible research”).</p>
</blockquote>
<p>But some library authors don’t care so much:</p>
<blockquote>
<p>Only a few developers were not particularly worried about breaking changes. Some (E6, N1, N5) had strong ties to their users and felt they could help them individually (N5: “We try to avoid breaking their code — but it’s easy to update their code”). Interviewee N6 expressed an “out of sight, out of mind” attitude: “Unfortunately, if someone suffers and then silently does not know how to reach me or contact me or something, yeah that’s bad but that suffering person is sort of [the tree] in the woods that falls and doesn’t make a sound.”</p>
</blockquote>
<p>It’s perhaps worth mentioning at this point that the “N” people are npm users. The attitude of N6 would be fairly typical of Hackage users too, I feel.</p>
<p>Now the paper crosses over to the other side, and looks at library users and how they react to dependency changes. It’s the same people as in the first survey, so these are library users who are also library authors. I wonder if a survey of people who are primarily application authors would be different here. There are three approaches to learning about new library releases:</p>
<ul>
<li>Actively monitoring dependencies. Most people don’t do this.</li>
<li>Having a general social awareness of the field, such as by following people on Twitter.</li>
<li>Reactively waiting for notifications. Most people do this.</li>
</ul>
<p>A common strategy for handling the constant barrage of library updates is to be more careful about what you depend on.</p>
<blockquote>
<p>Interviewee E5 represents a common view: “I only depend on things that are really worthwhile. Because basically everything that you depend on is going to give you pain every so often. And that’s inevitable.”</p>
</blockquote>
<p>Developers use a number of factors to decide if a dependency is worth it:</p>
<ul>
<li>How much they trust the authors</li>
<li>How actively developed it is</li>
<li>The size of its user base</li>
<li>What the authors’ historic approach to breakage has been</li>
</ul>
<p>The paper now mentions as surprising something which I completely expected:</p>
<blockquote>
<p>Interestingly, there was almost no mention of traditional encapsulation strategies to isolate the impact of changes to upstream modules, contra to our expectations and typical software-engineering teaching. Only N6 mentioned developing an abstraction layer between his package and an upstream dependency</p>
</blockquote>
<p>I don’t think I’ve seen a project introduce a layer of abstraction between a dependency and its use, except in cases where one of multiple dependencies will be used (like using one out of several database libraries, but providing a consistent interface). Maybe this would be a good idea sometimes, but I feel like in most situations it’s just adding extra complexity and maintenance burden for little benefit.</p>
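<p>For concreteness, here is a minimal sketch of what such an abstraction layer might look like in an npm package. Everything in it is hypothetical: the <code>upstream</code> object stands in for a third-party module, and <code>padLeft</code> is an invented wrapper name:</p>
<pre class="javascript"><code>// Hypothetical upstream dependency. In a real package this would be a
// third-party module loaded with require().
const upstream = {
  leftPad: (str, len, ch) =&gt; String(str).padStart(len, ch),
};

// The abstraction layer: the rest of the package calls padLeft and never
// touches `upstream` directly, so an upstream API change only needs to be
// handled in this one function.
function padLeft(text, width, fill = &#39; &#39;) {
  return upstream.leftPad(text, width, fill);
}

padLeft(&#39;42&#39;, 5, &#39;0&#39;); // &#39;00042&#39;</code></pre>
<p>The wrapper is tiny here, but it is one more piece of code to write, test, and keep in step with the dependency, which is the extra burden in question.</p>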
<p>The paper wraps up with some discussion of the tension between policies, values, and practice:</p>
<blockquote>
<p>For example there is a tension in Eclipse between the policy and practice of semantic versioning. Eclipse has a long-standing versioning policy similar to semantic versioning and the platform’s stability is reflected in the fact that many packages have not changed their major version number in over 10 years. However, even for the few cases of breaking changes that are clearly documented in the release notes, such as removing deprecated functions, major versions are often not increased, because, as E8 told us, updating a major version number can ripple version updates to downstream packages, and can entail significant work for the downstream projects.</p>
</blockquote>
<p>This is something I struggle with as a library user in Haskell: if I change the version bounds on one of my dependencies, how exactly does that translate into a version change for me? Sometimes it’s not so clear.</p>
<p>So, to conclude:</p>
<blockquote>
<p>How to break an API: In Eclipse, you don’t. In R/CRAN, you reach out to affected downstream developers. In Node.js/npm, you increase the major version number.</p>
</blockquote>

      ]]>
    </summary>
  </entry>
  
  <entry>
    <title>100 Prisoners</title>
    <link href="https://memo.barrucadu.co.uk/100-prisoners.html" />
    <id>https://memo.barrucadu.co.uk/100-prisoners.html</id>
    <published>2017-11-01T00:00:00Z</published>
    <updated>2017-11-01T00:00:00Z</updated>
    <summary type="html">
      <![CDATA[
<p>There’s a popular logic puzzle which goes something like this:</p>
<blockquote>
There are 100 prisoners in solitary cells. There’s a central living room with one light bulb; this bulb is initially off. No prisoner can see the light bulb from his or her own cell. Everyday, the warden picks a prisoner equally at random, and that prisoner visits the living room. While there, the prisoner can toggle the bulb if he or she wishes. Also, the prisoner has the option of asserting that all 100 prisoners have been to the living room by now. If this assertion is false, all 100 prisoners are shot. However, if it is indeed true, all prisoners are set free and inducted into MENSA, since the world could always use more smart people. Thus, the assertion should only be made if the prisoner is 100% certain of its validity. The prisoners are allowed to get together one night in the courtyard, to discuss a plan. What plan should they agree on, so that eventually, someone will make a correct assertion?
</blockquote>
<p>We can express this as a concurrency problem: the warden is the scheduler, each prisoner is a thread, and when the program terminates every prisoner should have visited the living room.</p>
<p>Let’s set up some imports:</p>
<pre class="haskell literate"><code>{-# LANGUAGE RankNTypes #-}

import qualified Control.Concurrent.Classy as C
import           Control.Monad             (forever, when)
import           Data.Foldable             (for_)
import           Data.List                 (genericLength)
import           Data.Maybe                (mapMaybe)
import qualified Data.Set                  as S
import qualified System.Random             as R
import qualified Test.DejaFu               as D
import qualified Test.DejaFu.Common        as D
import qualified Test.DejaFu.SCT           as D</code></pre>
<h2 id="correctness">Correctness</h2>
<p>Before we try to implement a solution, let’s think about how we can check if an execution corresponds to the prisoners succeeding and entering MENSA, or failing and being shot.</p>
<p>Prisoners are threads, and the warden is the scheduler. So if every thread (prisoner) that is forked is scheduled (taken to the room), then the prisoners are successful:</p>
<pre class="haskell literate"><code>-- | Check if an execution corresponds to a correct guess.
isCorrect :: D.Trace -&gt; Bool
isCorrect trc = S.fromList (threads trc) == S.fromList (visits trc)

-- | Get all threads created.
threads :: D.Trace -&gt; [D.ThreadId]
threads trc = D.initialThread : mapMaybe go trc where
  go (_, _, D.Fork tid) = Just tid
  go _ = Nothing

-- | Get all scheduled threads
visits :: D.Trace -&gt; [D.ThreadId]
visits = mapMaybe go where
  go (D.Start    tid, _, _) = Just tid
  go (D.SwitchTo tid, _, _) = Just tid
  go _ = Nothing</code></pre>
<p>So now, given some way of setting up the game and running it to completion, we can test it and print some statistics:</p>
<pre class="haskell literate"><code>-- | Run the prison game and print statistics.
run :: D.Way -&gt; (forall m. C.MonadConc m =&gt; m ()) -&gt; IO ()
run way game = do
    traces &lt;- map snd &lt;$&gt; D.runSCT way D.defaultMemType game
    let successes = filter isCorrect traces
    let failures  = filter (not . isCorrect) traces
    putStrLn (show (length traces)    ++ &quot; total attempts&quot;)
    putStrLn (show (length successes) ++ &quot; successes&quot;)
    putStrLn (show (length failures)  ++ &quot; failures&quot;)
    putStrLn (show (avgvisits successes) ++ &quot; average number of room visits per success&quot;)
    putStrLn (show (avgvisits failures)  ++ &quot; average number of room visits per failure&quot;)
    putStrLn &quot;Sample sequences of visits:&quot;
    for_ (take 5 traces) (print . visits)
  where
    avgvisits ts = sum (map (fromIntegral . numvisits) ts) / genericLength ts
    numvisits = sum . map count where
      count (_, _, D.STM _ _) = 1
      count (_, _, D.BlockedSTM _) = 1
      count (_, _, D.Yield) = 1
      count _ = 0</code></pre>
<p>I have decided to assume that a prisoner will either yield (doing nothing) or perform some STM transaction while they’re in the room, to simplify things.</p>
<h2 id="the-perfect-solution">The Perfect Solution</h2>
<p>A slow but simple strategy is for the prisoners to nominate a leader. Only the leader can declare to the warden that everyone has visited the room. Whenever a prisoner other than the leader visits the room and finds the light <em>off</em>, they turn it on if this is their first time in the room with the light off; in every other case they leave it alone. Whenever the leader enters the room and finds the light on, they turn it off. When the leader has turned the light off 99 times (or <code>num_prisoners - 1</code> times), they tell the warden that everyone has visited.</p>
<p>Let’s set up those algorithms:</p>
<pre class="haskell literate"><code>-- | The state of the light bulb.
data Light = IsOn | IsOff

-- | Count how many prisoners have toggled the light and terminate
-- when everyone has.
leader :: C.MonadConc m =&gt; Int -&gt; C.TVar (C.STM m) Light -&gt; m ()
leader prisoners light = go 0 where
  go counter = do
    counter&#39; &lt;- C.atomically $ do
      state &lt;- C.readTVar light
      case state of
        IsOn -&gt; do
          C.writeTVar light IsOff
          pure (counter + 1)
        IsOff -&gt; C.retry
    when (counter&#39; &lt; prisoners - 1)
      (go counter&#39;)

-- | Turn the light on once then do nothing.
notLeader :: C.MonadConc m =&gt; C.TVar (C.STM m) Light -&gt; m ()
notLeader light = do
  C.atomically $ do
    state &lt;- C.readTVar light
    case state of
      IsOn  -&gt; C.retry
      IsOff -&gt; C.writeTVar light IsOn
  forever C.yield</code></pre>
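<p>To see the counting protocol on its own, here is a minimal pure sketch that replays a fixed list of room visits instead of using threads. It is not part of the dejafu program: <code>Room</code>, <code>visit</code>, and <code>daysUntilDeclared</code> are hypothetical names, prisoner <code>0</code> is assumed to be the leader, and the schedule is simply a list of prisoner numbers.</p>
<pre class="haskell"><code>import qualified Data.Set as S

data Light = IsOn | IsOff deriving (Eq, Show)

-- | The light, the non-leaders who have already turned it on, and the
-- number of times the leader has turned it off.
data Room = Room
  { roomLight :: Light
  , roomLit   :: S.Set Int
  , roomCount :: Int
  }

-- | One visit to the room.  Prisoner 0 is the leader (an assumption of
-- this sketch).
visit :: Room -&gt; Int -&gt; Room
visit r 0 = case roomLight r of
  IsOn  -&gt; r { roomLight = IsOff, roomCount = roomCount r + 1 }
  IsOff -&gt; r
visit r p
  | roomLight r == IsOff &amp;&amp; p `S.notMember` roomLit r =
      r { roomLight = IsOn, roomLit = S.insert p (roomLit r) }
  | otherwise = r

-- | The day on which the leader can declare, if the schedule gets that far.
daysUntilDeclared :: Int -&gt; [Int] -&gt; Maybe Int
daysUntilDeclared n = go (Room IsOff S.empty 0) . zip [1 ..]
  where
    go :: Room -&gt; [(Int, Int)] -&gt; Maybe Int
    go _ [] = Nothing
    go r ((day, p) : rest)
      | roomCount r&#39; &gt;= n - 1 = Just day
      | otherwise             = go r&#39; rest
      where r&#39; = visit r p</code></pre>
<p>With three prisoners, the schedule <code>[1,0,2,0]</code> lets the leader declare on day 4, while <code>[1,2,0,2,0]</code> takes until day 5 because prisoner 2’s first visit finds the light already on and is wasted.</p>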
<p>So now we just need to create a program where the leader is the main thread and everyone else is a separate thread:</p>
<pre class="haskell literate"><code>-- | Most popular English male and female names, according to
-- Wikipedia.
name :: Int -&gt; String
name i = ns !! (i `mod` length ns) where
  ns = [&quot;Oliver&quot;, &quot;Olivia&quot;, &quot;George&quot;, &quot;Amelia&quot;, &quot;Harry&quot;, &quot;Emily&quot;]

-- | Set up the prison game.  The number of prisoners should be at
-- least 1.
prison :: C.MonadConc m =&gt; Int -&gt; m ()
prison prisoners = do
  light &lt;- C.atomically (C.newTVar IsOff)
  for_ [1..prisoners-1] (\i -&gt; C.forkN (name i) (notLeader light))
  leader prisoners light</code></pre>
<p>Because these are people, not just threads, I’ve given them names. The leader is just called “main” though, how unfortunate for them.</p>
<h3 id="testing">
Testing
</h3>
<p>Now we can try out our system and see if it works:</p>
<pre><code>λ&gt; let runS = run $ D.systematically (D.defaultBounds { D.boundPreemp = Nothing })
λ&gt; runS 1
1 total attempts
1 successes
0 failures
2.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main]

λ&gt; runS 2
5 total attempts
5 successes
0 failures
7.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main,Olivia,main]
[main,Olivia,main,Olivia,main]
[main,Olivia,main,Olivia,main]
[main,Olivia,main,Olivia,main]
[main,Olivia,main]

λ&gt; runS 3
2035 total attempts
2035 successes
0 failures
133.39066339066338 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
(big lists omitted)</code></pre>
<p>This doesn’t scale well. It’s actually a really bad case for concurrency testing: every thread is messing with the same shared state, so dejafu has to try all the orderings. Not good.</p>
<p>Taking another look at our prisoners, we can see two things which a human would use to decide whether some schedules are redundant or not:</p>
<ol type="1">
<li><p>If we adopt any schedule other than alternating leader / non-leader, threads will block without doing anything. So we should alternate.</p></li>
<li><p>When a non-leader has completed their task, they will always yield. So we should never schedule a prisoner who will yield.</p></li>
</ol>
<p>Unfortunately dejafu can’t really make use of (1). It could be inferred <em>if</em> dejafu was able to compare values inside <code>TVar</code>s, rather than just seeing that there had been a write. But Haskell doesn’t let us do that without slapping an <code>Eq</code> constraint on <code>writeTVar</code>, which I definitely don’t want to do (although maybe having a separate <code>eqwriteTVar</code>, <code>eqputMVar</code>, and so on would be a nice addition).</p>
<p>Fortunately, dejafu <em>can</em> do something with (2). It already bounds the maximum number of times a thread can yield, so that we can test constructs like spinlocks. This is called <em>fair bounding</em>. The default bound is 5, but if we set it to 0 dejafu will just never schedule a thread which is going to yield. Here we go:</p>
<pre><code>λ&gt; let runS = run $ D.systematically (D.defaultBounds { D.boundPreemp = Nothing, D.boundFair = Just 0 })
λ&gt; runS 1
1 total attempts
1 successes
0 failures
2.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main]

λ&gt; runS 2
1 total attempts
1 successes
0 failures
4.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main]

λ&gt; runS 3
4 total attempts
4 successes
0 failures
7.5 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main,George,main]
[main,Olivia,George,main,George,main]
[main,George,main,Olivia,main]
[main,George,Olivia,main,Olivia,main]</code></pre>
<p>Much better! Although it still doesn’t scale as nicely as we’d like:</p>
<pre><code>λ&gt; runS 4
48 total attempts
48 successes
0 failures
11.5 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main,George,main,Amelia,main]
[main,Olivia,main,George,Amelia,main,Amelia,main]
[main,Olivia,main,Amelia,main,George,main]
[main,Olivia,main,Amelia,George,main,George,main]
[main,Olivia,George,main,George,main,Amelia,main]

λ&gt; runS 5
1536 total attempts
1536 successes
0 failures
16.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main,George,main,Amelia,main,Harry,main]
[main,Olivia,main,George,main,Amelia,Harry,main,Harry,main]
[main,Olivia,main,George,main,Harry,main,Amelia,main]
[main,Olivia,main,George,main,Harry,Amelia,main,Amelia,main]
[main,Olivia,main,George,Amelia,main,Amelia,main,Harry,main]

λ&gt; runS 6
122880 total attempts
122880 successes
0 failures
21.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main,George,main,Amelia,main,Harry,main,Emily,main]
[main,Olivia,main,George,main,Amelia,main,Harry,Emily,main,Emily,main]
[main,Olivia,main,George,main,Amelia,main,Emily,main,Harry,main]
[main,Olivia,main,George,main,Amelia,main,Emily,Harry,main,Harry,main]
[main,Olivia,main,George,main,Amelia,Harry,main,Harry,main,Emily,main]</code></pre>
<p>The prisoners are stepping on each other’s toes and causing needless work. This is probably as good as we can do without adding some extra primitives to dejafu to optimise the case where we have an <code>Eq</code> instance available, unfortunately.</p>
<h3 id="a-silver-lining">
A Silver Lining
</h3>
<p>In concurrency testing terms, six threads is actually quite a lot.</p>
<p><a href="http://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2014/PPoPP.pdf">Empirical studies</a> have found that many concurrency bugs can be exhibited with only two or three threads! Furthermore, most real-world concurrent programs don’t have every single thread operating on the same bit of shared state.</p>
<h2 id="the-good-enough-solution">The “Good-Enough” Solution</h2>
<p>There’s another school of thought which says to just wait for three years, because by then it’s very unlikely that any single prisoner has never visited the room. In fact, we would expect each prisoner to have been to the room about ten times by then, assuming the warden is fair.</p>
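<p>As a rough sanity check on that claim, here is a small sketch of the arithmetic. It assumes the warden picks uniformly and independently each day and that three years is 1095 days; <code>neverVisitedBound</code> and <code>expectedVisits</code> are hypothetical names, and the first is only a union bound (an upper estimate), not the exact probability.</p>
<pre class="haskell"><code>-- | Upper bound on the probability that at least one of @n@ prisoners
-- has never been picked after @days@ days: each prisoner is missed on
-- every day with probability (1 - 1/n)^days, and there are n prisoners.
neverVisitedBound :: Int -&gt; Int -&gt; Double
neverVisitedBound n days =
  fromIntegral n * (1 - 1 / fromIntegral n) ^ days

-- | Expected number of visits per prisoner after @days@ days.
expectedVisits :: Int -&gt; Int -&gt; Double
expectedVisits n days = fromIntegral days / fromIntegral n</code></pre>
<p>For 100 prisoners and 1095 days, <code>expectedVisits</code> comes out just under eleven, and <code>neverVisitedBound</code> is well under one percent.</p>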
<p>By keeping track of how many days have passed, we can try this out as well:</p>
<pre class="haskell literate"><code>leader :: C.MonadConc m =&gt; Int -&gt; C.TVar (C.STM m) Int -&gt; m ()
leader prisoners days = C.atomically $ do
  numDays &lt;- C.readTVar days
  C.check (numDays &gt;= (prisoners - 1) * 10)

notLeader :: C.MonadConc m =&gt; C.TVar (C.STM m) Int -&gt; m ()
notLeader days = forever . C.atomically $ C.modifyTVar days (+1)

prison :: C.MonadConc m =&gt; Int -&gt; m ()
prison prisoners = do
  days &lt;- C.atomically (C.newTVar 0)
  for_ [1..prisoners-1] (\i -&gt; C.forkN (name i) (notLeader days))
  leader prisoners days</code></pre>
<p>Now let’s see how these brave prisoners do (sample visit sequences omitted because they’re pretty long):</p>
<pre><code>λ&gt; let runR = run $ D.uniformly (R.mkStdGen 0) 100
λ&gt; runR 1
100 total attempts
100 successes
0 failures
2.0 average number of room visits per success
NaN average number of room visits per failure

λ&gt; runR 2
100 total attempts
100 successes
0 failures
18.35 average number of room visits per success
NaN average number of room visits per failure

λ&gt; runR 3
100 total attempts
100 successes
0 failures
31.92 average number of room visits per success
NaN average number of room visits per failure

λ&gt; runR 4
100 total attempts
100 successes
0 failures
43.52 average number of room visits per success
NaN average number of room visits per failure

λ&gt; runR 5
100 total attempts
100 successes
0 failures
55.88 average number of room visits per success
NaN average number of room visits per failure

λ&gt; runR 6
100 total attempts
100 successes
0 failures
67.37 average number of room visits per success
NaN average number of room visits per failure

λ&gt; runR 7
100 total attempts
100 successes
0 failures
77.05 average number of room visits per success
NaN average number of room visits per failure

λ&gt; runR 8
100 total attempts
99 successes
1 failures
90.4040404040404 average number of room visits per success
81.0 average number of room visits per failure

λ&gt; runR 9
100 total attempts
100 successes
0 failures
101.64 average number of room visits per success
NaN average number of room visits per failure

λ&gt; runR 10
100 total attempts
100 successes
0 failures
114.89 average number of room visits per success
NaN average number of room visits per failure</code></pre>
<p>Not bad at all! Although my puny VPS still can’t manage all 100.</p>

      ]]>
    </summary>
  </entry>
  
  <entry>
    <title>Writing a Concurrency Testing Library (Part 2): Exceptions</title>
    <link href="https://memo.barrucadu.co.uk/minifu-02.html" />
    <id>https://memo.barrucadu.co.uk/minifu-02.html</id>
    <published>2017-10-28T00:00:00Z</published>
    <updated>2017-10-28T00:00:00Z</updated>
    <summary type="html">
      <![CDATA[
<p>Welcome back to my series on implementing a concurrency testing library for Haskell. This is part 2 of the series, and today we’ll implement exceptions. If you missed part 1, you can read it <a href="minifu-01.html">here</a>.</p>
<p>As before, all code is available on <a href="https://github.com/barrucadu/minifu">GitHub</a>. The code for this post is under the “post-02” tag.</p>
<hr />
<p>Did you do last time’s homework task? It was to implement this interface:</p>
<pre class="haskell"><code>data CRef m a = -- ...

newCRef :: a -&gt; MiniFu m (CRef m a)

readCRef :: CRef m a -&gt; MiniFu m a

writeCRef :: CRef m a -&gt; a -&gt; MiniFu m ()

atomicModifyCRef :: CRef m a -&gt; (a -&gt; (a, b)) -&gt; MiniFu m b</code></pre>
<p>Here are my solutions, available at the “homework-01” tag:</p>
<ol type="1">
<li>(<a href="https://github.com/barrucadu/minifu/commit/2070bdfaf5174fc14f6835d8410988cf111a854a"><code>2070bdf</code></a>) Add the <code>CRef</code> type, the <code>PrimOp</code> constructors, and the wrapper functions</li>
<li>(<a href="https://github.com/barrucadu/minifu/commit/188eec562f619c26fe117dd891ff86befc27b5a2"><code>188eec5</code></a>) Implement the primops</li>
</ol>
<p>I also made some changes, available at the “pre-02” tag:</p>
<ol type="1">
<li>(<a href="https://github.com/barrucadu/minifu/commit/7ce6e41f8bdc60c73affa00f7760a46a7e6ecfc3"><code>7ce6e41</code></a>) Add a helper for primops which don’t create any identifiers</li>
<li>(<a href="https://github.com/barrucadu/minifu/commit/24197965787555c5552ce8cb70fcb078016a167c"><code>2419796</code></a>) Move some definitions into an internal module</li>
<li>(<a href="https://github.com/barrucadu/minifu/commit/9c49f9d76f27ce0fa1ed445c34d9107105e66171"><code>9c49f9d</code></a>) Change the type of the <code>block</code> helper to <code>MVarId -&gt; Threads m -&gt; Threads m</code></li>
<li>(<a href="https://github.com/barrucadu/minifu/commit/dabd84b1ed4f713889b607b142ecb2d1987ee804"><code>dabd84b</code></a>) Implement <code>readMVar</code></li>
</ol>
<p>Now on to the show…</p>
<h2 id="synchronous-exceptions">Synchronous exceptions</h2>
<p>We can’t implement exceptions with what we have already. We’re going to need some new primops. I think you’re getting a feel for how this works now, so I won’t drag this out. Here we go:</p>
<pre class="haskell"><code>import qualified Control.Exception as E

data PrimOp m where
  -- ...
  Throw :: E.Exception e =&gt; e -&gt; PrimOp m
  Catch :: E.Exception e =&gt; MiniFu m a -&gt; (e -&gt; MiniFu m a) -&gt; (a -&gt; PrimOp m)
        -&gt; PrimOp m
  PopH  :: PrimOp m -&gt; PrimOp m

throw :: E.Exception e =&gt; e -&gt; MiniFu m a
throw e = MiniFu (K.cont (\_ -&gt; Throw e))

catch :: E.Exception e =&gt; MiniFu m a -&gt; (e -&gt; MiniFu m a) -&gt; MiniFu m a
catch act h = MiniFu (K.cont (Catch act h))</code></pre>
<p>Throwing an exception with <code>throw</code> jumps back to the closest enclosing <code>catch</code> with an exception handler of the appropriate type, killing the thread if there is none. The <code>PopH</code> primop will pop the top exception handler from the stack. We’ll insert those as appropriate when entering a <code>catch</code>.</p>
<p>Before we can actually implement these primops, we need to give threads a place to store their exception handlers. You might have guessed it when I said “stack”: we’ll just give every thread a list of them. This requires changing our <code>Thread</code> type and <code>thread</code> function:</p>
<pre class="haskell"><code>data Thread m = Thread
  { threadK     :: PrimOp m
  , threadBlock :: Maybe MVarId
  , threadExc   :: [Handler m]                              -- &lt;- new
  }

data Handler m where
  Handler :: E.Exception e =&gt; (e -&gt; PrimOp m) -&gt; Handler m

thread :: PrimOp m -&gt; Thread m
thread k = Thread
  { threadK     = k
  , threadBlock = Nothing
  , threadExc   = []                                        -- &lt;- new
  }</code></pre>
<p>As <code>Exception</code> is a subclass of <code>Typeable</code>, given some exception value we’re able to look for the first matching handler:</p>
<pre class="haskell"><code>raise :: E.Exception e =&gt; e -&gt; Thread m -&gt; Maybe (Thread m)
raise exc thrd = go (threadExc thrd) where
  go (Handler h:hs) = case h &lt;$&gt; E.fromException exc&#39; of
    Just pop -&gt; Just (thrd { threadK = pop, threadBlock = Nothing, threadExc = hs })
    Nothing  -&gt; go hs
  go [] = Nothing

  exc&#39; = E.toException exc</code></pre>
<p>If <code>raise</code> returns a <code>Just</code>, then a handler was found and entered. Otherwise, no handler exists and the thread should be removed from the <code>Threads</code> collection. This can be expressed rather nicely as <code>M.update . raise</code>.</p>
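<p>The type-directed lookup can be tried outside MiniFu. The following is a self-contained sketch under simplifying assumptions: handlers return a <code>String</code> instead of a <code>PrimOp</code>, and <code>dispatch</code> and <code>handlers</code> are hypothetical names. It returns the result of the first matching handler together with how many handlers remain below it on the stack.</p>
<pre class="haskell"><code>{-# LANGUAGE GADTs #-}

import qualified Control.Exception as E

-- | A handler for some exception type, hidden behind an existential.
data Handler where
  Handler :: E.Exception e =&gt; (e -&gt; String) -&gt; Handler

-- | Walk the stack from the top and run the first handler whose
-- exception type matches, discarding it and everything above it.
dispatch :: E.Exception e =&gt; e -&gt; [Handler] -&gt; Maybe (String, Int)
dispatch exc = go
  where
    exc&#39; = E.toException exc

    go :: [Handler] -&gt; Maybe (String, Int)
    go (Handler h : hs) = case h &lt;$&gt; E.fromException exc&#39; of
      Just out -&gt; Just (out, length hs)
      Nothing  -&gt; go hs
    go [] = Nothing

-- | An example stack: a specific handler on top of a catch-all.
handlers :: [Handler]
handlers =
  [ Handler (\e -&gt; &quot;arith: &quot; ++ show (e :: E.ArithException))
  , Handler (\e -&gt; &quot;any: &quot; ++ show (e :: E.SomeException))
  ]</code></pre>
<p>Here <code>dispatch E.DivideByZero handlers</code> is picked up by the first handler and leaves one below it, whereas <code>dispatch E.NonTermination handlers</code> falls through to the catch-all and leaves none.</p>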
<p>Now we have enough support to implement the primops:</p>
<pre class="haskell"><code>stepThread {- ... -}
  where
    -- ...
    go (Throw e) =
      simple (M.update (raise e) tid)
    go (Catch (MiniFu ma) h k) = simple . adjust $ \thrd -&gt; thrd
      { threadK   = K.runCont ma (PopH . k)
      , threadExc =
        let h&#39; exc = K.runCont (runMiniFu (h exc)) k
        in Handler h&#39; : threadExc thrd
      }
    go (PopH k) = simple . adjust $ \thrd -&gt; thrd
      { threadK   = k
      , threadExc = tail (threadExc thrd)
      }</code></pre>
<p>Let’s break that down:</p>
<ul>
<li><code>Throw</code> just re-uses our <code>raise</code> function to either jump to the exception handler or kill the thread.</li>
<li><code>Catch</code> changes the continuation of the thread to run the enclosed action, then do a <code>PopH</code> action, then run the outer action. It also adds an exception continuation, which just runs the exception handler, then runs the outer action.</li>
<li><code>PopH</code> just removes the head exception continuation.</li>
</ul>
<p>It’s important that the exception continuation <em>doesn’t</em> use <code>PopH</code> to remove itself: that happens in <code>raise</code> when an exception is thrown. When writing this section I realised I’d made that mistake in dejafu (<a href="https://github.com/barrucadu/dejafu/issues/139">#139</a>)!</p>
<h3 id="so-what">So what?</h3>
<p>So now we can use synchronous exceptions! Here’s an incredibly contrived example:</p>
<pre class="haskell"><code>{-# LANGUAGE ScopedTypeVariables #-}

import Control.Monad (join)

example_sync :: MiniFu m Int
example_sync = do
  a &lt;- newEmptyMVar
  fork (putMVar a (pure 1))
  fork (putMVar a (throw E.NonTermination))
  fork (putMVar a (throw E.AllocationLimitExceeded))
  catch
    (catch
      (join (readMVar a))
      (\(_ :: E.AllocationLimitExceeded) -&gt; pure 2))
    (\(_ :: E.NonTermination) -&gt; pure 3)

demo_sync :: IO ()
demo_sync = do
  g &lt;- R.newStdGen
  print . fst =&lt;&lt; minifu randomSched g example_sync</code></pre>
<p>If we run this a few times in ghci, we can see the different exceptions being thrown and caught (resulting in different outputs):</p>
<pre><code>λ&gt; demo_sync
Just 1
λ&gt; demo_sync
Just 3
λ&gt; demo_sync
Just 3
λ&gt; demo_sync
Just 2</code></pre>
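<p>To see which handler is responsible for which result without the scheduling race, here is a sketch of the same nested <code>catch</code> in plain <code>IO</code>, using the real <code>Control.Exception</code> and <code>MVar</code> functions rather than MiniFu. The name <code>nested</code> is hypothetical, and the action stored in the <code>MVar</code> is passed in directly instead of being chosen by racing threads.</p>
<pre class="haskell"><code>{-# LANGUAGE ScopedTypeVariables #-}

import           Control.Concurrent.MVar (newMVar, readMVar)
import qualified Control.Exception       as E
import           Control.Monad           (join)

-- | Store an action in an &#39;MVar&#39;, then read and run it inside the same
-- pair of handlers as the MiniFu example.
nested :: IO Int -&gt; IO Int
nested act = do
  a &lt;- newMVar act
  E.catch
    (E.catch
      (join (readMVar a))
      (\(_ :: E.AllocationLimitExceeded) -&gt; pure 2))
    (\(_ :: E.NonTermination) -&gt; pure 3)</code></pre>
<p>Running <code>nested (pure 1)</code> gives 1, <code>nested (E.throwIO E.AllocationLimitExceeded)</code> is caught by the inner handler and gives 2, and <code>nested (E.throwIO E.NonTermination)</code> passes the inner handler and is caught by the outer one, giving 3.</p>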
<h3 id="monadthrow-and-monadcatch">MonadThrow and MonadCatch</h3>
<p><code>MonadConc</code> has a bunch of superclasses, and we can now implement two of them!</p>
<pre class="haskell"><code>import qualified Control.Monad.Catch as EM

instance EM.MonadThrow (MiniFu m) where
  throwM = -- &#39;throw&#39; from above

instance EM.MonadCatch (MiniFu m) where
  catch = -- &#39;catch&#39; from above</code></pre>
<p>The <a href="https://hackage.haskell.org/package/exceptions">exceptions</a> package provides the <code>MonadThrow</code>, <code>MonadCatch</code>, and <code>MonadMask</code> typeclasses, so we can talk about exceptions in a wider context than just <code>IO</code>. We’ll get on to <code>MonadMask</code> when we look at asynchronous exceptions.</p>
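<p>As a small illustration of that wider context, here is a sketch of one function written against <code>MonadThrow</code> and then used in both <code>Maybe</code> and <code>IO</code>. It needs the exceptions package; <code>safeDiv</code>, <code>inMaybe</code>, and <code>inIO</code> are hypothetical names.</p>
<pre class="haskell"><code>import qualified Control.Exception   as E
import qualified Control.Monad.Catch as EM

-- | Division which reports failure through whichever monad the caller picks.
safeDiv :: EM.MonadThrow m =&gt; Int -&gt; Int -&gt; m Int
safeDiv _ 0 = EM.throwM E.DivideByZero
safeDiv x y = pure (x `div` y)

-- | In &#39;Maybe&#39;, throwing collapses to &#39;Nothing&#39;.
inMaybe :: Maybe Int
inMaybe = safeDiv 1 0

-- | In &#39;IO&#39;, throwing raises a real exception, which &#39;EM.try&#39; catches.
inIO :: IO (Either E.ArithException Int)
inIO = EM.try (safeDiv 1 0)</code></pre>
<p>A <code>MiniFu</code> instance slots into the same scheme: code written against the class neither knows nor cares that it is being run under a test scheduler.</p>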
<h3 id="incompleteness">Incompleteness!</h3>
<p>It is with exceptions that we hit the first thing we can’t do in MiniFu.</p>
<p>When in <code>IO</code>, we can catch exceptions from pure code:</p>
<pre><code>λ&gt; import Control.Exception
λ&gt; evaluate undefined `catch` \e -&gt; putStrLn (&quot;Got &quot; ++ show (e :: SomeException))
Got Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
  undefined, called at &lt;interactive&gt;:5:10 in interactive:Ghci2</code></pre>
<p>But we can’t do that in <code>MiniFu</code>, as there’s no suitable <code>evaluate</code> function.</p>
<p>Should there be an <code>evaluate</code> in the <code>MonadConc</code> class? I’m unconvinced, as it’s not really a <em>concurrency</em> operation.</p>
<p>Should we constrain the <code>m</code> in <code>MiniFu m</code> to be a <code>MonadIO</code>, which would let us call <code>evaluate</code>? Perhaps, that would certainly be a way to do it, and I’m currently investigating the advantages of an <code>IO</code> base monad for dejafu (although originally for a different reason).</p>
<h2 id="asynchronous-exceptions">Asynchronous exceptions</h2>
<p>Asynchronous exceptions are like synchronous exceptions, except for two details:</p>
<ol type="1">
<li>They are thrown to a thread identified by <code>ThreadId</code>. We can do this already with <code>raise</code>.</li>
<li>Raising the exception may be blocked due to the target thread’s <em>masking state</em>. We need to do some extra work to implement this.</li>
</ol>
<p>When a thread is masked, attempting to deliver an asynchronous exception to it will block. There are three masking states:</p>
<ul>
<li><code>Unmasked</code>, asynchronous exceptions are unmasked.</li>
<li><code>MaskedInterruptible</code>, asynchronous exceptions are masked, but blocked operations may still be interrupted.</li>
<li><code>MaskedUninterruptible</code>, asynchronous exceptions are masked, and blocked operations may not be interrupted.</li>
</ul>
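<p>These are the same three states GHC itself uses, and they can be observed directly in <code>IO</code>. The following sketch uses only functions from <code>Control.Exception</code>; <code>maskingStates</code> is a hypothetical name.</p>
<pre class="haskell"><code>import qualified Control.Exception as E

-- | Record the masking state in four contexts: outside any mask, inside
-- &#39;mask_&#39;, inside &#39;uninterruptibleMask_&#39;, and inside the restore
-- callback that &#39;mask&#39; provides.
maskingStates :: IO [E.MaskingState]
maskingStates = do
  outside  &lt;- E.getMaskingState
  masked   &lt;- E.mask_ E.getMaskingState
  unint    &lt;- E.uninterruptibleMask_ E.getMaskingState
  restored &lt;- E.mask (\restore -&gt; restore E.getMaskingState)
  pure [outside, masked, unint, restored]</code></pre>
<p>Run from an ordinary unmasked thread, the list should read <code>Unmasked</code>, <code>MaskedInterruptible</code>, <code>MaskedUninterruptible</code>, <code>Unmasked</code>: the restore callback puts back whatever state was in effect before the mask.</p>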
<p>So we’ll add the current masking state to our <code>Thread</code> type, defaulting to <code>Unmasked</code>, and also account for blocking on another thread:</p>
<pre class="haskell"><code>data Thread m = Thread
  { threadK     :: PrimOp m
  , threadBlock :: Maybe (Either ThreadId MVarId)           -- &lt;- new
  , threadExc   :: [Handler m]
  , threadMask  :: E.MaskingState                           -- &lt;- new
  }

thread :: PrimOp m -&gt; Thread m
thread k = Thread
  { threadK     = k
  , threadBlock = Nothing
  , threadExc   = []
  , threadMask  = E.Unmasked                                -- &lt;- new
  }</code></pre>
<p>We’ll also need a primop to set the masking state:</p>
<pre class="haskell"><code>data PrimOp m where
  -- ...
  Mask :: E.MaskingState -&gt; PrimOp m -&gt; PrimOp m</code></pre>
<p>Which has a fairly straightforward implementation:</p>
<pre class="haskell"><code>stepThread {- ... -}
  where
    -- ...
    go (Mask ms k) = simple . adjust $ \thrd -&gt; thrd
      { threadK    = k
      , threadMask = ms
      }</code></pre>
<p>Finally, we need to make sure that if an exception is raised, and we jump into an exception handler, the masking state gets reset to what it was when the handler was created. This means we need a small change to the <code>Catch</code> primop:</p>
<pre class="haskell"><code>stepThread {- ... -}
  where
    -- ...
    go (Catch (MiniFu ma) h k) = simple . adjust $ \thrd -&gt; thrd
      { threadK   = K.runCont ma (PopH . k)
      , threadExc =
        let ms0 = threadMask thrd                           -- &lt;- new
            h&#39; exc = flip K.runCont k $ do
              K.cont (\c -&gt; Mask ms0 (c ()))                -- &lt;- new
              runMiniFu (h exc)
        in Handler h&#39; : threadExc thrd
      }</code></pre>
<p>Alright, now we have enough background to actually implement the user-facing operations.</p>
<h3 id="throwing">Throwing</h3>
<p>To throw an asynchronous exception, we’re going to need a new primop:</p>
<pre class="haskell"><code>data PrimOp m where
  -- ...
  ThrowTo :: E.Exception e =&gt; ThreadId -&gt; e -&gt; PrimOp m -&gt; PrimOp m</code></pre>
<p>Which has a corresponding wrapper function:</p>
<pre class="haskell"><code>throwTo :: E.Exception e =&gt; ThreadId -&gt; e -&gt; MiniFu m ()
throwTo tid e = MiniFu (K.cont (\k -&gt; ThrowTo tid e (k ())))</code></pre>
<p>Let’s think about the implementation of the <code>ThrowTo</code> primop. It first needs to check if the target thread is interruptible and, if so, raises the exception in that thread; if not, it blocks the current thread. A thread is interruptible if its masking state is <code>Unmasked</code>, or <code>MaskedInterruptible</code> and it’s currently blocked.</p>
<p>Let’s encapsulate that logic:</p>
<pre class="haskell"><code>import Data.Maybe (isJust)

isInterruptible :: Thread m -&gt; Bool
isInterruptible thrd =
  threadMask thrd == E.Unmasked ||
  (threadMask thrd == E.MaskedInterruptible &amp;&amp; isJust (threadBlock thrd))</code></pre>
<p>Given that, the implementation of <code>ThrowTo</code> is straightforward:</p>
<pre class="haskell"><code>stepThread {- ... -}
  where
    -- ...
    go (ThrowTo threadid e k) = simple $ case M.lookup threadid threads of
      Just t
        | isInterruptible t -&gt; goto k . M.update (raise e) threadid
        | otherwise         -&gt; block (Left threadid)
      Nothing -&gt; goto k</code></pre>
<p>First, check if the thread exists. Then check if it’s interruptible: if it is, raise the exception, otherwise block. If the thread doesn’t exist any more, just continue.</p>
<p>Now we just need to handle <em>unblocking</em> threads which are blocked in <code>ThrowTo</code>. For that, we’ll go back to the <code>run</code> function and add a pass to unblock threads if the current one is interruptible after it processes its action:</p>
<pre class="haskell"><code>run :: C.MonadConc m =&gt; Scheduler s -&gt; s -&gt; PrimOp m -&gt; m s
run sched s0 = go s0 . initialise where
  go s (threads, idsrc)
    | initialThreadId `M.member` threads = case runnable threads of
      Just tids -&gt; do
        let (chosen, s&#39;) = sched tids s
        (threads&#39;, idsrc&#39;) &lt;- stepThread chosen (threads, idsrc)
        let threads&#39;&#39; = if (isInterruptible &lt;$&gt; M.lookup chosen threads&#39;) /= Just False
                        then unblock (Left chosen) threads&#39;
                        else threads&#39;
            -- ^- new
        go s&#39; (threads&#39;&#39;, idsrc&#39;)
      Nothing -&gt; pure s
    | otherwise = pure s

  runnable = nonEmpty . M.keys . M.filter (isNothing . threadBlock)

  initialThreadId = fst (nextThreadId initialIdSource)</code></pre>
<p>So after stepping a thread, we unblock every thread blocked on it if it either doesn’t exist, or if it does exist and is interruptible. It’s much more robust to do this once here than everywhere in <code>stepThread</code> which might cause the thread to become interruptible.</p>
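<p>The comparison against <code>Just False</code> is doing three-way work, which is easy to miss. Here is a minimal sketch of just that check, with the thread table simplified to a map from a thread number to whether that thread is interruptible; <code>shouldUnblock</code> is a hypothetical name.</p>
<pre class="haskell"><code>import qualified Data.Map as M

-- | Decide whether threads blocked on @tid@ should be woken.
--
--   * @Nothing@    (thread is gone)        -&gt; wake them
--   * @Just True@  (interruptible)         -&gt; wake them
--   * @Just False@ (still uninterruptible) -&gt; leave them blocked
shouldUnblock :: Int -&gt; M.Map Int Bool -&gt; Bool
shouldUnblock tid interruptible = M.lookup tid interruptible /= Just False</code></pre>
<p>So <code>shouldUnblock 1 M.empty</code> and <code>shouldUnblock 1 (M.fromList [(1, True)])</code> are both <code>True</code>, and only <code>shouldUnblock 1 (M.fromList [(1, False)])</code> is <code>False</code>.</p>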
<h3 id="masking-and-monadmask">Masking and MonadMask</h3>
<p>There are two operations at the programmer’s disposal to change the masking state of a thread, <code>mask</code> and <code>uninterruptibleMask</code>. Here’s what the <code>MiniFu</code> types will look like:</p>
<pre class="haskell"><code>{-# LANGUAGE RankNTypes #-}

mask                :: ((forall x. MiniFu m x -&gt; MiniFu m x) -&gt; MiniFu m a) -&gt; MiniFu m a
uninterruptibleMask :: ((forall x. MiniFu m x -&gt; MiniFu m x) -&gt; MiniFu m a) -&gt; MiniFu m a</code></pre>
<p>Each takes an action to run, and runs it as either <code>MaskedInterruptible</code> or <code>MaskedUninterruptible</code>. The action is provided with a polymorphic callback to run a subcomputation with the original masking state.</p>
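<p>These share their names and shape with the functions in <code>Control.Exception</code>. As a rough sketch of what the callback does there (this is real <code>IO</code>, not MiniFu, and <code>maskStates</code> is a name I’ve made up), we can record the masking state at three points:</p>
<pre class="haskell"><code>import Control.Exception (MaskingState (..), getMaskingState, mask)

-- Record the masking state outside the mask, inside it, and inside the
-- callback which restores the original state.
maskStates :: IO (MaskingState, MaskingState, MaskingState)
maskStates = do
  outer &lt;- getMaskingState
  mask $ \restore -&gt; do
    inside   &lt;- getMaskingState
    restored &lt;- restore getMaskingState
    pure (outer, inside, restored)</code></pre>
<p>Run from an unmasked thread, this gives <code>(Unmasked,MaskedInterruptible,Unmasked)</code>: the callback puts back whatever state was in effect when <code>mask</code> was called, which is not necessarily <code>Unmasked</code>.</p>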
<p>This is going to need, you guessed it, a new primop! We <em>could</em> modify the <code>Mask</code> primop to do this job as well, but I think it’s a little clearer to have two separate ones:</p>
<pre class="haskell"><code>data PrimOp m where
  -- ...
  InMask :: E.MaskingState -&gt; ((forall x. MiniFu m x -&gt; MiniFu m x) -&gt; MiniFu m a)
         -&gt; (a -&gt; PrimOp m) -&gt; PrimOp m</code></pre>
<p>And here are the implementations of our masking functions:</p>
<pre class="haskell"><code>mask ma = MiniFu (K.cont (InMask E.MaskedInterruptible ma))
uninterruptibleMask ma = MiniFu (K.cont (InMask E.MaskedUninterruptible ma))</code></pre>
<p>We can now fulfil another requirement of <code>MonadConc</code>: a <code>MonadMask</code> instance!</p>
<pre class="haskell"><code>instance MonadMask (MiniFu m) where
  mask = -- &#39;mask&#39; from above
  uninterruptibleMask = -- &#39;uninterruptibleMask&#39; from above</code></pre>
<p>The very last piece of the puzzle for exception handling in MiniFu is to implement this <code>InMask</code> primop. Its type looks quite intense, but the implementation is really not that bad. There are three parts:</p>
<pre class="haskell"><code>stepThread {- ... -}
  where
    -- ...
    go (InMask ms ma k) = simple . adjust $ \thrd -&gt; thrd
      { threadK =
        let ms0 = threadMask thrd

            -- (1) we need to construct the polymorphic argument function
            umask :: MiniFu m x -&gt; MiniFu m x
            umask (MiniFu mx) = MiniFu $ do
              K.cont (\c -&gt; Mask ms0 (c ()))
              x &lt;- mx
              K.cont (\c -&gt; Mask ms (c ()))
              pure x

        -- (2) we need to run the inner continuation, resetting the masking state
        -- when done
        in K.runCont (runMiniFu (ma umask)) (Mask ms0 . k)

      -- (3) we need to change the masking state
      , threadMask = ms
      }</code></pre>
<p>The explicit type signature on <code>umask</code> is needed because we’re using <code>GADTs</code>, which implies <code>MonoLocalBinds</code>, which prevents the polymorphic type from being inferred. We could achieve the same effect by turning on <code>NoMonoLocalBinds</code>.</p>
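<p>If you’ve not run into <code>MonoLocalBinds</code> before, here’s a small standalone sketch of the same effect, unrelated to MiniFu (the names <code>bothWays</code> and <code>pick</code> are made up). A local binding which mentions a variable from its enclosing function isn’t generalised automatically, so it needs a signature before it can be used at two different types:</p>
<pre class="haskell"><code>{-# LANGUAGE MonoLocalBinds #-}

bothWays :: Bool -&gt; (Int, Char)
bothWays flag = (pick 1 2, pick &#39;a&#39; &#39;b&#39;)
  where
    -- pick mentions flag, so with MonoLocalBinds on its type is not
    -- generalised for us; without this signature it could not be used at
    -- both Int and Char.
    pick :: a -&gt; a -&gt; a
    pick x y = if flag then x else y</code></pre>
<p>Delete the signature on <code>pick</code> and GHC should reject the module, just as it would reject <code>umask</code> without its signature.</p>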
<h3 id="demo">Demo</h3>
<p>Now we have asynchronous exceptions, check it out:</p>
<pre class="haskell"><code>example_async :: MiniFu m String
example_async = do
  a &lt;- newEmptyMVar
  tid &lt;- fork (putMVar a &quot;hello from the other thread&quot;)
  throwTo tid E.ThreadKilled
  readMVar a

demo_async :: IO ()
demo_async = do
  g &lt;- R.newStdGen
  print . fst =&lt;&lt; minifu randomSched g example_async</code></pre>
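<p>The result depends on the schedule. If the <code>throwTo</code> kills the forked thread before its <code>putMVar</code> happens, the main thread is stuck in <code>readMVar</code> and we get <code>Nothing</code>:</p>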
<pre><code>λ&gt; demo_async
Just &quot;hello from the other thread&quot;
λ&gt; demo_async
Just &quot;hello from the other thread&quot;
λ&gt; demo_async
Nothing</code></pre>
<h2 id="next-time">Next time…</h2>
<p>We have come to the end of part 2! Again, I hope you enjoyed this post; any feedback is welcome. This is all on <a href="https://github.com/barrucadu/minifu">GitHub</a>, and you can see the code we ended up with at the “post-02” tag.</p>
<p>Once again, I have some homework for you. Your task, should you choose to accept it, is to implement:</p>
<pre class="haskell"><code>tryPutMVar :: MVar m a -&gt; a -&gt; MiniFu m Bool

tryTakeMVar :: MVar m a -&gt; MiniFu m (Maybe a)

tryReadMVar :: MVar m a -&gt; MiniFu m (Maybe a)</code></pre>
<p>Solutions will be up in a few days, as before, at the “homework-02” tag.</p>
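<p>If you want something to compare against, the functions of the same names in <code>Control.Concurrent.MVar</code> are a reasonable reference. I’m assuming here that the MiniFu versions should behave the same way; this sketch uses the real <code>IO</code> ones and isn’t a solution (<code>tryOpsDemo</code> is a made-up name):</p>
<pre class="haskell"><code>import Control.Concurrent.MVar
  (newEmptyMVar, tryPutMVar, tryReadMVar, tryTakeMVar)

-- None of these block: each reports whether it could do its job.
tryOpsDemo :: IO (Bool, Bool, Maybe Int, Maybe Int, Maybe Int)
tryOpsDemo = do
  var   &lt;- newEmptyMVar
  put1  &lt;- tryPutMVar var 1   -- the MVar was empty, so this succeeds
  put2  &lt;- tryPutMVar var 2   -- the MVar is now full, so this fails
  read1 &lt;- tryReadMVar var    -- sees the value and leaves it in place
  take1 &lt;- tryTakeMVar var    -- takes the value, emptying the MVar
  take2 &lt;- tryTakeMVar var    -- nothing left to take
  pure (put1, put2, read1, take1, take2)</code></pre>
<p>That returns <code>(True,False,Just 1,Just 1,Nothing)</code>.</p>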
<p>Stay tuned because next time we’re going to implement STM: all of it in one go. Then we can finally get on to the testing.</p>
<hr />
<p>Thanks to <a href="https://twitter.com/willsewell_">Will Sewell</a> for reading an earlier draft of this post.</p>

      ]]>
    </summary>
  </entry>
  
</feed>