Systematic Concurrency Testing and Daemon Threads

Systematically testing concurrent programs is hard because, for anything nontrivial, there are a lot of potential executions. There are a few approaches to solving this problem, one of which is partial-order reduction, which forms the basis of dejafu's testing today (and is available by itself in the dpor library). This is the approach, in outline:

  1. Run your concurrent program, recording an execution trace.
  2. Look back through the trace for pairs of actions which might give a different result if done in the other order. These are called dependent actions.
  3. For each pair of dependent actions, attempt to schedule the latter before the former by inserting a context switch.
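
The outline above can be sketched in a few lines of Haskell. This is my own throwaway model, not dejafu's actual API: `Action`, `dependent`, and `backtrackPoints` are names I've invented for illustration, and the "program" is just a recorded trace of reads and writes.

```haskell
import Data.List (nub)

type Var    = String
type Thread = Int

-- The visible actions in the papers' core language.
data Action = Read Var | Write Var Int
  deriving (Eq, Show)

-- Two actions are dependent if they touch the same variable
-- and at least one of them is a write.
dependent :: Action -> Action -> Bool
dependent a b = var a == var b && (isWrite a || isWrite b)
  where
    var (Read v)    = v
    var (Write v _) = v
    isWrite (Write _ _) = True
    isWrite _           = False

-- Steps 2 and 3 of the outline: scan a recorded trace for pairs of
-- dependent actions from different threads, and for each pair produce
-- a new schedule to try: the prefix before the earlier action, followed
-- by a context switch to the later action's thread.
backtrackPoints :: [(Thread, Action)] -> [([(Thread, Action)], Thread)]
backtrackPoints trace = nub
  [ (take i trace, tj)
  | (i, (ti, ai)) <- zip [0 ..] trace
  , (tj, aj)      <- drop (i + 1) trace
  , ti /= tj
  , dependent ai aj
  ]
```

For the two-thread example later in this post, `backtrackPoints [(0, Read "x"), (1, Read "x"), (1, Write "x" 1)]` finds exactly one dependent cross-thread pair (T0's read against T1's write) and so suggests one new schedule: start from the empty prefix and run T1 first.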

It works pretty well. The space of potential schedules is cut down by a huge proportion. In papers presenting specific partial-order algorithms, the authors will typically give a little example like so:

We assume a core concurrent language consisting of reads and writes to shared variables. In this context, two reads are independent, whereas two writes, or a read and a write, are dependent if they are to the same variable.
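
That dependency relation is small enough to write out directly. Again, a sketch with invented names rather than anything from a real library:

```haskell
type Var = String

-- Visible actions: reads and writes to named shared variables.
data Action = Read Var | Write Var Int
  deriving (Eq, Show)

-- The variable an action touches.
var :: Action -> Var
var (Read v)    = v
var (Write v _) = v

-- Two reads are independent; two writes, or a read and a write,
-- are dependent if they are to the same variable.
dependent :: Action -> Action -> Bool
dependent a b = var a == var b && (isWrite a || isWrite b)
  where
    isWrite (Write _ _) = True
    isWrite _           = False
```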

So here’s a little example: we have two threads, T0 and T1, which are executing concurrently:

T0:  read x        T1:  read x
                        write x 1

Initially x = 0. If we take the result of this program to be the value of x that T0 reads, then there are clearly two possible results, and here are possible traces which could lead to them:

T0:  read x           T1: read x
T1:  read x           T1: write x 1
T1:  write x 1        T0: read x
result = 0            result = 1

If we run this program under a non-preemptive scheduler starting with T0, it’ll run T0 until it terminates, then run T1, giving the left trace above. The algorithm will then note the dependency between T1: write x 1 and T0: read x, and so schedule T1 before T0, giving the right trace above. This will then give rise to a third execution:

T1:  read x
T0:  read x
T1:  write x 1
result = 0

But it doesn’t change the result. You might think an obvious optimisation is to apply the context switch before as many independent actions as possible, which would then not give a new unique schedule to try. Unfortunately, this isn’t sound in general: what if the write is conditional on the read? The execution trace only contains visible actions; if-statements and the like are not shown.
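
To check the three traces above concretely, here is a tiny interpreter for them. It is my own throwaway model: every variable starts at 0, and the "result" of the program is the value that T0 reads.

```haskell
type Var    = String
type Thread = Int

data Action = Read Var | Write Var Int

-- Run a trace over a store (an association list of variables, all
-- initially 0), returning the value that T0 reads.
result :: [(Thread, Action)] -> Int
result = go []
  where
    go store ((0, Read v)     : _)    = maybe 0 id (lookup v store)
    go store ((_, Read _)     : rest) = go store rest
    go store ((_, Write v n)  : rest) = go ((v, n) : store) rest
    go _     []                       = error "T0 never performed a read"
```

Applying `result` to the three traces gives 0, 1, and 0 respectively: the third schedule is a genuinely new interleaving, but not a new result.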

I made an assumption when I ran that program; can you see what it is?

[pause for dramatic effect]

I am assuming that the program only terminates after every thread terminates! If the program instead terminates when T0 terminates, we would get this trace:

T0: read x
result = 0

There is no dependency between reads, so T1 would never be scheduled. In this context, T1 is a daemon thread. Daemon threads do not block program termination: they are killed by the runtime after all non-daemon threads terminate. In Haskell, all threads other than the main thread are daemon threads. Variants of this problem have cropped up in a few places in dejafu and bitten me. I couldn’t find anything written about this online, so I decided to document it for posterity.
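
The failure is easy to demonstrate in code. Using the same invented read/write model (this is a self-contained sketch, not dejafu), the trace recorded under terminate-with-main semantics contains a single read, so there is no pair of dependent actions at all and the explorer has no context switch to insert:

```haskell
type Var    = String
type Thread = Int

data Action = Read Var | Write Var Int
  deriving (Eq, Show)

dependent :: Action -> Action -> Bool
dependent a b = var a == var b && (isWrite a || isWrite b)
  where
    var (Read v)    = v
    var (Write v _) = v
    isWrite (Write _ _) = True
    isWrite _           = False

-- Positions in the trace where a dependency with a later action from
-- another thread calls for a context switch. An empty result means
-- exploration stops: no other schedule is ever tried.
switchPoints :: [(Thread, Action)] -> [Int]
switchPoints trace =
  [ i
  | (i, (ti, ai)) <- zip [0 ..] trace
  , (tj, aj)      <- drop (i + 1) trace
  , ti /= tj
  , dependent ai aj
  ]
```

For the daemon-thread trace `[(0, Read "x")]`, `switchPoints` is empty, so T1's write is never explored; had T1's write made it into the trace, the dependency would have been found.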

There are two solutions:

  1. Make your dependency function smarter, by having a special case: two actions are dependent if one of them is the last action in the trace. This handles daemon threads, and also cases where the program did not terminate gracefully but was aborted by the testing framework: for instance, if you bound the length of an execution, you still want to explore different schedules within that bound.

  2. Add a special “stop” action to the end of the main thread, which is dependent with everything. This handles graceful termination, but not the case where execution is aborted by the test framework.
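
Solution 2 amounts to two extra clauses in the dependency function. Once more, this is a sketch with made-up names, not dejafu's actual implementation:

```haskell
type Var = String

-- As before, plus a "stop" action placed at the end of the main thread.
data Action = Read Var | Write Var Int | Stop
  deriving (Eq, Show)

-- "Stop" is dependent with everything, so the explorer always tries
-- scheduling other threads' actions before the program terminates.
dependent :: Action -> Action -> Bool
dependent Stop _ = True
dependent _ Stop = True
dependent a b    = var a == var b && (isWrite a || isWrite b)
  where
    var (Read v)    = v
    var (Write v _) = v
    var Stop        = ""   -- unreachable: Stop is handled above
    isWrite (Write _ _) = True
    isWrite _           = False
```

With this relation, the daemon-thread trace ending in T0's stop action does have a dependent pair (the stop against T1's write), so T1 gets scheduled before termination after all.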

The dpor library implements a variant of case 1, to accommodate length-bounding. I’m not convinced it’s a great solution, as it leads to a lot of additional schedules being explored which are then quickly aborted, but I’m not sure of a better approach yet. The dejafu library implements case 2, because having a special “stop” action also gives you a convenient place to clean up everything associated with a thread.

So that’s it: daemon threads do work within the context of partial-order reduction, but you need to make sure you’re explicitly handling the termination of the main thread.

concurrency, programming, research
Target Audience
Mostly me, & possibly concurrency testing nerds.
Epistemic Status
I guess it's possible I've misunderstood every paper on concurrency testing I've read and they actually do address this topic.
Attention Conservation Notice
There's a subtlety in partial-order techniques when daemon threads are involved which has caught me out a couple of times.