Reducing Combinatorial Explosion

Testing concurrent programs is hard, because there are many valid schedules, but only one will be used by a given execution. This means we can't just run something once and be confident it's right.

  • How many times do we need to run it?
  • Does the number of runs we need grow with the size of the program?
  • Is this even a good approach?

Running the program lots of times might be OK if every run uses a unique schedule; otherwise you'll explore less of the possibility space than you might think. Even if you can guarantee uniqueness of schedules, larger programs tend to have more possible schedules, so you need more runs to get the same coverage guarantee, and it's difficult to know exactly how many.
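To get a feel for the growth, consider just two threads performing m and n atomic steps each: the number of interleavings is the binomial coefficient choose (m+n) m. A quick sketch:

```haskell
-- Number of ways to interleave two threads performing m and n
-- atomic steps respectively: (m + n)! / (m! * n!).  This grows
-- exponentially, which is why naive repeated runs scale badly.
interleavings :: Integer -> Integer -> Integer
interleavings m n = factorial (m + n) `div` (factorial m * factorial n)
  where
    factorial k = product [1 .. k]
```

Two threads of ten steps each already admit `interleavings 10 10` = 184756 schedules, and this is before considering more threads or longer programs.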

So random testing is out. Let's see what the alternatives are.

Schedule Bounding

Completeness in concurrency testing is hard. We had already given up on it when we considered random testing, so why not throw it out in a much more principled fashion?

Enter schedule bounding. Here, we define some bound function which, given a list of scheduling decisions, determines whether that schedule is within the bound. We then test all schedules within the bound.

There are a few bound functions in common use:

  • Pre-emption bounding: restricting the number of pre-emptive context switches.
  • Delay bounding: restricting the number of deviations from an otherwise deterministic scheduler.
  • Fair bounding: restricting the number of consecutive times a non-blocking loop accessing shared state (like a spinlock) is executed.

Furthermore, these bound functions are often iterated: for example, trying all schedules with 0 pre-emptions, then all schedules with 1 pre-emption, and so on, up to some limit.
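As a concrete sketch, here is what a pre-emption bound function and its iterated form might look like. The `Decision` type is a simplified stand-in for illustration, not Déjà Fu's actual API:

```haskell
-- A simplified scheduling decision: continue with the current
-- thread, switch because the current thread blocked or finished,
-- or pre-empt a thread which could still run.
data Decision = Continue | SwitchTo Int | PreemptTo Int
  deriving (Eq, Show)

-- Count the pre-emptive context switches in a schedule.
preemptionCount :: [Decision] -> Int
preemptionCount = length . filter isPreemption
  where
    isPreemption (PreemptTo _) = True
    isPreemption _             = False

-- The bound function: is a schedule within the pre-emption bound?
preemptionBound :: Int -> [Decision] -> Bool
preemptionBound n sched = preemptionCount sched <= n

-- Iterated bounding: testing schedules with *exactly* n
-- pre-emptions for n = 0, 1, ... never retests a schedule
-- already covered at a smaller bound.
exactPreemptions :: Int -> [Decision] -> Bool
exactPreemptions n sched = preemptionCount sched == n
```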

Pre-emption bounding is a common choice, and empirical studies [1] have found that test cases with as few as two threads and two pre-emptions are enough to find many concurrency bugs.

Diversion: Generating Schedules

You may be wondering how schedule bounding can actually be implemented. Perhaps you could simply monitor the execution and abort it when it exceeds the bound?

Well, that would certainly enforce the bound, but it wouldn't give you many coverage guarantees in finite time.

By executing a program you can gather a lot of information: which threads are runnable at each step (and what they would do if you scheduled them), what the thread that was scheduled did, and the state of all shared variables (for CVars/MVars, this is whether they're full or empty). You can use this to inform your initial set of scheduling decisions when starting a new execution, systematically exploring the possibility space.

You can store this information in a tree structure, modified between executions: each edge represents a scheduling decision, and each node contains information on the threads runnable at that point and the alternative decisions still to make. This turns out to be quite a space-efficient representation, as schedule prefixes are explicitly shared as much as possible.
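One way to sketch such a tree in Haskell (simplified, with invented field names; Déjà Fu's internal representation will differ):

```haskell
import qualified Data.Map as Map

type ThreadId = Int

-- Each node records which threads were runnable at that point and
-- which alternative decisions remain unexplored.  Each map key is
-- a decision already explored: an edge to a subtree.  Schedules
-- sharing a prefix share the path from the root to that point.
data ScheduleTree = ScheduleTree
  { runnable :: [ThreadId]  -- threads runnable at this step
  , todo     :: [ThreadId]  -- alternative decisions still to try
  , children :: Map.Map ThreadId ScheduleTree  -- explored decisions
  }
```

A new execution walks down the tree replaying a stored prefix, then picks a `todo` entry at the frontier and records the fresh subtree it observes.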

Schedule tree

Blocking Avoidance

If we have a bunch of runnable threads, but some of them will block immediately without modifying any shared state, then we can restrict the set of runnable threads to those which won't block.

Scheduling choice leading to immediate block

This is safe when no schedule bounding is in effect, or when it cannot cause otherwise-legal schedules to fall outside the bound.

In the case of pre-emption bounding it is safe, because it doesn't alter the pre-emption count of any schedules reachable from this state: if a thread blocks, any other thread can be executed without incurring an extra pre-emption.
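The restriction itself is a simple filter. A minimal sketch, assuming a hypothetical `wouldBlock` predicate derived from the information gathered about each runnable thread:

```haskell
type ThreadId = Int

-- Restrict the runnable set to threads which won't block
-- immediately.  If *every* thread would block, keep the original
-- set: that execution is deadlocked, and the tester needs to run
-- it to observe and report the deadlock.
avoidBlocking :: (ThreadId -> Bool) -> [ThreadId] -> [ThreadId]
avoidBlocking wouldBlock runnableThreads
  | null nonBlocking = runnableThreads
  | otherwise        = nonBlocking
  where
    nonBlocking = filter (not . wouldBlock) runnableThreads
```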

Partial-order Reduction

Eliminating schedules which obviously don't change the state is a nice step, but it's only a first step.

We can characterise the execution of a concurrent program by the ordering of dependent actions, such as reads and writes to the same variable. This is a partial order on the program actions, for which there may be many total orders. Ideally, we would check only one total order for each partial order, as different total orders will have the same result.

Two executions with the same result, amenable to POR

Partial-order reduction (POR) can be implemented by exploring multiple scheduling decisions (when there is a choice) only if they can interfere with each other.
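The heart of POR is that interference (dependency) check between actions. A minimal sketch over a toy action type, assuming plain reads and writes to named shared variables:

```haskell
-- Toy shared-memory actions, for illustration only.
data Action = Read String | Write String
  deriving (Eq, Show)

-- Two actions are dependent (can interfere) if they touch the
-- same variable and at least one of them is a write.  Only
-- dependent actions need their different orderings explored;
-- independent ones commute, giving the same program state.
dependent :: Action -> Action -> Bool
dependent a b = var a == var b && (isWrite a || isWrite b)
  where
    var (Read v)  = v
    var (Write v) = v
    isWrite (Write _) = True
    isWrite _         = False
```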

Unfortunately, POR isn't quite that simple when using schedule bounding, as the bound can impose dependencies between previously-unrelated actions: they can affect whether a state is reachable within the bound or not.

How this is solved depends on the specific bound function used. For pre-emption bounding, it suffices to try different possibilities when a context switch happens. Furthermore, when implementing blocking avoidance, don't remove the context switch entirely; instead, perform it earlier in the execution, where it won't block.

Sleep Sets

Unfortunately, the context switches introduced by POR can still result in the same program state being reached by multiple different schedules.

A state-space amenable to sleep sets, each node represents a state

Sleep sets are a complementary improvement, which do not require POR. The idea is that if you have two or more decisions to make and have explored one possibility, there's no point in making that same decision again when trying the other possibility, unless something happens which can alter the result. If nothing has changed, you'll get the same result.

The same state space pruning transitions with sleep sets
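In outline: after exploring a decision from a state, put it to sleep for that state's remaining alternatives, and wake a sleeping decision only when an executed action is dependent with its next action. A loose sketch of those two operations, using an abstract dependency relation between threads' next actions (this is a simplification of the classical algorithm, not Déjà Fu's implementation):

```haskell
type ThreadId = Int

-- After executing a step, wake (remove from the sleep set) any
-- thread whose next action is dependent with the action just
-- performed; for everything still asleep, nothing has changed,
-- so re-exploring it would reach an already-seen result.
wake :: (ThreadId -> ThreadId -> Bool)  -- dependency on next actions
     -> ThreadId                        -- thread just executed
     -> [ThreadId]                      -- current sleep set
     -> [ThreadId]
wake dep executed = filter (not . dep executed)

-- Candidate scheduling choices at a state: runnable threads not
-- currently asleep.
candidates :: [ThreadId] -> [ThreadId] -> [ThreadId]
candidates runnableThreads sleepSet =
  filter (`notElem` sleepSet) runnableThreads
```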

The Déjà Fu Approach

Déjà Fu uses a combination of all of these approaches, including some tweaks to the order of schedule exploration to try to find bugs sooner rather than later, when there are any.

The algorithm used by the standard testing functions is pre-emption-bounded partial-order reduction with a bound of 2, using blocking avoidance (note that blocking avoidance doesn't reduce the number of schedules when used in conjunction with bounded partial-order reduction!) and sleep sets. Enough of the internals are exposed to allow implementing your own bound function, such as fairness bounding or delay bounding.

To give some figures, here's the effect of each improvement to the algorithm for runPar $ parMap id [1..5] with the Par monad's "direct" scheduler:

  • Pre-emption bounding: 12539 unique schedules,
  • + blocking avoidance: 11400,
  • + partial-order reduction: 8181,
  • + sleep sets: 2237.

As can be seen, sleep sets are a massive improvement in this case, and I would wager that just pre-emption bounding with sleep sets would see a similar improvement. Obviously this is a very simple example, with little communication between threads, so it can't really be generalised to other cases, but it's a nice result.

  1. P. Thomson, A. F. Donaldson, and A. Betts. Concurrency Testing Using Schedule Bounding: an Empirical Study. In Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 15–28. ACM, 2014.

concurrency, programming, research
Target Audience
Mostly me, & possibly concurrency testing nerds.
Epistemic Status
Summarises existing research.