`Eq` instance, but which enables faster implementations of methods if you have an `Ord` instance around. It’s not completely transparent—functions which construct an entirely new set will have `Eq` and `Ord` variants—but it’s better than duplicating every single function.
The `FlexiSet` type

I’m going to call my set type `FlexiSet`, because it’s a flexible set (I am good at naming things).

A `FlexiSet` is either a list of values of a type which has an `Eq` instance, or a set (from our old friend `Data.Set`) of values of a type which has an `Ord` instance:

```haskell
{-# LANGUAGE GADTs #-}

import Prelude hiding (filter, null)

import qualified Data.List as L
import qualified Data.Set as S

-- | A flexible set: elements will have at least an 'Eq' instance,
-- maybe also an 'Ord' instance.
data FlexiSet a where
  EqSet :: Eq a => [a] -> FlexiSet a
  OrdSet :: Ord a => S.Set a -> FlexiSet a

-- | Get the values from a 'FlexiSet'.  The order of the resultant
-- list is arbitrary.
toList :: FlexiSet a -> [a]
toList (EqSet as) = as
toList (OrdSet as) = S.toList as
```

The `Eq` and `Ord` constraints don’t leak outside the type; they’re entirely contained. When an `EqSet` or `OrdSet` value is constructed, the constraint dictionary is captured as well. So pattern matching on a `FlexiSet` will bring the instance into scope, without needing to include the constraint in the function signature. Great! Leaky constraints are the main reason why people don’t like type-constrained `data` declarations.
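The dictionary-capture trick stands alone, outside of sets. Here is a minimal, self-contained sketch (my own example, not from the post): the `Show` constraint is demanded at construction time and recovered by pattern matching, so `describe` needs no constraint of its own.

```haskell
{-# LANGUAGE GADTs #-}

-- | A value packaged together with its 'Show' dictionary.  The
-- constraint is captured when 'MkShowable' is applied.
data Showable where
  MkShowable :: Show a => a -> Showable

-- | No 'Show' constraint in this signature: matching on 'MkShowable'
-- brings the captured instance back into scope.
describe :: Showable -> String
describe (MkShowable a) = show a
```

This is exactly what the `EqSet` and `OrdSet` constructors do with `Eq` and `Ord`.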

We do need `Eq`-aware and `Ord`-aware functions to construct a `FlexiSet`:

```haskell
-- | Construct a 'FlexiSet' for a type @a@ which has an 'Eq'
-- instance.
--
-- If @a@ has an 'Ord' instance, use 'makeOrdSet' instead.
makeEqSet :: Eq a => [a] -> FlexiSet a
makeEqSet = EqSet . L.nub

-- | Construct a 'FlexiSet' for a type @a@ which has an 'Ord'
-- instance.
makeOrdSet :: Ord a => [a] -> FlexiSet a
makeOrdSet = OrdSet . S.fromList
```

Sadly we also need specific functions for mapping, as we need to constrain the result type of the map function. This also means we can’t give `FlexiSet` a `Functor` instance (just like `Set` can’t have one):

```haskell
-- | Map a function over a 'FlexiSet'.
--
-- If the result type has an 'Ord' instance, use 'mapOrd' instead.
--
-- This is @O(n)@.
mapEq :: Eq b => (a -> b) -> FlexiSet a -> FlexiSet b
mapEq f = EqSet . L.nub . map f . toList

-- | Map a function over a 'FlexiSet'.
--
-- This is @O(n)@.
mapOrd :: Ord b => (a -> b) -> FlexiSet a -> FlexiSet b
mapOrd f = OrdSet . S.fromList . map f . toList
```

But we *don’t* need to know anything about constraints for filtering! (As an aside, I really like Ruby’s `select` / `reject` names for `filter` and `\f -> filter (not . f)`; I often misremember whether `filter` keeps things satisfying the predicate, or rejects things satisfying the predicate.)

This is because we’re not changing the type of value in the `FlexiSet`, and by pattern matching on it we bring the instance into scope:

```haskell
-- | Remove values from a 'FlexiSet' which fail to satisfy the given
-- predicate.
--
-- This is @O(n)@.
filter :: (a -> Bool) -> FlexiSet a -> FlexiSet a
filter f (EqSet as) = EqSet (L.filter f as)
filter f (OrdSet as) = OrdSet (S.filter f as)
```

Here are a few more functions which do something with a single `FlexiSet`. Note how the `OrdSet` ones can use the more efficient `Data.Set` operations, but the `EqSet` ones are stuck with slow linear-time list functions:

```haskell
-- | Insert a value into a 'FlexiSet' if it's not already present.
--
-- This is @O(n)@ for 'Eq'-based sets and @O(log n)@ for 'Ord'-based
-- sets.
insert :: a -> FlexiSet a -> FlexiSet a
insert a s@(EqSet as)
  | a `elem` as = s
  | otherwise = EqSet (a:as)
insert a (OrdSet as) = OrdSet (S.insert a as)

-- | Remove a value from a 'FlexiSet' if it's present.
--
-- This is @O(n)@ for 'Eq'-based sets and @O(log n)@ for 'Ord'-based
-- sets.
delete :: a -> FlexiSet a -> FlexiSet a
delete a (EqSet as) = EqSet (L.filter (/=a) as)
delete a (OrdSet as) = OrdSet (S.delete a as)

-- | Check if a value is present in a 'FlexiSet'.
--
-- This is @O(n)@ for 'Eq'-based sets and @O(log n)@ for 'Ord'-based
-- sets.
member :: a -> FlexiSet a -> Bool
member a (EqSet as) = any (==a) as
member a (OrdSet as) = S.member a as
```
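To see the two representations in action, here is a condensed, self-contained copy of the pieces so far (definitions repeated from above so the block compiles on its own), suitable for poking at in GHCi:

```haskell
{-# LANGUAGE GADTs #-}

import qualified Data.List as L
import qualified Data.Set as S

data FlexiSet a where
  EqSet :: Eq a => [a] -> FlexiSet a
  OrdSet :: Ord a => S.Set a -> FlexiSet a

makeEqSet :: Eq a => [a] -> FlexiSet a
makeEqSet = EqSet . L.nub

makeOrdSet :: Ord a => [a] -> FlexiSet a
makeOrdSet = OrdSet . S.fromList

-- Note the lack of constraints: the pattern match recovers them.
member :: a -> FlexiSet a -> Bool
member a (EqSet as) = any (== a) as
member a (OrdSet as) = S.member a as
```

The interesting part is that `member` works over both representations with no constraint in its signature at all.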

Sometimes it doesn’t matter whether we have an `EqSet` or an `OrdSet`:

```haskell
-- | Check if a 'FlexiSet' is empty.
--
-- This is @O(1)@.
null :: FlexiSet a -> Bool
null (EqSet as) = L.null as
null (OrdSet as) = S.null as
```

Sometimes it matters a lot:

```haskell
-- | Get the number of elements in a 'FlexiSet'.
--
-- This is @O(n)@ for 'Eq'-based sets and @O(1)@ for 'Ord'-based
-- sets.
size :: FlexiSet a -> Int
size (EqSet as) = length as
size (OrdSet as) = S.size as
```

We could improve this case by changing our `EqSet` representation to also track the length.
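One way that change could look — a hypothetical variant of the `EqSet` constructor (my sketch, not from the post) which caches the element count and keeps it in sync on insertion, making `size` O(1) for both representations:

```haskell
{-# LANGUAGE GADTs #-}

import qualified Data.Set as S

-- Hypothetical variant: 'EqSet' carries its element count.
data FlexiSet a where
  EqSet :: Eq a => Int -> [a] -> FlexiSet a
  OrdSet :: Ord a => S.Set a -> FlexiSet a

emptyEq :: Eq a => FlexiSet a
emptyEq = EqSet 0 []

-- Insertion keeps the cached count in sync.
insert :: a -> FlexiSet a -> FlexiSet a
insert a s@(EqSet n as)
  | a `elem` as = s
  | otherwise = EqSet (n + 1) (a : as)
insert a (OrdSet as) = OrdSet (S.insert a as)

-- Now @O(1)@ in both cases.
size :: FlexiSet a -> Int
size (EqSet n _) = n
size (OrdSet as) = S.size as
```

The cost is that every function constructing an `EqSet` has to maintain the count correctly.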

Functions which combine two `FlexiSet` values of the same type are interesting, as we get to “upgrade” from an `EqSet` to an `OrdSet` in some cases:

```haskell
-- | Take the union of two 'FlexiSet' values.
--
-- This is @O(n)@ if both or either of the sets are 'Eq'-based and
-- @O(m*log(n/m + 1)), m <= n@ if both are 'Ord'-based.
--
-- If one set is 'Eq'-based and one is 'Ord'-based, the result will
-- be 'Ord'-based.
union :: FlexiSet a -> FlexiSet a -> FlexiSet a
union (EqSet as) (EqSet bs) = EqSet (L.nub (as ++ bs))
union (EqSet as) (OrdSet bs) = OrdSet (S.union (S.fromList as) bs)
union (OrdSet as) (EqSet bs) = OrdSet (S.union as (S.fromList bs))
union (OrdSet as) (OrdSet bs) = OrdSet (S.union as bs)

-- | Take the intersection of two 'FlexiSet' values.
--
-- This is @O(n)@ if both or either of the sets are 'Eq'-based and
-- @O(m*log(n/m + 1)), m <= n@ if both are 'Ord'-based.
--
-- If one set is 'Eq'-based and one is 'Ord'-based, the result will
-- be 'Ord'-based.
intersection :: FlexiSet a -> FlexiSet a -> FlexiSet a
intersection (EqSet as) (EqSet bs) = EqSet (L.filter (`elem` bs) as)
intersection (EqSet as) (OrdSet bs) = OrdSet (S.intersection (S.fromList as) bs)
intersection (OrdSet as) (EqSet bs) = OrdSet (S.intersection as (S.fromList bs))
intersection (OrdSet as) (OrdSet bs) = OrdSet (S.intersection as bs)

-- | Take the difference of two 'FlexiSet' values.
--
-- This is @O(n)@ if both or either of the sets are 'Eq'-based and
-- @O(m*log(n/m + 1)), m <= n@ if both are 'Ord'-based.
--
-- If one set is 'Eq'-based and one is 'Ord'-based, the result will
-- be 'Ord'-based.
difference :: FlexiSet a -> FlexiSet a -> FlexiSet a
difference (EqSet as) (EqSet bs) = EqSet (L.filter (`notElem` bs) as)
difference (EqSet as) (OrdSet bs) = OrdSet (S.difference (S.fromList as) bs)
difference (OrdSet as) (EqSet bs) = OrdSet (S.difference as (S.fromList bs))
difference (OrdSet as) (OrdSet bs) = OrdSet (S.difference as bs)
```

But when we combine sets of different types, we have to “downgrade” from an `OrdSet` to an `EqSet`:

```haskell
-- | Take the disjoint union of two 'FlexiSet' values.
--
-- This is @O(n)@ if both or either of the sets are 'Eq'-based and
-- @O(m*log(n/m + 1)), m <= n@ if both are 'Ord'-based.
--
-- If one set is 'Eq'-based and one is 'Ord'-based, the result will
-- be 'Eq'-based.
disjointUnion :: FlexiSet a -> FlexiSet b -> FlexiSet (Either a b)
disjointUnion (EqSet as) (EqSet bs) = EqSet (map Left as ++ map Right bs)
disjointUnion (EqSet as) (OrdSet bs) = EqSet (map Left as ++ map Right (S.toList bs))
disjointUnion (OrdSet as) (EqSet bs) = EqSet (map Left (S.toList as) ++ map Right bs)
disjointUnion (OrdSet as) (OrdSet bs) = OrdSet (S.disjointUnion as bs)
```

GADTs are a neat generalisation of regular Haskell data types which allow you to do all sorts of cool things.

For example, in dejafu, my concurrency testing library, I’m using GADTs to:

…unify a few different variations on the same type behind a common interface (source):

```haskell
-- | A representation of a concurrent program for testing.
--
-- To construct these, use the 'C.MonadConc' instance, or see
-- 'Test.DejaFu.Conc.withSetup', 'Test.DejaFu.Conc.withTeardown', and
-- 'Test.DejaFu.Conc.withSetupAndTeardown'.
--
-- @since 2.0.0.0
data Program pty n a where
  ModelConc
    :: { runModelConc :: (a -> Action n) -> Action n }
    -> Program Basic n a
  WithSetup
    :: { wsSetup :: ModelConc n x
       , wsProgram :: x -> ModelConc n a
       }
    -> Program (WithSetup x) n a
  WithSetupAndTeardown
    :: { wstSetup :: ModelConc n x
       , wstProgram :: x -> ModelConc n y
       , wstTeardown :: x -> Either Condition y -> ModelConc n a
       }
    -> Program (WithSetupAndTeardown x y) n a
```

…hide a type variable which doesn’t need to be exposed (source):

```haskell
-- | A buffered write is a reference to the variable, and the value to
-- write.  Universally quantified over the value type so that the only
-- thing which can be done with it is to write it to the reference.
data BufferedWrite n where
  BufferedWrite :: ThreadId -> ModelIORef n a -> a -> BufferedWrite n
```

…in a few places (source):

```haskell
-- | How to explore the possible executions of a concurrent program.
--
-- @since 0.7.0.0
data Way where
  Systematic :: Bounds -> Way
  Randomly :: RandomGen g => (g -> (Int, g)) -> g -> Int -> Way
```

In all these cases GADTs let me be more specific about what type information leaks out of a constructor, meaning I can have types which more precisely convey my intent, and not just types which are full of implementation details.

`@media` queries) or inheritance, or even validate the properties and values.
```haskell
import Control.Arrow (second)
import Data.Char (isSpace)
import Data.List (dropWhile, dropWhileEnd, inits, last, partition)
import Numeric.Natural

import qualified Data.List.NonEmpty as NE
import qualified Data.Map as M
```

A miserable little pile of properties.

A CSS rule consists of:

- A *selector*, which determines what the rule applies to.
- A list of *property / value* pairs, which determine its effect.

Some attributes are special, like `background`, `border`, or `margin`, in that they’re a shorthand to specify multiple attributes in one go. For example, these rules are all equivalent:

```css
div { margin: 1em 2em; }
div { margin: 1em 2em 1em 2em; }
div { margin-top: 1em; margin-right: 2em; margin-bottom: 1em; margin-left: 2em; }
```

I don’t know what the proper terms for these different forms are, so I’ll call them *shorthand properties* and *canonical properties*. `margin` is a shorthand property. `margin-top` is a canonical property. Shorthand properties are a form of syntactic sugar.
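As an illustration (my own sketch, not part of the memo's code), desugaring the `margin` shorthand into its canonical properties follows the usual CSS 1/2/3/4-value rules — one value applies to all sides, two values are top/bottom and right/left, and so on:

```haskell
-- | Expand the CSS @margin@ shorthand into its four canonical
-- properties, following the standard value-count rules.  Returns []
-- for an invalid number of values.
expandMargin :: [String] -> [(String, String)]
expandMargin vs =
  zip ["margin-top", "margin-right", "margin-bottom", "margin-left"] $
    case vs of
      [a] -> [a, a, a, a]              -- all four sides
      [a, b] -> [a, b, a, b]           -- top/bottom, right/left
      [a, b, c] -> [a, b, c, b]        -- top, right/left, bottom
      [a, b, c, d] -> [a, b, c, d]     -- top, right, bottom, left
      _ -> []
```

So `expandMargin ["1em", "2em"]` produces the four `margin-*` pairs from the example above.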

Rules with multiple properties are also a form of syntactic sugar. This rule:

```css
div { margin-top: 1em; margin-right: 2em; margin-bottom: 1em; margin-left: 2em; }
```

Is equivalent to these four rules:

```css
div { margin-top: 1em; }
div { margin-right: 2em; }
div { margin-bottom: 1em; }
div { margin-left: 2em; }
```

Obviously you would never write CSS like that, *but* it is going to simplify our code. So the rest of the memo will assume all rules have exactly one property, and that property is canonical.

So, what is a rule? It’s a selector, a property name, a property value, and an importance flag:

```haskell
-- | A CSS rule.
data Rule = Rule
  { rSelector :: Selector
    -- ^ What the rule matches.
  , rBody :: (PropName, PropValue)
    -- ^ The effect of the rule.
  , rIsImportant :: Bool
    -- ^ Does the rule have an @!important@ annotation?
  } deriving Show
```

Selectors are a bit tricky, so to simplify the implementation I’m skipping some of the combinators:

```haskell
-- | CSS selectors, but to simplify the implementation I'm omitting
-- the list combinator (@A, B@), sibling combinators (@A + B@ and
-- @A ~ B@), and column combinator (@A || B@).
data Selector
  = SUniversal
    -- ^ @*@ - matches everything.
  | SElement ElName
    -- ^ @elementname@ - matches that element.
  | SClass AttrValue
    -- ^ @.classname@ - matches elements with that class.
  | SId AttrValue
    -- ^ @#idname@ - matches elements with that ID.
  | SAttribute AttrName AttrValue
    -- ^ @[name=value]@ - matches elements with that attribute.
  | SChild Selector Selector
    -- ^ @A > B@ - matches elements which match @B@, if their parent
    -- matches @A@.
  | SDescendent Selector Selector
    -- ^ @A B@ - matches elements which match @B@, if at least one of
    -- their ancestors matches @A@.
  deriving Show
```

An element has a name and a bunch of attributes:

```haskell
-- | An HTML element.
data Element = Element
  { eName :: ElName
    -- ^ The name of the element.
  , eAttributes :: M.Map AttrName AttrValue
    -- ^ Its attributes.
  } deriving Show
```

And finally we have some `String` newtypes to avoid confusion:

```haskell
newtype ElName = ElName String deriving (Eq, Ord, Show)
newtype AttrName = AttrName String deriving (Eq, Ord, Show)
newtype AttrValue = AttrValue String deriving (Eq, Ord, Show)
newtype PropName = PropName String deriving (Eq, Ord, Show)
newtype PropValue = PropValue String deriving (Eq, Ord, Show)
```

Given a path through the document tree—a nonempty sequence of elements—we can check if a selector matches:

```haskell
-- | Check if a selector matches an element, given as a path through
-- the HTML tree.  To simplify the implementation this doesn't match
-- pseudo-selectors.
--
-- The implementation of the descendent selector is not very
-- efficient.
match :: NE.NonEmpty Element -> Selector -> Bool
match path sel0 = match' (NE.init path) (NE.last path) sel0
  where
    match' _ _ SUniversal = True
    match' _ el (SElement n) = eName el == n
    match' _ el (SClass v) = checkAttr el (AttrName "class") v
    match' _ el (SId v) = checkAttr el (AttrName "id") v
    match' _ el (SAttribute k v) = checkAttr el k v
    match' [] _ (SChild _ _) = False
    match' [] _ (SDescendent _ _) = False
    match' prefix el (SChild selA selB) =
      let parentPrefix = init prefix
          parent = last prefix
      in match' prefix el selB && match' parentPrefix parent selA
    match' prefix el (SDescendent selA selB) =
      match' prefix el selB &&
      any (\(parentPrefix, parent) -> match' parentPrefix parent selA) (parents prefix)

    parents = map (\els -> (init els, last els)) . tail . inits

    checkAttr el k v = M.lookup k (eAttributes el) == Just v
```

“The cascade” is the algorithm which resolves rule conflicts. Each page is made up of a few different stylesheets: the user-agent stylesheet built into the browser, the user’s personal stylesheet, and the author’s stylesheet. There are also inline styles defined on HTML elements themselves. Rules are applied in this order:

1. Non-`!important` rules from the user-agent stylesheet
2. Non-`!important` rules from the user’s stylesheet
3. Non-`!important` rules from the author’s stylesheet
4. Animation rules
5. `!important` rules from the author’s stylesheet
6. `!important` rules from the user’s stylesheet
7. `!important` rules from the user-agent stylesheet
8. Transition rules

Inline styles are effectively `!important` rules applied between steps 4 and 5. If two rules conflict, the one with the highest *specificity* (see next section) wins.

```haskell
-- | The cascade algorithm.
--
-- The implementation of multiple stylesheets is not very efficient.
cascade
  :: [Rule] -- ^ User-agent stylesheet
  -> [Rule] -- ^ User's stylesheet
  -> [Rule] -- ^ Author's stylesheet
  -> NE.NonEmpty Element
  -> M.Map PropName PropValue
cascade userAgent user author el =
    fmap snd .
    overrideProperties important .
    overrideProperties normal $
    inlineStyle (NE.last el)
  where
    normal = prepare $
      filter (not . rIsImportant) userAgent ++
      filter (not . rIsImportant) user ++
      filter (not . rIsImportant) author

    important = prepare $
      filter rIsImportant author ++
      filter rIsImportant user ++
      filter rIsImportant userAgent

    prepare rules =
      [ (specificity sel important (Just position), body)
      | (position, Rule { rSelector = sel, rBody = body, rIsImportant = important }) <- zip [0..] rules
      , match el sel
      ]

    overrideProperties = flip (foldr override)

    override (spec, (pname, pvalue)) props = case M.lookup pname props of
      Just (spec', _) | spec <= spec' -> props
      _ -> M.insert pname (spec, pvalue) props
```

A bit less grand than the name implies, to me.

Specificity determines how well a rule matches an element. The MDN docs are… okay. This diagram is much better.

Specificity can be thought of as a total lexicographical ordering on 5-tuples:

```haskell
-- | The specificity of a rule.
data Specificity = Specificity
  { sIsImportant :: Bool
    -- ^ Whether the rule is important or not.  Strictly speaking this
    -- isn't really necessary, because the cascade does important
    -- rules after non-important rules.
  , sNumIdSelectors :: Natural
    -- ^ The number of ID selectors.
  , sNumAttrSelectors :: Natural
    -- ^ The number of class selectors, attribute selectors, and
    -- pseudo-classes.
  , sNumElSelectors :: Natural
    -- ^ The number of element selectors and pseudo-elements.
  , sSourcePosition :: Maybe Natural
    -- ^ The position in the stylesheet, with the first-defined rule
    -- having position zero.  This is a @Maybe@ because inline styles
    -- don't have a source position.
  } deriving (Eq, Ord, Show)
```

The default derived ordering compares the fields in order, which is what we want.
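A quick standalone check of that claim (my own example, using a cut-down record): a derived `Ord` compares fields lexicographically in declaration order, so an important rule beats any number of ID selectors, and ID selectors beat any number of class selectors.

```haskell
-- A cut-down analogue of the Specificity record, to demonstrate
-- that derived Ord is lexicographic in field-declaration order.
data Spec = Spec
  { isImportant :: Bool
  , numIds :: Int
  , numClasses :: Int
  } deriving (Eq, Ord, Show)
```

For instance, `Spec True 0 0 > Spec False 9 9`: the `Bool` field is compared first, and only on a tie do the later fields matter.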

An inline rule doesn’t really have a specificity according to the MDN docs (as I read it): inline rules trump everything non-`!important`, and are trumped by everything `!important`. But, rather than have them be a special case, they can be given a specificity:

```haskell
-- | The specificity of an inline rule.
inlineSpecificity :: Specificity
inlineSpecificity = Specificity
  { sIsImportant = True
  , sNumIdSelectors = 0
  , sNumAttrSelectors = 0
  , sNumElSelectors = 0
  , sSourcePosition = Nothing
  }
```

Getting the actual inline rules involves a bit of string-wrangling:

```haskell
inlineStyle :: Element -> M.Map PropName (Specificity, PropValue)
inlineStyle el = case M.lookup (AttrName "style") (eAttributes el) of
    Just (AttrValue style) -> M.fromList
      [ (PropName (trim k), (inlineSpecificity, PropValue (trim v)))
      | prop <- splitOn (==';') style
      , let (k, (':':v)) = break (==':') prop
      ]
    Nothing -> M.empty
  where
    splitOn p s = case dropWhile p s of
      [] -> []
      s' -> let (w, s'') = break p s' in w : splitOn p s''

    trim = dropWhile isSpace . dropWhileEnd isSpace
```
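The `splitOn` and `trim` helpers are worth seeing in isolation — `splitOn` behaves like `words` generalised to an arbitrary separator predicate, dropping empty fields. Here are standalone copies of the definitions used above:

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Split a string on characters matching a predicate, discarding
-- separators and empty fields (like 'words', but for any predicate).
splitOn :: (Char -> Bool) -> String -> [String]
splitOn p s = case dropWhile p s of
  [] -> []
  s' -> let (w, s'') = break p s' in w : splitOn p s''

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhile isSpace . dropWhileEnd isSpace
```

So `splitOn (==';') "color: red; margin: 1em"` gives `["color: red", " margin: 1em"]`, and `trim` then tidies each key and value.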

Using that helpful diagram, we can compute the specificity of a selector, if we know whether its rule is important and its source position:

```haskell
-- | Compute the specificity of a selector.
specificity :: Selector -> Bool -> Maybe Natural -> Specificity
specificity sel0 important position = go spec0 sel0
  where
    go spec SUniversal = spec
    go spec (SElement _) = spec { sNumElSelectors = sNumElSelectors spec + 1 }
    go spec (SClass _) = spec { sNumAttrSelectors = sNumAttrSelectors spec + 1 }
    go spec (SId _) = spec { sNumIdSelectors = sNumIdSelectors spec + 1 }
    go spec (SAttribute _ _) = spec { sNumAttrSelectors = sNumAttrSelectors spec + 1 }
    go spec (SChild selA selB) = let spec' = go spec selA in go spec' selB
    go spec (SDescendent selA selB) = let spec' = go spec selA in go spec' selB

    spec0 = Specificity
      { sIsImportant = important
      , sNumIdSelectors = 0
      , sNumAttrSelectors = 0
      , sNumElSelectors = 0
      , sSourcePosition = position
      }
```

An interesting property of specificity, which I’m sure has confused me in the past, is that proximity doesn’t matter. In this HTML:

```html
<div>
  <p><span>Hello world</span></p>
</div>
```

All of these rules match the `span`, and are all equally specific (ignoring source position):

```css
div span { color: red; }
p span { color: blue; }
p > span { color: green; }
```

And, as the universal selector doesn’t affect specificity, these rules are all equally specific (ignoring source position):

```css
span { color: aqua; }
* span { color: linen; }
* * span { color: violet; }
```

In both sets of rules, which one wins is determined by source position, nothing more. Now I see why frontend devs tend to use a lot of class and ID selectors, and not so many child or descendent selectors.

Now is the time to try out some CSS rules, and to check if Firefox agrees with me.

Here is an HTML document:

```html
<html>
  <head>
    <style>
      body { font-size: 16px; }
      h1 { font-weight: bold; }
      body > * { font-weight: normal; }
      div { color: red; }
      div span { color: green; }
      div div p span { color: black; }
      .inner { font-weight: bold !important; }
      .inner { font-weight: normal; }
      div div { background-color: black; }
      .inner { background-color: white; }
    </style>
  </head>
  <body>
    <h1>Hello, world!</h1>
    <div class="outer">
      <div class="inner">
        <p>
          <span>Line of text</span>
        </p>
      </div>
    </div>
  </body>
</html>
```

The style rules correspond to this Haskell:

```haskell
authorRules :: [Rule]
authorRules =
  [ r (SElement (ElName "body")) "font-size" "16px" False
  , r (SElement (ElName "h1")) "font-weight" "bold" False
  , r (SChild (SElement (ElName "body")) SUniversal) "font-weight" "normal" False
  , r (SElement (ElName "div")) "color" "red" False
  , r (SDescendent (SElement (ElName "div")) (SElement (ElName "span"))) "color" "green" False
  , r (SDescendent (SDescendent (SDescendent (SElement (ElName "div")) (SElement (ElName "div"))) (SElement (ElName "p"))) (SElement (ElName "span"))) "color" "black" False
  , r (SClass (AttrValue "inner")) "font-weight" "bold" True
  , r (SClass (AttrValue "inner")) "font-weight" "normal" False
  , r (SDescendent (SElement (ElName "div")) (SElement (ElName "div"))) "background-color" "black" False
  , r (SClass (AttrValue "inner")) "background-color" "white" False
  ]
  where
    r sel k v = Rule sel (PropName k, PropValue v)
```

And the relevant bit of Firefox’s user-agent stylesheet is:

```haskell
userAgentRules :: [Rule]
userAgentRules =
  [ r (SElement (ElName "h1")) "font-size" "2em" False
  ]
  where
    r sel k v = Rule sel (PropName k, PropValue v)
```

I don’t have a user stylesheet:

```haskell
userRules :: [Rule]
userRules = []
```

A bit verbose…

We can now get the computed style for every element:

```haskell
computedStyles :: [M.Map PropName PropValue]
computedStyles = map (cascade userAgentRules userRules authorRules) paths
  where
    paths = map NE.fromList
      [ [e "body" []]
      , [e "body" [], e "h1" []]
      , [e "body" [], e "div" [("class", "outer")]]
      , [e "body" [], e "div" [("class", "outer")], e "div" [("class", "inner")]]
      , [e "body" [], e "div" [("class", "outer")], e "div" [("class", "inner")], e "p" []]
      , [e "body" [], e "div" [("class", "outer")], e "div" [("class", "inner")], e "p" [], e "span" []]
      ]

    e n kvs = Element (ElName n) (M.fromList [(AttrName k, AttrValue v) | (k, v) <- kvs])
```

Which gives these styles:

```html
<html>
  <body style="font-size: 16px">
    <h1 style="font-size: 2em; font-weight: normal">Hello, world!</h1>
    <div style="color: red; font-weight: normal">
      <div style="background-color: white; color: red; font-weight: bold">
        <p>
          <span style="color: black">Line of text</span>
        </p>
      </div>
    </div>
  </body>
</html>
```

Which Firefox agrees with!

Here are a pair of constraints from the Solving Scheduling Problems with Integer Linear Programming memo about fairly distributing rota assignments amongst the available people:

\[ \forall t \in \mathcal T \text{, } \forall p \in \mathcal P \text{, } \forall r \in \mathcal R \text{, } X_p \geqslant A_{tpr} \]

\[ \forall p \in \mathcal P \text{, } X_p \leqslant \sum_{t \in \mathcal T} \sum_{r \in \mathcal R} A_{tpr} \]

It’s a bit dense, but it only has the necessary information. Now here’s the corresponding Python:

```python
for slot in range(slots):
    for person in people:
        for role in roles:
            problem += is_assigned[person] >= assignments[slot, person, role]

for person in people:
    problem += is_assigned[person] <= pulp.lpSum(
        assignments[slot, person, role]
        for slot in range(slots)
        for role in roles
    )
```

That’s a bit more verbose: it takes up more space, and we’ve got this `pulp.lpSum` thing and this mysterious `problem` variable. I’d prefer to be able to write an ASCII equivalent of the mathematical form, and have the Python generated for me.

The source file for this memo is Literate Haskell; you can load it directly into GHCi. So here’s the necessary ceremony:

```haskell
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE GADTs #-}

import Control.Monad.Trans.Class
import Control.Monad.Trans.Except
import Control.Monad.Trans.Reader
import Control.Monad.Trans.State
import Data.Foldable (for_)
import Data.List (intercalate, sort)
import Data.Maybe (listToMaybe)
```

Let’s begin!

My main motivation when coming up with the concrete syntax was “it should look like the maths, but be ASCII, and not be LaTeX because that would be a pain”. It should also be concise, in particular it shouldn’t be necessary to specify what type quantifiers range over (that should be inferred).

I’m going to use the basic rota generator described in the other memo as a running example.

An ILP language is, necessarily, not very expressive. So I decided on the following types:

- User-defined types (sets of values)
- Predicates
- Integers
- N-dimensional binary arrays
- N-dimensional integer arrays (I’ve not implemented these because the example didn’t need them)

Predicates take parameters and arrays take indices; the types of these will also be modelled. A predicate which takes two integers is a different type to a predicate which takes an integer and a set-value.

We also want to distinguish between “parameter” variables, which the user of the model will supply, and “model” variables, which the model will solve for.

Here’s an example with some comments:

```
-- Define three new types
type TimeSlot, Person, Role

-- M is an input of type integer
param integer M

-- is_leave is an input of type (TimeSlot, Person) -> bool
param predicate is_leave(TimeSlot, Person)

-- A is a 3D binary array the solver will try to produce
model binary A[TimeSlot, Person, Role]

-- X is a 1D binary array (or "vector" if you like special-case
-- terminology...) the solver will try to produce
model binary X[Person]
```

Now we have our constraints and objective function:

```
-- In every time slot, each role is assigned to exactly one person
forall t, r; sum{p} A[t,p,r] = 1

-- Nobody is assigned multiple roles in the same time slot
forall t, p; sum{r} A[t,p,r] <= 1

-- Nobody is assigned a role in a slot they are on leave for
forall p, t if is_leave(t,p), r; A[t,p,r] = 0

-- Nobody works too many shifts
forall p; sum{t, r} A[t,p,r] <= M

-- Assignments are fairly distributed
forall t, p, r; X[p] >= A[t,p,r]
forall p; X[p] <= sum{t, r} A[t,p,r]

maximise sum{p} X[p]
```

Look how concise they are! They don’t reference any types either!

Even though `forall` and `sum` are conceptually similar (bring a new variable into scope and do some sort of quantification), I picked different syntax for them because `forall` introduces multiple actual constraints: one for each value of the user-defined type being quantified over. The `forall` quantifier is part of the meta-language; the `sum` quantifier is part of the ILP language.

Let’s talk abstract syntax. I didn’t want to write a parser, so we’ll skip over that. We’re now getting to our first real bit of Haskell code.

In the concrete syntax there’s only one `forall`, and it’s followed by a list of variables to quantify over and then the constraint. To simplify the implementation, the abstract-syntax `CForall` has exactly one variable, and can either be followed by another `CForall` or a `CCheck` (the bit like `sum{r} A[t,p,r] <= 1`).

A `CForall` also contains an optional predicate restriction, which is expressed as the predicate’s name followed by the list of arguments.

```haskell
type Name = String

data Constraint a
  = CForall (TypedName a) (Maybe (Name, [Name])) (Constraint a)
  | CCheck Op (Expression a) (Expression a)
  deriving Eq

data Op = OEq | OLt | OGt | OLEq | OGEq
  deriving Eq
```

We’ll talk about the `TypedName` bit in the next section, but it’s essentially the name of the quantifier variable. Here are some examples, where `E1` and `E2` are placeholders for expressions:

**concrete:** `E1 <= E2`
**abstract:** `CCheck OLEq E1 E2`

**concrete:** `forall x; E1 = E2`
**abstract:** `CForall (Untyped "x") Nothing (CCheck OEq E1 E2)`

**concrete:** `forall x, y if p(x, y); E1 < E2`
**abstract:** `CForall (Untyped "x") Nothing (CForall (Untyped "y") (Just ("p", ["x", "y"])) (CCheck OLt E1 E2))`

The expression language is a bit richer; there are more forms of expressions than there are constraints:

```haskell
data Expression a
  = ESum (TypedName a) (Expression a)
  | EIndex Name [Name]
  | EVar Name
  | EConst Integer
  | EMul Integer (Expression a)
  | EAdd (Expression a) (Expression a)
  deriving Eq
```

Here are some more examples:

**concrete:** `1 + 2`
**abstract:** `EAdd (EConst 1) (EConst 2)`

**concrete:** `X[y,z]`
**abstract:** `EIndex "X" ["y", "z"]`

**concrete:** `3 * sum{i} X[i]`
**abstract:** `EMul 3 (ESum (Untyped "i") (EIndex "X" ["i"]))`

Reading terms expressed in this abstract syntax would be a bit of a pain, so here’s some pretty-printing:

```haskell
instance Show (Constraint a) where
  show (CForall tyname (Just (rname, rargs)) c) =
    "forall " ++ show tyname ++ " if " ++ rname ++ "(" ++ strings rargs ++ "); " ++ show c
  show (CForall tyname _ c) = "forall " ++ show tyname ++ "; " ++ show c
  show (CCheck op expr1 expr2) = show expr1 ++ " " ++ show op ++ " " ++ show expr2

instance Show (Expression a) where
  show (ESum tyname expr) = "sum{" ++ show tyname ++ "} " ++ show expr
  show (EIndex name args) = name ++ "[" ++ strings args ++ "]"
  show (EVar name) = name
  show (EConst i) = show i
  show (EMul i expr) = show i ++ " * " ++ show expr
  show (EAdd expr1 expr2) = show expr1 ++ " + " ++ show expr2

instance Show Op where
  show OEq = "="
  show OLt = "<"
  show OGt = ">"
  show OLEq = "<="
  show OGEq = ">="
```

It looks like this:

```
λ> CForall (Untyped "x") Nothing (CForall (Untyped "y") (Just ("p", ["x", "y"])) (CCheck OLt (EMul 3 (ESum (Untyped "i") (EIndex "X" ["i"]))) (EConst 10)))
forall x; forall y if p(x, y); 3 * sum{i} X[i] < 10
```

The `strings` helper function used in `CForall` and `EIndex` just comma-separates a list of strings:

```haskell
strings :: [String] -> String
strings = intercalate ", "
```

Previously, I said we would have these types:

- User-defined types
- Predicates
- Integers
- N-dimensional binary arrays
- N-dimensional integer arrays (not actually implemented)

And we also need to distinguish between “parameter” variables and “model” variables. ILP solvers only operate on matrices, so actually what we have are three parameter types:

- User-defined types
- Predicates
- Integers

And two model types:

- N-dimensional binary arrays
- N-dimensional integer arrays

```haskell
data Ty
  = ParamCustom Name
  | ParamInteger
  | ParamPredicate [Ty]
  | ModelBinary [Ty]
  deriving Eq
```

Remember the `TypedName` in the constraint and expression abstract syntax? It was used wherever a new name was brought into scope: `CForall` and `ESum`. A `TypedName` is either a `Name` by itself or a `Name` associated with a `Ty`:

```haskell
data IsTyped
data IsUntyped

data TypedName a where
  Untyped :: Name -> TypedName IsUntyped
  Typed :: Name -> Ty -> TypedName IsTyped

instance Eq (TypedName a) where
  Untyped n1 == Untyped n2 = n1 == n2
  Typed n1 ty1 == Typed n2 ty2 = n1 == n2 && ty1 == ty2
```

When generating code, we’ll need to know which types are being quantified over. So the type checker will fill in the types as it goes, turning our *untyped* expressions and constraints into *typed* expressions and constraints.

```haskell
type UntypedConstraint = Constraint IsUntyped
type UntypedExpression = Expression IsUntyped

type TypedConstraint = Constraint IsTyped
type TypedExpression = Expression IsTyped
```

And let’s add some pretty-printing for types too:

```haskell
instance Show Ty where
  show (ParamCustom name) = "param<" ++ name ++ ">"
  show ParamInteger = "param<integer>"
  show (ParamPredicate args) = "param<predicate(" ++ strings (map show args) ++ ")>"
  show (ModelBinary args) = "model<binary[" ++ strings (map show args) ++ "]>"

instance Show (TypedName a) where
  show (Untyped name) = name
  show (Typed name ty) = show ty ++ " " ++ name
```

Our running example is the set of basic rota constraints from the other memo. We’ve already seen the concrete syntax, here’s the abstract syntax:

```haskell
type Binding = (Name, Ty)

globals :: [Binding]
globals =
  [ ("M", ParamInteger)
  , ("is_leave", ParamPredicate [ParamCustom "TimeSlot", ParamCustom "Person"])
  , ("A", ModelBinary [ParamCustom "TimeSlot", ParamCustom "Person", ParamCustom "Role"])
  , ("X", ModelBinary [ParamCustom "Person"])
  ]

constraints :: [UntypedConstraint]
constraints =
  [ -- In every time slot, each role is assigned to exactly one person
    CForall (Untyped "t") Nothing (CForall (Untyped "r") Nothing (CCheck OEq (ESum (Untyped "p") (EIndex "A" ["t", "p", "r"])) (EConst 1)))
    -- Nobody is assigned multiple roles in the same time slot
  , CForall (Untyped "t") Nothing (CForall (Untyped "p") Nothing (CCheck OLEq (ESum (Untyped "r") (EIndex "A" ["t", "p", "r"])) (EConst 1)))
    -- Nobody is assigned a role in a slot they are on leave for
  , CForall (Untyped "p") Nothing (CForall (Untyped "t") (Just ("is_leave", ["t", "p"])) (CForall (Untyped "r") Nothing (CCheck OEq (EIndex "A" ["t", "p", "r"]) (EConst 0))))
    -- Nobody works too many shifts
  , CForall (Untyped "p") Nothing (CCheck OLEq (ESum (Untyped "t") (ESum (Untyped "r") (EIndex "A" ["t", "p", "r"]))) (EVar "M"))
    -- Assignments are fairly distributed
  , CForall (Untyped "t") Nothing (CForall (Untyped "p") Nothing (CForall (Untyped "r") Nothing (CCheck OGEq (EIndex "X" ["p"]) (EIndex "A" ["t", "p", "r"]))))
  , CForall (Untyped "p") Nothing (CCheck OLEq (EIndex "X" ["p"]) (ESum (Untyped "t") (ESum (Untyped "r") (EIndex "A" ["t", "p", "r"]))))
  ]
```

That’s pretty verbose, even more so than the Python! Good thing I’d write a parser for this if I were doing it for real.

This is the hairy bit of the memo. I’ve not gone for any particular type inference algorithm; I just did the straightforward thing for the syntax and types I had.

We’ll use a monad stack for the type checker:

type TcFun = ReaderT [Binding]      -- environment
               (StateT [Binding]    -- unresolved free variables
                 (Except String))   -- error message

To get a feel for how `TcFun` is useful, let’s go through some utility functions.

**Type errors:**

typeError :: String -> TcFun a
typeError = lift . lift . throwE

eExpected :: String -> Name -> Maybe Ty -> TcFun a
eExpected eTy name (Just aTy) = typeError $
  "Expected " ++ eTy ++ " variable, but '" ++ name ++ "' is " ++ show aTy ++ " variable."
eExpected eTy name Nothing = typeError $
  "Expected " ++ eTy ++ " variable, but could not infer a type for '" ++ name ++ "'."

Throwing a type error is pretty important, so we’ll need a function for that, and for one of the more common errors.

**Looking up the type of a name (if it’s bound):**

getTy :: Name -> TcFun (Maybe Ty)
getTy name = lookup name <$> ask

The bindings in the state are just to keep track of free variables, and are not used when checking something’s type.

**Running a subcomputation with a name removed from the environment:**

withoutBinding :: Name -> TcFun a -> TcFun a
withoutBinding name = withReaderT (remove name)

For example, if we have a global `x` and a constraint `forall x; A[x]`, the `x` in `A[x]` is not the global `x`; it’s the `x` bound by the `forall`. Don’t worry, it’s only removed while typechecking the body of the `CForall` or `ESum` which introduced the new binding.

**Asserting a variable has a type:**

assertType :: Name -> Ty -> TcFun ()
assertType name eTy = getTy name >>= \case
  Just aTy
    | eTy == aTy -> pure ()
    | otherwise -> eExpected (show eTy) name (Just aTy)
  Nothing -> lift $ modify ((name, eTy):)

Takes a name and an expected type, and checks that any pre-existing binding matches. If there is no pre-existing binding, the name is introduced as a free variable.

**Removing a free variable from the state:**

delFree :: Name -> TcFun Ty
delFree name = lookup name <$> lift get >>= \case
  Just ty -> do
    lift $ modify (remove name)
    pure ty
  Nothing -> typeError $ "Could not infer a type for '" ++ name ++ "'."

Looks up the type of a free variable, removes the variable from the state, and returns the type. If we fail to find a type for the variable, it’s unused, which is a type error (as we can’t infer a concrete type).

The basic idea is to walk through the abstract syntax: unify types when they arise; and for `CForall` and `ESum`, check that the inner constraint (or expression) has a free variable with the right name and type.

**Typechecking argument lists:**

Let’s start with the simplest case: type checking an argument list, which arises in quantifier predicate constraints and `EIndex`. The function takes the names of the argument variables and their expected types, and checks that the variables do have those types.

typecheckArgList :: [Name] -> [Ty] -> TcFun ()

The recursive case takes the name of the current argument and its expected type. It then looks up the actual type of the name. If it has a type, check that it’s the same as the expected type and either move onto the next parameter or throw an error. If there is no binding, `assertType` records it as a free variable.

typecheckArgList (name:ns) (expectedTy:ts) = do
  assertType name expectedTy
  typecheckArgList ns ts

Ultimately the `typecheckExpression` and `typecheckConstraint` functions we’ll get to later will make sure all these free variables are bound by a `forall` or a `sum`.

The base case is when we run out of argument names or types; there should be the same number of each:

typecheckArgList [] [] = pure ()
typecheckArgList ns [] = typeError $ "Expected " ++ show (length ns) ++ " fewer arguments."
typecheckArgList [] ts = typeError $ "Expected " ++ show (length ts) ++ " more arguments."

**Typechecking expressions:**

Expressions have a few different parts, so let’s go through them one at a time.

typecheckExpression :: UntypedExpression -> TcFun TypedExpression
typecheckExpression e0 = decorate e0 (go e0)
  where

The `decorate` function, defined further below, appends the pretty-printed expression to any error message. So by `decorate`ing every recursive call, we get an increasingly wide view of the error. Like this:

Found variable 'x' at incompatible types param<integer> and param<index>.
  in x
  in A[x] = x
  in forall x; A[x] = x

`ESum` introduces a new binding. The way I’ve handled this is by *unbinding* the name (in case there was something with the same name from a wider scope), type-checking the inner expression, and then (1) asserting that there is a free variable with the name of the bound variable and (2) storing its type.

go (ESum (Untyped name) expr) = do
  expr' <- withoutBinding name $ typecheckExpression expr
  delFree name >>= \case
    ty@(ParamCustom _) -> pure (ESum (Typed name ty) expr')
    aTy -> eExpected "param<$custom>" name (Just aTy)

`EIndex` requires checking an argument list. I’m not allowing quantifying over model variables, so in the expression `EIndex name args`, `name` *must* refer to a global. All globals are of known types, so we can look up the type of the argument list from the global environment.

go (EIndex name args) = getTy name >>= \case
  Just (ModelBinary argtys) -> do
    typecheckArgList args argtys
    pure (EIndex name args)
  aTy -> eExpected "model<binary(_)>" name aTy

`EVar` uses a variable directly, in which case the variable *must* be an integer. This is handled by looking for a binding and, if there isn’t one, introducing a new free variable.

go (EVar name) = do
  assertType name ParamInteger
  pure (EVar name)

`EConst`, `EMul`, and `EAdd` are pretty simple and just involve recursive calls to `typecheckExpression`.

go (EConst k) = pure (EConst k)
go (EMul k expr) = do
  expr' <- typecheckExpression expr
  pure (EMul k expr')
go (EAdd expr1 expr2) = do
  expr1' <- typecheckExpression expr1
  expr2' <- typecheckExpression expr2
  pure (EAdd expr1' expr2')

The input to `typecheckExpression` is an `UntypedExpression` and the output is a `TypedExpression`. We get there by rewriting `ESum` constructs to contain the inferred type of the quantifier variable. This will be useful when generating code.

**Typechecking constraints:**

Typechecking a constraint is pretty much the same as typechecking an expression. `CForall` is like `ESum`, `CCheck` is like `EAdd`. The only new thing is that a `CForall` can have a predicate constraint… but that’s typechecked in the same way as an `EIndex`: get the argument types of the predicate from the environment, and check that against the argument variables.

Here it is:

typecheckConstraint :: UntypedConstraint -> TcFun TypedConstraint
typecheckConstraint c0 = decorate c0 (go c0)
  where
    go (CForall (Untyped name) (Just (rname, rargs)) c) = getTy rname >>= \case
      Just (ParamPredicate argtys) -> do
        typecheckArgList rargs argtys
        c' <- withoutBinding name $ typecheckConstraint c
        ty <- delFree name
        pure (CForall (Typed name ty) (Just (rname, rargs)) c')
      aTy -> eExpected "param<predicate(_)>" rname aTy
    go (CForall (Untyped name) Nothing c) = do
      c' <- withoutBinding name $ typecheckConstraint c
      ty <- delFree name
      pure (CForall (Typed name ty) Nothing c')
    go (CCheck op expr1 expr2) = do
      expr1' <- typecheckExpression expr1
      expr2' <- typecheckExpression expr2
      pure (CCheck op expr1' expr2')

While `typecheckConstraint` works, it leaves something to be desired. Here’s a slightly nicer interface:

typecheckConstraint_ :: [Binding] -> UntypedConstraint -> Either String TypedConstraint
typecheckConstraint_ env0 c0 =
    check =<< runExcept (runStateT (runReaderT (typecheckConstraint c0) env0) [])
  where
    check (c, []) = Right c
    check (_, free) = Left ("Unbound free variables: " ++ strings (sort (map fst free)) ++ ".")

This:

- Takes the global bindings as an argument.
- Does away with the `TcFun`, returning a plain `Either`.
- Checks that no free variables leak out.

Some utility functions used above are:

remove :: Eq a => a -> [(a, b)] -> [(a, b)]
remove a = filter ((/=a) . fst)

decorate :: Show a => a -> TcFun b -> TcFun b
decorate e = goR
  where
    goR m = ReaderT (goS . runReaderT m)
    goS m = StateT (goE . runStateT m)
    goE = withExcept (\err -> err ++ "\n  in " ++ show e)

Here’s a little function to print out the inferred type, or type error, for all of our constraints from the running example:

demoTypeInference :: IO ()
demoTypeInference = for_ constraints $ \constraint -> do
  case typecheckConstraint_ globals constraint of
    Right c' -> print c'
    Left err -> putStrLn err
  putStrLn ""

Behold!

λ> demoTypeInference
forall param<TimeSlot> t; forall param<Role> r; sum{param<Person> p} A[t, p, r] = 1

forall param<TimeSlot> t; forall param<Person> p; sum{param<Role> r} A[t, p, r] <= 1

forall param<Person> p; forall param<TimeSlot> t if is_leave(t, p); forall param<Role> r; A[t, p, r] = 0

forall param<Person> p; sum{param<TimeSlot> t} sum{param<Role> r} A[t, p, r] <= M

forall param<TimeSlot> t; forall param<Person> p; forall param<Role> r; X[p] >= A[t, p, r]

forall param<Person> p; X[p] <= sum{param<TimeSlot> t} sum{param<Role> r} A[t, p, r]

Looks pretty good, all types are inferred as they should be.

Here’s a broken example, which arose when I mistyped one of the constraints:

λ> either putStrLn print $ typecheckConstraint_ globals
     (CForall (Untyped "t") Nothing
       (CForall (Untyped "p") Nothing
         (CForall (Untyped "r") Nothing
           (CCheck OLEq
             (EIndex "X" ["p"])
             (ESum (Untyped "t") (ESum (Untyped "r") (EIndex "A" ["t", "p", "r"])))))))
Could not infer a type for 'r'.
  in forall r; X[p] <= sum{t} sum{r} A[t, p, r]
  in forall p; forall r; X[p] <= sum{t} sum{r} A[t, p, r]
  in forall t; forall p; forall r; X[p] <= sum{t} sum{r} A[t, p, r]

I’d added extra `forall t` and `forall r` quantifiers, which are wrong because those variables are bound by `sum`s. So the types of the `forall`-bound variables can’t be inferred.

I don’t want to write (or learn) bindings to ILP solvers: that sounds like a pain, and I already know PuLP. So what I do want to do is generate the PuLP-using Python code the abstract syntax corresponds to.

Most of `codegenExpression`, which produces a Python expression, is straightforward:

codegenExpression :: TypedExpression -> String
codegenExpression (EIndex name args) = name ++ "[" ++ strings args ++ "]"
codegenExpression (EVar name) = name
codegenExpression (EConst i) = show i
codegenExpression (EMul i expr) = show i ++ " * " ++ codegenExpression expr
codegenExpression (EAdd expr1 expr2) =
  "(" ++ codegenExpression expr1 ++ " + " ++ codegenExpression expr2 ++ ")"

The complex bit is handling `ESum`, which introduces a generator expression, and multiple `ESum`s are collapsed:

codegenExpression (ESum tyname0 expr0) = go [tyname0] expr0
  where
    go vars (ESum tyname expr) = go (tyname:vars) expr
    go vars e = "pulp.lpSum(" ++ codegenExpression e ++ " " ++ go' (reverse vars) ++ ")"

    go' [] = ""
    go' (Typed name (ParamCustom ty):vs) =
      let code = "for " ++ name ++ " in " ++ ty
      in if null vs then code else code ++ " " ++ go' vs

I’m making some assumptions about how variables and types are represented in Python:

I assume all names are valid Python identifiers, eg:

**abstract:** `EMul 3 (EIndex "A" ["i"])`
**code:** `3 * A[i]`

I assume user-defined types correspond to Python iterables, eg:

**abstract:** `ESum (Typed "x" (ParamCustom "X")) expr`
**code:** `pulp.lpSum(expr for x in X)`

These aren’t checked. Assumption (1) could be handled by restricting the characters in names (eg, to alphanumeric only). Assumption (2) would be handled if I were to implement the full abstract syntax, as generated code would be put in a function which takes all the parameter variables as arguments, and which creates the model variables. But this memo only implements expressions and constraints.
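As a sketch of how assumption (1) could be checked, here’s a small predicate which only accepts names that are also plausible Python identifiers. The `isSafeName` helper is mine, not part of the memo’s code, and it deliberately ignores Python keywords:

```haskell
import Data.Char (isAlpha, isAlphaNum)

-- Accept only names that start with a letter and continue with
-- letters, digits, or underscores. Python keywords (like "for")
-- would still slip through; this is illustrative, not exhaustive.
isSafeName :: String -> Bool
isSafeName (c:cs) = isAlpha c && all (\x -> isAlphaNum x || x == '_') cs
isSafeName []     = False
```

A parser for the concrete syntax, once one exists, could reject any name failing this check before it ever reaches code generation.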

Generating code for constraints is nothing surprising; the only slight complication is making sure the indentation works out when there are nested `CForall`s:

codegenConstraint :: TypedConstraint -> String
codegenConstraint = unlines . go
  where
    go (CForall (Typed name (ParamCustom ty)) (Just (rname, rargs)) c) =
      [ "for " ++ name ++ " in " ++ ty ++ ":"
      , "    if not " ++ rname ++ "(" ++ strings rargs ++ "):"
      , "        continue"
      ] ++ indent (go c)
    go (CForall (Typed name (ParamCustom ty)) _ c) =
      [ "for " ++ name ++ " in " ++ ty ++ ":"
      ] ++ indent (go c)
    go (CCheck op expr1 expr2) =
      let e1 = codegenExpression expr1
          e2 = codegenExpression expr2
      in ["problem += " ++ e1 ++ " " ++ cgOp op ++ " " ++ e2]

    cgOp OEq = "=="
    cgOp op = show op

    indent = map ("    "++)

Here’s a little function to print out the generated code, or type error, for all of our constraints from the running example:

demoCodeGen :: IO ()
demoCodeGen = for_ constraints $ \constraint -> do
  case typecheckConstraint_ globals constraint of
    Right c' -> do
      putStrLn ("# " ++ show c')
      putStrLn (codegenConstraint c')
    Left err -> do
      putStrLn err
      putStrLn ""

Behold, again!

λ> demoCodeGen
# forall param<TimeSlot> t; forall param<Role> r; sum{param<Person> p} A[t, p, r] = 1
for t in TimeSlot:
    for r in Role:
        problem += pulp.lpSum(A[t, p, r] for p in Person) == 1

# forall param<TimeSlot> t; forall param<Person> p; sum{param<Role> r} A[t, p, r] <= 1
for t in TimeSlot:
    for p in Person:
        problem += pulp.lpSum(A[t, p, r] for r in Role) <= 1

# forall param<Person> p; forall param<TimeSlot> t if is_leave(t, p); forall param<Role> r; A[t, p, r] = 0
for p in Person:
    for t in TimeSlot:
        if not is_leave(t, p):
            continue
        for r in Role:
            problem += A[t, p, r] == 0

# forall param<Person> p; sum{param<TimeSlot> t} sum{param<Role> r} A[t, p, r] <= M
for p in Person:
    problem += pulp.lpSum(A[t, p, r] for t in TimeSlot for r in Role) <= M

# forall param<TimeSlot> t; forall param<Person> p; forall param<Role> r; X[p] >= A[t, p, r]
for t in TimeSlot:
    for p in Person:
        for r in Role:
            problem += X[p] >= A[t, p, r]

# forall param<Person> p; X[p] <= sum{param<TimeSlot> t} sum{param<Role> r} A[t, p, r]
for p in Person:
    problem += X[p] <= pulp.lpSum(A[t, p, r] for t in TimeSlot for r in Role)

We’ve come to the end of my little language for defining ILP problems, but there is still more to be done if this were to become a fully-fledged language people could use. Here are some missing bits:

- A parser for the concrete syntax.
- More integer operations:
  - Integer ranges, in addition to user-defined set types, for `forall` and `sum`.
  - Arithmetic on integer indices.
  - Comparisons, in addition to predicate functions, in `forall` guards.
- The rest of the abstract syntax (along with typechecking and code generation): integer matrices, objective functions, and type and variable declarations.

For example, in the GOV.UK support rota, one of the constraints is that someone can’t be on support in two adjacent weeks. With integer ranges and arithmetic on integer indices, that could be expressed like so:

forall t in [1, N), p; (sum{r} A[t, p, r]) + (sum{r} A[t - 1, p, r]) <= 1

There’s also a small problem with the current abstract syntax: it’s a bit too flexible. This is not a valid ILP expression:

sum{foo} (sum{bar} A[foo, bar] + sum{baz} B[foo, baz])

Only direct `sum` nesting is permitted. There are two ways to solve this. One is to change the abstract syntax to preclude it, maybe something like this:

data Void

data TaggedExpression tag a
  = TESum !tag (SumExpression a)
  | TEIndex Name [Name]
  | TEVar Name
  | TEConst Integer
  | TEMul Integer (TaggedExpression tag a)
  | TEAdd (TaggedExpression tag a) (TaggedExpression tag a)

data SumExpression a
  = SENest (TypedName a) (SumExpression a)
  | SEBreak (TaggedExpression Void a)

A `TaggedExpression Void` can’t contain any more `TESum` constructors, because the `Void` type is uninhabited. Another option is to add a check, between parsing and typechecking, that there are no invalidly nested `ESum`s.
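That second option can be sketched in a few lines. The expression type below is a simplified stand-in for the memo’s (the binder is a plain `Name` rather than a `TypedName`), and the `directSumsOnly` name is mine. The rule it enforces is deliberately strict: `ESum`s may only appear as a directly nested chain at the head of an expression.

```haskell
type Name = String

-- A simplified version of the memo's expression type, reproduced
-- here so the check is self-contained.
data Expression
  = ESum Name Expression
  | EIndex Name [Name]
  | EVar Name
  | EConst Integer
  | EMul Integer Expression
  | EAdd Expression Expression

-- Walk past the outer chain of sums, then insist that no further
-- sums appear anywhere below.
directSumsOnly :: Expression -> Bool
directSumsOnly (ESum _ e) = directSumsOnly e
directSumsOnly e = noSums e
  where
    noSums (ESum _ _)   = False
    noSums (EMul _ e')  = noSums e'
    noSums (EAdd e1 e2) = noSums e1 && noSums e2
    noSums _            = True
```

Run between parsing and typechecking, this accepts `sum{t} sum{r} A[t, r]` but rejects the invalid expression above.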

I’m pleased to announce a new super-major release of dejafu, a library for testing concurrent Haskell programs.

While there are breaking changes, common use-cases shouldn’t be affected too significantly (or not at all). There is a brief guide to the changes, and how to migrate if necessary, on the website.

dejafu is a unit-testing library for concurrent Haskell programs. Tests are deterministic, and work by systematically exploring the possible schedules of your concurrency-using test case, allowing you to confidently check your threaded code.

HUnit and Tasty bindings are available.

dejafu requires your test case to be written against the `MonadConc` typeclass from the concurrency package. This is a necessity: dejafu cannot peek inside your `IO` or `STM` actions, so it needs to be able to plug in an alternative implementation of the concurrency primitives for testing. There is some guidance for how to switch from `IO` code to `MonadConc` code on the website.

If you really need `IO`, you can use `MonadIO`, but make sure it’s deterministic enough to not invalidate your tests!

Here’s a small example reproducing a deadlock found in an earlier version of the auto-update library:

> :{
autocheck $ do
  auto <- mkAutoUpdate defaultUpdateSettings
  auto
:}
[fail] Successful
    [deadlock] S0--------S1-----------S0-
[fail] Deterministic
    [deadlock] S0--------S1-----------S0-
    () S0--------S1--------p0--

dejafu finds the deadlock, and gives a simplified execution trace for each distinct result. More in-depth traces showing exactly what each thread did are also available. This is using a version of auto-update modified to use the `MonadConc` typeclass. The source is in the dejafu testsuite.

The highlights for this release are setup actions, teardown actions, and invariants:

- **Setup actions** are for things which are not really a part of your test case, but which are needed for it (for example, setting up a test distributed system). As dejafu can run a single test case many times, repeating this work can be a significant overhead. By defining this as a setup action, dejafu can “snapshot” the state at the end of the action, and efficiently reload it in subsequent executions of the same test.
- **Teardown actions** are for things you want to run after your test case completes, in all cases, even if the test deadlocks (for example). As dejafu controls the concurrent execution of the test case, inspecting shared state is possible even if the test case fails to complete.
- **Invariants** are effect-free atomically-checked conditions over shared state which must always hold. If an invariant throws an exception, the test case is aborted, and any teardown action run.

Here is an example of a setup action with an invariant:

> :{
autocheck $
  let setup = do
        var <- newEmptyMVar
        registerInvariant $ do
          value <- inspectMVar var
          when (value == Just 1) $ throwM Overflow
        pure var
  in withSetup setup $ \var -> do
       fork $ putMVar var 0
       fork $ putMVar var 1
       tryReadMVar var
:}
[fail] Successful
    [invariant failure] S0--P2-
[fail] Deterministic
    [invariant failure] S0--P2-
    Nothing S0----
    Just 0 S0--P1--S0--

In the `[invariant failure]` case, thread 2 is scheduled, writing the forbidden value “1” to the MVar, which terminates the test.

Here is an example of a setup action with a teardown action:

> :{
autocheck $
  let setup = newMVar ()
      teardown var (Right _) = show <$> tryReadMVar var
      teardown _ (Left e) = pure (show e)
  in withSetupAndTeardown setup teardown $ \var -> do
       fork $ takeMVar var
       takeMVar var
:}
[pass] Successful
[fail] Deterministic
    "Nothing" S0---
    "Deadlock" S0-P1--S0-

The teardown action can perform arbitrary concurrency effects, including inspecting any mutable state returned by the setup action.

Setup and teardown actions were previously available in a slightly different form as the `dontCheck` and `subconcurrency` functions, which have been removed (see the migration guide if you used these).

Haskell typeclass instances have two parts: some *constraints*, and the *instance head*:

newtype WrappedFunctor f a = WrappedFunctor (f a)

instance Functor f => Functor (WrappedFunctor f) where
--       ^^^^^^^^^ constraints
--                    ^^^^^^^^^^^^^^^^^^^^^^^^^^ head
  fmap f (WrappedFunctor fa) = WrappedFunctor (fmap f fa)

More specifically, the head is of the form `C (T a1 ... an)`, where `C` is the class, `T` is a type constructor, and `a1 ... an` are distinct type variables. (The `FlexibleInstances` extension relaxes this restriction a little, allowing some (or all) of the `a1 ... an` to be arbitrary types, as well as type variables.)

When the type checker needs to find an instance, it does so purely based on the head; constraints don’t come into it. The instance above means “whenever you use `WrappedFunctor f` as a functor, *regardless of what f is and even if we don’t know what it is yet*, then you can use this instance”, and a type error will be thrown if whatever concrete type `f` is instantiated to doesn’t in fact have a functor instance.

You might think that, if we didn’t define the instance above and instead defined this one:

instance Functor (WrappedFunctor Maybe) where
  fmap f (WrappedFunctor fa) = WrappedFunctor (fmap f fa)

…and then used a `WrappedFunctor f` as a functor, that the type checker would infer `f` must be `Maybe`. This is not so! Typeclass inference happens under an “open world” approach: just because only one instance is known *at this time* doesn’t mean there won’t be a second instance discovered later. Prematurely selecting the instance for `WrappedFunctor Maybe` could be unsound.

In GHC Haskell, we can express a constraint that two types have to be equal. For example, this is a weird way to check that two values are equal:

-- this requires GADTs or TypeFamilies
funnyEq :: (Eq a, a ~ b) => a -> b -> Bool
funnyEq = (==)

We only have a constraint `Eq a`, not `Eq b`, but because of the `a ~ b` constraint, the type checker knows that they’re the same type:

> funnyEq 'a' 'b'
False

> funnyEq True True
True

> funnyEq True 'b'
<interactive>:22:1: error:
    • Couldn't match type ‘Bool’ with ‘Char’
        arising from a use of ‘funnyEq’
    • In the expression: funnyEq True 'b'
      In an equation for ‘it’: it = funnyEq True 'b'

Let’s put the two together. Throw away the two instances we defined above, and look at this one:

instance (f ~ Maybe) => Functor (WrappedFunctor f) where
  fmap f (WrappedFunctor fa) = WrappedFunctor (fmap f fa)

This instance means “whenever you use `WrappedFunctor f` as a functor, *regardless of what f is and even if we don’t know what it is yet*, then you can use this instance”, and a type error will be thrown if `f` cannot be instantiated to `Maybe`. This is different to the instance `Functor (WrappedFunctor Maybe)`!

If we have `instance Functor (WrappedFunctor Maybe)`:

> :t fmap (+1) (WrappedFunctor (pure 3))
:: (Num b, Applicative f, Functor (WrappedFunctor f)) => WrappedFunctor f b

If we have `instance (f ~ Maybe) => Functor (WrappedFunctor f)`:

> :t fmap (+1) (WrappedFunctor (pure 3))
:: Num b => WrappedFunctor Maybe b

With the latter, we get much better type inference. The downside is that this instance overlaps any more concrete instances, so we couldn’t (for example) define an instance for `WrappedFunctor Identity` as well.

But if you only need one instance, it’s a neat trick.

Here’s a concrete example from the dejafu-2.0.0.0 branch. I’ve introduced a `Program` type, to model concurrent programs. There’s one sort of `Program`, a `Program Basic`, which can be used as a concurrency monad (a `MonadConc`) directly. The instances are defined like so:

instance (pty ~ Basic, MonadIO n) => MonadIO (Program pty n) where -- ...
instance (pty ~ Basic) => MonadTrans (Program pty) where -- ...
instance (pty ~ Basic) => MonadCatch (Program pty n) where -- ...
instance (pty ~ Basic) => MonadThrow (Program pty n) where -- ...
instance (pty ~ Basic) => MonadMask (Program pty n) where -- ...
instance (pty ~ Basic, Monad n) => MonadConc (Program pty n) where -- ...

If instead the instances had been defined for `Program Basic n`, then the type checker would have complained that the `pty` parameter is (in many cases) polymorphic, and not used these instances. This means every single use of a `Program pty n`, where `pty` was not otherwise constrained, would need a type annotation. By instead formulating the instances this way, the type checker *knows* that if you use a `Program pty n` as a `MonadConc`, then it must be a `Program Basic n`.

This has turned a potentially huge breaking change, requiring everyone who uses dejafu to add type annotations to their tests, into something which just works.

This got me thinking about *market values*. If I want to see the current market value of all my assets, I need to convert them all to the same currency, using a recent exchange rate. So I now have a script to fetch, once a day, exchange rates between £ and everything else:

P 2018-05-30 BTC £5501.58
P 2018-05-30 ETH £413.01
P 2018-05-30 LTC £87.85
P 2018-05-30 EUR £0.8775
P 2018-05-30 JPY £0.0069
P 2018-05-30 USD £0.7531
P 2018-05-30 VANEA £210.24

My script exports market values to influxdb, so I can see how the market value of my assets (in £) has changed over time. Great!

But what if I want to see the market value in a currency other than £? Like USD, for instance? The problem is that I have all these exchange rates:

But I don’t have, say, the exchange rate from EUR to USD.

Well it turns out that the reflexive-symmetric-transitive closure of that graph is just the thing I want! It looks pretty nasty with 7 currencies, so here it is with just 3:

Let’s see how to calculate those `?`s.

I could pull in a functional graph library, but the graphs I’m dealing with are so small that I may as well just implement the few operations I need myself.

A graph is essentially a function `node -> node -> Maybe label`:

import Data.Map (Map)
import qualified Data.Map as M

type Graph node label = Map node (Map node label)

We need an empty graph and, given a graph, we need to be able to add nodes and edges. As our nodes are the keys in the map, they need to be orderable.

-- | A graph with no nodes or edges.
empty :: Ord n => Graph n l
empty = M.empty

-- | Add a node to a graph.
addNode :: Ord n => n -> Graph n l -> Graph n l
addNode n = M.insertWith (\_ old -> old) n M.empty

We don’t allow duplicate edges, as that means we have two exchange rates between the same pair of currencies, which doesn’t make much sense. So adding edges is a little more involved, as the edge might already exist:

-- | Add an edge to a graph, combining edges if they exist.
--
-- If the source node doesn't exist, does not change the graph.
addEdge
  :: Ord n
  => (l -> l -> l)  -- ^ Function to combine edge labels.
  -> n              -- ^ Source node.
  -> n              -- ^ Target node.
  -> l              -- ^ New label.
  -> Graph n l
  -> Graph n l
addEdge combine from to label graph = case M.lookup from graph of
  Just edges ->
    let edges' = M.insertWith combine to label edges
    in M.insert from edges' graph
  Nothing -> graph

Ok, so we can represent our currency graph. Now we need to compute the reflexive-symmetric-transitive closure.

Reflexivity lets us go from a currency to itself:

-- | Take the reflexive closure by adding edges with the given label
-- where missing.
reflexiveClosure :: Ord n => l -> Graph n l -> Graph n l
reflexiveClosure label graph = foldr (.) id
  [ addEdge (\_ old -> old) nA nA label
  | nA <- M.keys graph
  ] graph

If we know an exchange rate from A to B, symmetry gives us an exchange rate from B to A:

-- | Take the symmetric closure by adding new edges, transforming
-- existing labels.
symmetricClosure :: Ord n => (l -> l) -> Graph n l -> Graph n l
symmetricClosure mk graph = foldr (.) id
  [ addEdge (\_ old -> old) nB nA (mk lAB)
  | (nA, edges) <- M.assocs graph
  , (nB, lAB) <- M.assocs edges
  ] graph

If we know an exchange rate from A to B, and from B to C, transitivity gives us an exchange rate from A to C:

-- | Take the transitive closure by adding new edges, combining
-- existing labels.
transitiveClosure :: (Ord n, Eq l) => (l -> l -> l) -> Graph n l -> Graph n l
transitiveClosure combine = fixEq step
  where
    fixEq f = find . iterate f
      where
        find (a1:a2:as)
          | a1 == a2 = a1
          | otherwise = find (a2:as)

    step graph = foldr (.) id
      [ addEdge (\_ old -> old) nA nC (combine lAB lBC)
      | (nA, edges) <- M.assocs graph
      , (nB, lAB) <- M.assocs edges
      , (nC, lBC) <- M.assocs (M.findWithDefault M.empty nB graph)
      ] graph

Exchange rates have three properties which we can make use of:

1. Any currency has an exchange rate with itself of 1.
2. If we have an exchange rate of `x` from A to B, then the rate from B to A is `1/x`.
3. If we have an exchange rate of `x` from A to B, and an exchange rate of `y` from B to C, then the rate from A to C is `x*y`.

So, given our graph of exchange rates, we can fill in the blanks like so:

-- | Fill in the blanks in an exchange rate graph.
completeRates :: (Ord n, Eq l, Fractional l) => Graph n l -> Graph n l
completeRates = transitiveClosure (*) . symmetricClosure (1/) . reflexiveClosure 1
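As a quick sanity check of the symmetric and transitive steps, here’s the EUR-to-USD rate worked through by hand from the £ rates above (the constant names are mine, not part of the script):

```haskell
-- From the price directives: 1 USD = £0.7531 and 1 EUR = £0.8775.
gbpPerUsd, gbpPerEur :: Double
gbpPerUsd = 0.7531
gbpPerEur = 0.8775

-- Symmetry inverts the GBP->USD edge; transitivity multiplies along
-- the path EUR -> GBP -> USD.
usdPerEur :: Double
usdPerEur = gbpPerEur * (1 / gbpPerUsd)  -- roughly 1.165
```

This is exactly the `?` edge from the three-currency picture: `completeRates` just does this for every missing pair at once.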

There’s also a fourth property we can assume in reality:

- Any two paths between the same two currencies work out to the same exchange rate.

Otherwise we could make a profit by going around in a circle, and I’m sure someone would have noticed that already and made a lot of money. In our implementation however, we can’t assume that. Exchange rates available online have limited precision, and rounding errors will introduce more problems. But in general things will be close, so it doesn’t matter too much from the perspective of getting a rough idea of our personal finances.

So now I can look at my total assets in yen and feel like a millionaire:

It’s also something that dejafu does not do.

I’m going to use the “stores are transitively visible” litmus test as a running example. Here it is:

import qualified Control.Monad.Conc.Class as C

import Test.DejaFu.Internal
import Test.DejaFu.SCT
import Test.DejaFu.SCT.Internal.DPOR
import Test.DejaFu.Types
import Test.DejaFu.Utils

storesAreTransitivelyVisible :: C.MonadConc m => m (Int, Int, Int)
storesAreTransitivelyVisible = do
  x <- C.newCRef 0
  y <- C.newCRef 0
  j1 <- C.spawn (C.writeCRef x 1)
  j2 <- C.spawn (do r1 <- C.readCRef x; C.writeCRef x 1; pure r1)
  j3 <- C.spawn (do r2 <- C.readCRef y; r3 <- C.readCRef x; pure (r2,r3))
  (\() r1 (r2,r3) -> (r1,r2,r3)) <$> C.readMVar j1 <*> C.readMVar j2 <*> C.readMVar j3

I picked this one because it’s kind of arbitrarily complex. It’s a small test, but it’s for the relaxed memory implementation, so there’s a lot going on. It’s a fairly dense test.

I’m now going to define a metric of trace complexity which I’ll justify in a moment:

complexity :: Trace -> (Int, Int, Int, Int)
complexity = foldr go (0,0,0,0)
  where
    go (SwitchTo _, _, CommitCRef _ _) (w, x, y, z) = (w+1, x+1, y, z)
    go (Start _,    _, CommitCRef _ _) (w, x, y, z) = (w+1, x, y+1, z)
    go (Continue,   _, CommitCRef _ _) (w, x, y, z) = (w+1, x, y, z+1)
    go (SwitchTo _, _, _) (w, x, y, z) = (w, x+1, y, z)
    go (Start _,    _, _) (w, x, y, z) = (w, x, y+1, z)
    go (Continue,   _, _) (w, x, y, z) = (w, x, y, z+1)

Using the `183-shrinking` branch, we can now get the first trace for every distinct result, along with its complexity:

results :: Way -> MemType -> IO ()
results way memtype = do
  let settings = set lequality (Just (==)) $ fromWayAndMemType way memtype
  res <- runSCTWithSettings settings storesAreTransitivelyVisible
  flip mapM_ res $ \(efa, trace) ->
    putStrLn (show efa ++ "\t" ++ showTrace trace ++ "\t" ++ show (complexity trace))

Here are the results for systematic testing:

λ> results (systematically defaultBounds) SequentialConsistency
Right (1,0,1)  S0------------S1---S0--S2-----S0--S3-----S0--   (0,0,7,24)
Right (0,0,1)  S0------------S2-----S1---S0---S3-----S0--     (0,0,6,24)
Right (0,0,0)  S0------------S2-P3-----S1---S0--S2----S0---   (0,1,6,23)
Right (1,0,0)  S0------------S3-----S1---S0--S2-----S0---     (0,0,6,24)

λ> results (systematically defaultBounds) TotalStoreOrder
Right (1,0,1)  S0------------S1---S0--S2-----S0--S3-----S0--   (0,0,7,24)
Right (0,0,1)  S0------------S1-P2-----S1--S0---S3-----S0--   (0,1,6,23)
Right (0,0,0)  S0------------S1-P2---P3-----S1--S0--S2--S0--- (0,2,6,22)
Right (1,0,0)  S0------------S1-P3-----S1--S0--S2-----S0---   (0,1,6,23)

λ> results (systematically defaultBounds) PartialStoreOrder
Right (1,0,1)  S0------------S1---S0--S2-----S0--S3-----S0--   (0,0,7,24)
Right (0,0,1)  S0------------S1-P2-----S1--S0---S3-----S0--   (0,1,6,23)
Right (0,0,0)  S0------------S1-P2---P3-----S1--S0--S2--S0--- (0,2,6,22)
Right (1,0,0)  S0------------S1-P3-----S1--S0--S2-----S0---   (0,1,6,23)

Pretty messy, right? Here’s the results for *random* testing:

```
λ> results (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)  S0-----P1-P0----P2-P1-P0-P3-P1-S2-P3--P0-P3-P0-P3-S2-P0-S2-P0--P2-S0-  (0,15,5,9)
Right (0,0,1)  S0-------P2-P1-P2-P0--P2-P0-P1-P0---S2-P3-P0-P2-S3---P1-S3-S0--  (0,12,5,12)
Right (1,0,0)  S0------------S3-----S1-P2-P1-P0--S2---P1-S0---  (0,4,5,20)
Right (0,0,0)  S0---------P2-P0--P3-P0-S3--P2-P3-P2--P3-S2-S1--P0----  (0,9,4,15)
λ> results (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)  S0-----P1--P0-P1-S0-P2--C-S0---P2-P3-P2--S3-P0-P3-P0---S3-P0-P3-S0-  (1,13,6,11)
Right (0,0,1)  S0----P1-P0-----P2--P0--P2-P0-S2--S3-P1-P0---S1-S3----S0--  (0,8,6,16)
Right (0,0,0)  S0--------P2-P0--P3-P2-P0-P3-P2-C-S0-S3---S2--S1-C-S1-P0----  (2,10,6,14)
λ> results (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)  S0-----P1--P0-P1-S0-P2--C-S0---P2-P3-P2--S3-P0-P3-P0---S3-P0-P3-S0-  (1,13,6,11)
Right (0,0,1)  S0----P1-P0-----P2--P0--P2-P0-S2--S3-P1-P0---S1-S3----S0--  (0,8,6,16)
Right (0,0,0)  S0--------P2-P0--P3-P2-P0-P3-P2-C-S0-S3---S2--S1-C-S1-P0----  (2,10,6,14)
```

Yikes!

The complexity metric I defined counts four things:

- The number of relaxed-memory commit actions
- The number of pre-emptive context switches
- The number of non-pre-emptive context switches
- The number of continues

I would much rather read a long trace where the only context switches are when threads block, than a short one which is rapidly jumping between threads. So, given two equivalent traces, I will always prefer the one with a lexicographically smaller complexity-tuple.
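Since Haskell's derived `Ord` for tuples is already lexicographic, this preference needs no custom comparison logic. Here is a minimal sketch (`Complexity` and `preferred` are my names, not dejafu's):

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Complexity tuples: (commits, pre-emptive switches, non-pre-emptive
-- switches, continues).  The derived Ord for tuples is lexicographic,
-- which matches the preference order described above.
type Complexity = (Int, Int, Int, Int)

-- Given equivalent traces (represented here just by a label and their
-- complexity), pick the one we would rather read.
preferred :: [(String, Complexity)] -> (String, Complexity)
preferred = minimumBy (comparing snd)
```

For example, `preferred [("messy", (0,2,6,22)), ("tidy", (0,1,6,23))]` picks the second trace: even though it has more continues, it has fewer pre-emptions, and the earlier tuple components dominate.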

The key idea underpinning trace simplification is that dejafu can tell when two scheduling decisions can be swapped without changing the behaviour of the program. I talked about this idea in the Using Hedgehog to Test Déjà Fu memo. So we can implement transformations which are guaranteed to preserve semantics *without needing to verify this by re-running the test case*.

Although we don’t need to re-run the test case at all, the `183-shrinking`

branch currently does, but only once at the end after the minimum has been found. This is because it’s easier to generate a simpler sequence of scheduling decisions and use dejafu to produce the corresponding trace than it is to produce a simpler trace directly. This is still strictly better than a typical shrinking algorithm, which would re-run the test case after *each* shrinking step, rather than only at the end.

Rather than drag this out, here’s what those random traces simplify to:

```haskell
resultsS :: Way -> MemType -> IO ()
resultsS way memtype = do
  let settings = set lsimplify True . set lequality (Just (==)) $ fromWayAndMemType way memtype
  res <- runSCTWithSettings settings storesAreTransitivelyVisible
  flip mapM_ res $ \(efa, trace) ->
    putStrLn (show efa ++ "\t" ++ showTrace trace ++ "\t" ++ show (complexity trace))
```

```
λ> resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)  S0----------P1---S2--P3-----S0---S2---S0---  (0,2,5,22)
Right (0,0,1)  S0----------P2-P1-P2-P1--S0---S2---S3-----S0---  (0,4,5,20)
Right (1,0,0)  S0------------S3-----S1---S0--S2----P0---  (0,1,5,23)
Right (0,0,0)  S0------------S3--P2-----S3---S1--P0----  (0,2,4,22)
λ> resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)  S0----------P1---S2-----S0----S3-----S0--  (0,1,5,23)
Right (0,0,1)  S0----------P1-P2-----S0--S1--S0---S3-----S0--  (0,2,6,22)
Right (0,0,0)  S0----------P2--P3-----S0--S2---S1--P0----  (0,3,4,21)
λ> resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)  S0----------P1---S2-----S0----S3-----S0--  (0,1,5,23)
Right (0,0,1)  S0----------P1-P2-----S0--S1--S0---S3-----S0--  (0,2,6,22)
Right (0,0,0)  S0----------P2--P3-----S0--S2---S1--P0----  (0,3,4,21)
```

This is much better.

There are two simplification phases: a preparation phase, which puts the trace into a normal form and prunes unnecessary commits; and an iteration phase, which repeats a step function until a fixed point is reached (or the iteration limit is).
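The shape of the iteration phase can be sketched as a generic "apply until unchanged, or until the limit is hit" combinator. This is a sketch of the control flow only; `untilFixedPoint` is my name, and dejafu's actual loop differs in its details:

```haskell
-- Repeatedly apply a step function until it stops changing its input
-- (a fixed point), or until the iteration limit runs out.
untilFixedPoint :: Eq a => Int -> (a -> a) -> a -> a
untilFixedPoint limit step = go limit
  where
    go 0 x = x                      -- iteration limit reached
    go n x =
      let x' = step x
      in if x' == x then x          -- fixed point reached
         else go (n - 1) x'
```

With `step` instantiated to one round of simplification, this terminates either because the trace stops improving or because the budget is exhausted.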

The preparation phase has two steps: first we put the trace into *lexicographic normal form*, then we prune unnecessary commits.

We put a trace in lexicographic normal form by sorting by thread ID, where only independent actions can be swapped:

```haskell
lexicoNormalForm :: MemType -> [(ThreadId, ThreadAction)] -> [(ThreadId, ThreadAction)]
lexicoNormalForm memtype = go where
  go trc =
    let trc' = bubble initialDepState trc
    in if trc == trc' then trc else go trc'

  bubble ds (t1@(tid1, ta1):t2@(tid2, ta2):trc)
    | independent ds tid1 ta1 tid2 ta2 && tid2 < tid1 = bgo ds t2 (t1 : trc)
    | otherwise = bgo ds t1 (t2 : trc)
  bubble _ trc = trc

  bgo ds t@(tid, ta) trc = t : bubble (updateDepState memtype ds tid ta) trc
```

If simplification only put traces into lexicographic normal form, we would get these results:

```
λ> resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)  S0-----------P1---S2--P0--S2--P0-P3----P0--  (0,5,3,19)
Right (0,0,1)  S0-----------P2-P1-P2-P1-P0--S2--P0-P1-S2-S3----P0--  (0,8,4,16)
Right (1,0,0)  S0------------S3----P1--P0--S1-S2----P0---  (0,3,4,21)
Right (0,0,0)  S0------------S2-P3--P2----S3--P1--P0----  (0,4,3,20)
λ> resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)  S0-------P1---S2--C-S0-----P2--P0--S2-S3----P0--  (1,5,5,19)
Right (0,0,1)  S0-----------P1-P2--P0-S1-P0-P2--P0--S1-S2-S3----P0--  (0,7,5,17)
Right (0,0,0)  S0-----------P2---P3--C-S0-S2--S3--P1-C-S1-P0----  (2,6,5,18)
λ> resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)  S0-------P1---S2--C-S0-----P2--P0--S2-S3----P0--  (1,5,5,19)
Right (0,0,1)  S0-----------P1-P2--P0-S1-P0-P2--P0--S1-S2-S3----P0--  (0,7,5,17)
Right (0,0,0)  S0-----------P2---P3--C-S0-S2--S3--P1-C-S1-P0----  (2,6,5,18)
```

These are better than they were, but we can do better still.

After putting the trace into lexicographic normal form, we delete any commit actions which are followed by any number of independent actions and then a memory barrier:

```haskell
dropCommits :: MemType -> [(ThreadId, ThreadAction)] -> [(ThreadId, ThreadAction)]
dropCommits SequentialConsistency = id
dropCommits memtype = go initialDepState where
  go ds (t1@(tid1, ta1@(CommitCRef _ _)):t2@(tid2, ta2):trc)
    | isBarrier (simplifyAction ta2) = go ds (t2:trc)
    | independent ds tid1 ta1 tid2 ta2 = t2 : go (updateDepState memtype ds tid2 ta2) (t1:trc)
  go ds (t@(tid,ta):trc) = t : go (updateDepState memtype ds tid ta) trc
  go _ [] = []
```

Such commits don’t affect the behaviour of the program at all, as all buffered writes get flushed when the memory barrier happens.

If simplification only did the preparation phase, we would get these results:

```
λ> resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)  S0-----------P1---S2--P0--S2--P0-P3----P0--  (0,5,3,19)
Right (0,0,1)  S0-----------P2-P1-P2-P1-P0--S2--P0-P1-S2-S3----P0--  (0,8,4,16)
Right (1,0,0)  S0------------S3----P1--P0--S1-S2----P0---  (0,3,4,21)
Right (0,0,0)  S0------------S2-P3--P2----S3--P1--P0----  (0,4,3,20)
λ> resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)  S0-------P1---S2--P0-----P2--P0--S2-S3----P0--  (0,5,4,19)
               ^-- better than just lexicoNormalForm
Right (0,0,1)  S0-----------P1-P2--P0-S1-P0-P2--P0--S1-S2-S3----P0--  (0,7,5,17)
Right (0,0,0)  S0-----------P2---P3--P0-S2--S3--P1--P0----  (0,5,3,19)
               ^-- better than just lexicoNormalForm
λ> resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)  S0-------P1---S2--P0-----P2--P0--S2-S3----P0--  (0,5,4,19)
               ^-- better than just lexicoNormalForm
Right (0,0,1)  S0-----------P1-P2--P0-S1-P0-P2--P0--S1-S2-S3----P0--  (0,7,5,17)
Right (0,0,0)  S0-----------P2---P3--P0-S2--S3--P1--P0----  (0,5,3,19)
               ^-- better than just lexicoNormalForm
```

The iteration phase attempts to reduce context switching by pushing actions forwards, or pulling them backwards, through the trace.

If we have the trace `[(tid1, act1), (tid2, act2), (tid1, act3)]`, where `act2` and `act3` are independent, the “pull back” transformation would re-order that to `[(tid1, act1), (tid1, act3), (tid2, act2)]`.

In contrast, if `act1` and `act2` were independent, the “push forward” transformation would re-order that to `[(tid2, act2), (tid1, act1), (tid1, act3)]`. The two transformations are almost, but not quite, opposites.

Pull-back walks through the trace and, at every context switch, looks forward to see if there is a single action of the original thread it can put before the context switch:

```haskell
pullBack :: MemType -> [(ThreadId, ThreadAction)] -> [(ThreadId, ThreadAction)]
pullBack memtype = go initialDepState where
  go ds (t1@(tid1, ta1):trc@((tid2, _):_)) =
    let ds' = updateDepState memtype ds tid1 ta1
        trc' = if tid1 /= tid2
               then maybe trc (uncurry (:)) (findAction tid1 ds' trc)
               else trc
    in t1 : go ds' trc'
  go _ trc = trc

  findAction tid0 = fgo where
    fgo ds (t@(tid, ta):trc)
      | tid == tid0 = Just (t, trc)
      | otherwise = case fgo (updateDepState memtype ds tid ta) trc of
          Just (ft@(ftid, fa), trc')
            | independent ds tid ta ftid fa -> Just (ft, t:trc')
          _ -> Nothing
    fgo _ _ = Nothing
```

Push-forward walks through the trace and, at every context switch, looks forward to see if the last action of the original thread can be put at its next execution:

```haskell
pushForward :: MemType -> [(ThreadId, ThreadAction)] -> [(ThreadId, ThreadAction)]
pushForward memtype = go initialDepState where
  go ds (t1@(tid1, ta1):trc@((tid2, _):_)) =
    let ds' = updateDepState memtype ds tid1 ta1
    in if tid1 /= tid2
       then maybe (t1 : go ds' trc) (go ds) (findAction tid1 ta1 ds trc)
       else t1 : go ds' trc
  go _ trc = trc

  findAction tid0 ta0 = fgo where
    fgo ds (t@(tid, ta):trc)
      | tid == tid0 = Just ((tid0, ta0) : t : trc)
      | independent ds tid0 ta0 tid ta = (t:) <$> fgo (updateDepState memtype ds tid ta) trc
      | otherwise = Nothing
    fgo _ _ = Nothing
```

The iteration process just repeats `pushForward memtype . pullBack memtype`.

If it only used `pullBack`, we would get these results:

```
λ> resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)  S0-----------P1---S2---P0--S2--S0-P3-----S0--  (0,3,5,21)
Right (0,0,1)  S0-----------P2-P1-P2--P1--S0--S2--S0-P3-----S0--  (0,5,5,19)
Right (1,0,0)  S0------------S3-----S1---S0--S2----P0---  (0,1,5,23)
Right (0,0,0)  S0------------S2-P3---P2----S3--S1--P0----  (0,3,4,21)
λ> resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)  S0-----------P1---S2-----S0---S3-----S0--  (0,1,5,23)
Right (0,0,1)  S0-----------P1-P2-----S0-S1--S0---S3-----S0--  (0,2,6,22)
Right (0,0,0)  S0-----------P2---P3-----S0-S2--S1--P0----  (0,3,4,21)
λ> resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)  S0-----------P1---S2-----S0---S3-----S0--  (0,1,5,23)
Right (0,0,1)  S0-----------P1-P2-----S0-S1--S0---S3-----S0--  (0,2,6,22)
Right (0,0,0)  S0-----------P2---P3-----S0-S2--S1--P0----  (0,3,4,21)
```

Without exception, iterating `pullBack` is an improvement over just doing preparation.

If it only used `pushForward`, we would get these results:

```
λ> resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)  S0-------P1---S2--P0------S2--P3----P0---  (0,4,3,20)
Right (0,0,1)  S0-------P2-P1-P2-P1-P0------S1-S2---S3----P0---  (0,6,4,18)
Right (1,0,0)  S0------------S3----P1--P0--S1-S2----P0---  (0,3,4,21)
               ^-- no improvement over preparation
Right (0,0,0)  S0------------S3--P2-----S3--P1--P0----  (0,3,3,21)
λ> resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)  S0----P1---S0---P2----P0-------S2-S3----P0--  (0,4,4,20)
Right (0,0,1)  S0-------P1-P2--P0-----S1-P2--P0---S1-S2-S3----P0--  (0,6,5,18)
Right (0,0,0)  S0----------P2--P3--P0--S2---S3--P1--P0----  (0,5,3,19)
               ^-- no improvement over preparation
λ> resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)  S0----P1---S0---P2----P0-------S2-S3----P0--  (0,4,4,20)
Right (0,0,1)  S0-------P1-P2--P0-----S1-P2--P0---S1-S2-S3----P0--  (0,6,5,18)
Right (0,0,0)  S0----------P2--P3--P0--S2---S3--P1--P0----  (0,5,3,19)
               ^-- no improvement over preparation
```

With three exceptions, where the traces didn’t change, iterating `pushForward` is an improvement over just doing preparation.

We’ve already seen the results if we combine them:

```
λ> resultsS (randomly (mkStdGen 0) 100) SequentialConsistency
Right (1,0,1)  S0----------P1---S2--P3-----S0---S2---S0---  (0,2,5,22)
Right (0,0,1)  S0----------P2-P1-P2-P1--S0---S2---S3-----S0---  (0,4,5,20)
Right (1,0,0)  S0------------S3-----S1---S0--S2----P0---  (0,1,5,23)
               ^-- same as pullBack, which is better than pushForward
Right (0,0,0)  S0------------S3--P2-----S3---S1--P0----  (0,2,4,22)
λ> resultsS (randomly (mkStdGen 0) 100) TotalStoreOrder
Right (1,0,1)  S0----------P1---S2-----S0----S3-----S0--  (0,1,5,23)
               ^-- same as pullBack, which is better than pushForward
Right (0,0,1)  S0----------P1-P2-----S0--S1--S0---S3-----S0--  (0,2,6,22)
               ^-- same as pullBack, which is better than pushForward
Right (0,0,0)  S0----------P2--P3-----S0--S2---S1--P0----  (0,3,4,21)
λ> resultsS (randomly (mkStdGen 0) 100) PartialStoreOrder
Right (1,0,1)  S0----------P1---S2-----S0----S3-----S0--  (0,1,5,23)
               ^-- same as pullBack, which is better than pushForward
Right (0,0,1)  S0----------P1-P2-----S0--S1--S0---S3-----S0--  (0,2,6,22)
               ^-- same as pullBack, which is better than pushForward
Right (0,0,0)  S0----------P2--P3-----S0--S2---S1--P0----  (0,3,4,21)
```

I think what I have right now is pretty good. It’s definitely a vast improvement over not doing any simplification.

*But*, no random traces get simplified to the corresponding systematic traces, which is a little disappointing. I think that’s because the current passes just try to reduce context switches of any form, whereas really I want to reduce pre-emptive context switches more than non-pre-emptive ones.

Trying every possible interleaving will give you, in general, an exponential blow-up of the executions you need to perform as your test case grows in size. The core testing algorithm we use, a variant of dynamic partial-order reduction (DPOR), attempts to reduce this blow-up. DPOR identifies actions which are *dependent*, and only tries interleavings which permute dependent actions.

Here are some examples:

- It doesn’t matter which order two threads execute `readMVar` for the same `MVar`. These actions are *independent*.
- It does matter which order two threads execute `putMVar` for the same `MVar`. These actions are *dependent*.
- It doesn’t matter which order two threads execute `putMVar` for different `MVar`s. These actions are *independent*.

Two actions are dependent if the order in which they are performed matters.
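The three examples can be captured in a toy relation. This is a sketch with made-up types (`Action` and `dependent'` are not dejafu's API), and it only covers the cases above; a real relation must also handle put/read pairs, blocking, and execution history:

```haskell
-- A tiny action vocabulary: reads and writes of MVars, identified by Int.
data Action = ReadMVar Int | PutMVar Int
  deriving (Eq, Show)

-- Order matters only for two puts to the same MVar: whichever runs
-- first determines the value the readers see.
dependent' :: Action -> Action -> Bool
dependent' (PutMVar v1) (PutMVar v2) = v1 == v2
dependent' _ _ = False
```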

So the intuition behind DPOR is that most actions in a concurrent program are *independent*. DPOR won’t help you much if you have a single piece of shared state which every thread is hitting, but most concurrent programs aren’t like that. The worst case is still a terrible exponential blow-up, but the average case is much better.

The dependency relation is a *core* part of Déjà Fu today. It affects both performance and correctness. If it says two actions are dependent when they are not, then we may see unnecessary interleavings tried. If it says two actions are not dependent when they really are, then we may miss necessary interleavings.

Being such an important component, it must be well-tested, right? Well, sort of. The Déjà Fu testsuite mostly consists of small concurrent programs together with a list of expected outputs, testing that Déjà Fu finds all the nondeterminism in the program. This does exercise the dependency relation, but only very indirectly.

There things would have remained had I not experienced one of those coincidence-driven flashes of insight:

- aherrmann opened an issue on GitHub asking how to take an execution trace and replay it.
- agnishom posted a thread on /r/algorithms asking how to check the equivalence of traces where only some elements commute.

I had my idea. I can *directly* test the dependency relation like so:

- Execute a concurrent program.
- Normalise its execution trace in some way.
- “Replay” the normalised trace.
- Assert that the result is the same.

So, what is a good normal form for a trace? I tried out a few approaches here, but there was one I kept coming back to: we should shuffle around independent actions to keep the program on the main thread for as long as possible.

There are two reasons I think this works well. (1) The traces we get will be easier for a human to read, as the program will stay on its main thread and only execute another thread where necessary. (2) A Haskell program terminates when the main thread terminates, so by executing the main thread as much as possible, we may find that some actions don’t need to be executed at all.

So firstly we need to know when two actions commute. Let’s just use the dependency relation for that:

```haskell
-- | Check if two actions commute.
independent :: DepState -> (ThreadId, ThreadAction) -> (ThreadId, ThreadAction) -> Bool
independent ds (tid1, ta1) (tid2, ta2) = not (dependent ds tid1 ta1 tid2 ta2)
```

The `DepState` parameter tracks information about the history of the execution, allowing us to make better decisions. For example: while in general it matters in which order two `putMVar`s to the same `MVar` happen, it *doesn’t* matter if the `MVar` is already full, as both actions will block without achieving anything.

The approach works well in practice, but has been the source of *so many* off-by-one errors. Even while writing this memo!

So now onto trace normalisation. The easiest way to do it is bubble sort, but with an additional constraint on when we can swap things:

- For every adjacent pair of items
`x`

and`y`

in the trace:- If
`x`

and`y`

commute and`thread_id y < thread_id x`

:- Swap
`x`

and`y`

.

- Swap
- Update the
`DepState`

and continue to the next pair.

- If
- Repeat until there are no more changes.
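Abstracting away from dejafu's types, the algorithm is a bubble sort whose swaps are gated by a commutativity predicate. This generic sketch (my names, and without the `DepState` threading) shows the key property: when adjacent steps do not commute, they stay in their original order even if that order is "unsorted":

```haskell
-- Sort steps by thread ID, but only swap an adjacent pair when
-- 'commutes' says the two steps are independent.
sortByThread :: Ord t => ((t, a) -> (t, a) -> Bool) -> [(t, a)] -> [(t, a)]
sortByThread commutes = go
  where
    -- Repeat bubble passes until the order of thread IDs stops changing.
    go trc =
      let trc' = bubble trc
      in if map fst trc' == map fst trc then trc' else go trc'
    bubble (x@(t1, _) : y@(t2, _) : trc)
      | commutes x y && t2 < t1 = y : bubble (x : trc)
      | otherwise               = x : bubble (y : trc)
    bubble trc = trc
```

With a `commutes` that always says yes this is an ordinary bubble sort; with a restrictive one, conflicting pairs act as barriers that later steps cannot cross.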

And here’s the code:

```haskell
-- | Rewrite a trace into a canonical form.
normalise :: [(ThreadId, ThreadAction)] -> [(ThreadId, ThreadAction)]
normalise trc0 = if changed then normalise trc' else trc' where
  (changed, trc') = bubble initialDepState False trc0

  bubble ds flag ((x@(tid1, _)):(y@(tid2, _)):trc)
    | independent ds x y && tid2 < tid1 = go ds True y (x : trc)
    | otherwise = go ds flag x (y : trc)
  bubble _ flag trc = (flag, trc)

  go ds flag t@(tid, ta) trc = second (t :) (bubble (updateDepState ds tid ta) flag trc)
```

Now we need a scheduler which can play a given list of scheduling decisions. This isn’t built in, but we can make one. Schedulers look like this:

```haskell
-- from Test.DejaFu.Schedule
newtype Scheduler state = Scheduler
  { scheduleThread
      :: Maybe (ThreadId, ThreadAction)
      -> NonEmpty (ThreadId, Lookahead)
      -> state
      -> (Maybe ThreadId, state)
  }
```

A scheduler is a stateful function, which takes the previously scheduled action and the list of runnable threads, and gives back a thread to execute. We don’t care about those parameters. We just want to play a fixed list of scheduling decisions. And here is how we do that:

```haskell
-- | Execute a concurrent program by playing a list of scheduling decisions.
play :: MemType -> [ThreadId] -> ConcIO a -> IO (Either Failure a, [ThreadId], Trace)
play = runConcurrent (Scheduler sched) where
  sched _ _ (t:ts) = (Just t, ts)
  sched _ _ [] = (Nothing, [])
```

Now all the background is in place, so we can test what we want to test: that an execution, and the play-back of its normalised trace, give the same result. For reasons which will become apparent in the next section, I’m going to parameterise over the normalisation function:

```haskell
-- | Execute a concurrent program with a random scheduler, normalise its trace,
-- execute the normalised trace, and return both results.
runNorm
  :: ([(ThreadId, ThreadAction)] -> [(ThreadId, ThreadAction)])
  -> Int
  -> MemType
  -> ConcIO a
  -> IO (Either Failure a, [ThreadId], Either Failure a, [ThreadId])
runNorm norm seed memtype conc = do
  let g = mkStdGen seed
  -- 1
  (efa1, _, trc) <- runConcurrent randomSched memtype g conc
  -- 2
  let trc' = tail
        ( scanl (\(t, _) (d, _, a) -> (tidOf t d, a))
                (initialThread, undefined)
                trc
        )
  let tids1 = map fst trc'
  let tids2 = map fst (norm trc')
  -- 3
  (efa2, s, _) <- play memtype tids2 conc
  -- 4
  let truncated = take (length tids2 - length s) tids2
  pure (efa1, tids1, efa2, truncated)
```

There’s a lot going on here, so let’s break it down:

1. We execute the program with the built-in random scheduler, using the provided seed.
2. The trace that `runConcurrent` gives us is in the form `[(Decision, [(ThreadId, Lookahead)], ThreadAction)]`, whereas we want a `[(ThreadId, ThreadAction)]`. So this scan just changes the format. It’s a scan rather than a map because converting a `Decision` into a `ThreadId` potentially requires knowing what the previous thread was.
3. We normalise the trace, and run it again.
4. If the entire normalised trace wasn’t used up, then it has some unnecessary suffix (because the main thread is now terminating sooner). So we make the normalised trace easier to read by chopping off any such suffix.
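Step 2's format-changing scan can be illustrated with a simplified `Decision` type. This is a standalone sketch (the constructors mirror dejafu's, but `tidOf` here is my simplified version over `Int` thread IDs):

```haskell
-- A scheduling decision, relative to the previously running thread.
data Decision = Start Int | SwitchTo Int | Continue
  deriving Show

-- The thread a decision refers to, given the previous thread.
tidOf :: Int -> Decision -> Int
tidOf _    (Start t)    = t
tidOf _    (SwitchTo t) = t
tidOf prev Continue     = prev

-- Why a scan, not a map: 'Continue' only makes sense relative to the
-- previous thread, so each step needs the result of the one before it.
decisionsToTids :: Int -> [Decision] -> [Int]
decisionsToTids initial = tail . scanl tidOf initial
```

For example, starting from thread `0`, the decisions `[Continue, Start 1, Continue, SwitchTo 2, Continue]` resolve to the thread IDs `[0, 1, 1, 2, 2]`.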

Finally, we can write a little function to test using the `normalise` function:

```haskell
-- | Execute a concurrent program with a random scheduler, normalise its trace,
-- execute the normalised trace, and check that both give the same result.
testNormalise :: (Eq a, Show a) => Int -> MemType -> ConcIO a -> IO Bool
testNormalise seed memtype conc = do
  (efa1, tids1, efa2, tids2) <- runNorm normalise seed memtype conc
  unless (efa1 == efa2) $ do
    putStrLn "Mismatched result!"
    putStrLn $ "  expected: " ++ show efa1
    putStrLn $ "  but got:  " ++ show efa2
    putStrLn ""
    putStrLn $ "rewritten from: " ++ show tids1
    putStrLn $ "            to: " ++ show tids2
  pure (efa1 == efa2)
```

And does it work? Let’s copy two example programs from the Test.DejaFu docs:

```haskell
-- from Test.DejaFu
example1 :: MonadConc m => m String
example1 = do
  var <- newEmptyMVar
  fork (putMVar var "hello")
  fork (putMVar var "world")
  readMVar var

example2 :: MonadConc m => m (Bool, Bool)
example2 = do
  r1 <- newCRef False
  r2 <- newCRef False
  x <- spawn $ writeCRef r1 True >> readCRef r2
  y <- spawn $ writeCRef r2 True >> readCRef r1
  (,) <$> readMVar x <*> readMVar y
```

And then test them:

```
> testNormalise 0 TotalStoreOrder example1
True

> testNormalise 0 TotalStoreOrder example2
True
```

According to my very unscientific method, everything works perfectly!

You can probably see where this is going: just supplying *one* random seed and *one* memory model is a poor way to test things. Ah, if only we had some sort of tool to generate arbitrary values for us!

But that’s not all: if the dependency relation is correct, then *any* permutation of independent actions should give the same result, not just the one which `normalise` implements. So before we introduce Hedgehog and arbitrary values, let’s make something a little more chaotic:

```haskell
-- | Shuffle independent actions in a trace according to the given list.
shuffle :: [Bool] -> [(ThreadId, ThreadAction)] -> [(ThreadId, ThreadAction)]
shuffle = go initialDepState where
  go ds (f:fs) (t1:t2:trc)
    | independent ds t1 t2 && f = go' ds fs t2 (t1 : trc)
    | otherwise = go' ds fs t1 (t2 : trc)
  go _ _ trc = trc

  go' ds fs t@(tid, ta) trc = t : go (updateDepState ds tid ta) fs trc
```

In `normalise`, two independent actions will *always* be re-ordered if it gets us closer to the canonical form. However, in `shuffle`, two independent actions will either be re-ordered or not, depending on the supplied list of `Bool`s.

This is much better for testing our dependency relation, as we can now get far more re-orderings which *all* should satisfy the same property: that no matter how the independent actions in a trace are shuffled, we get the same result.

I think it’s about time to bring out Hedgehog:

```haskell
-- | Execute a concurrent program with a random scheduler, arbitrarily permute
-- the independent actions in the trace, and check that we get the same result
-- out.
hog :: (Eq a, Show a) => ConcIO a -> IO Bool
hog conc = Hedgehog.check . property $ do
  -- 1
  mem  <- forAll Gen.enumBounded
  seed <- forAll $ Gen.int (Range.linear 0 100)
  fs   <- forAll $ Gen.list (Range.linear 0 100) Gen.bool
  -- 2
  (efa1, tids1, efa2, tids2) <- liftIO
    $ runNorm (shuffle fs) seed mem conc
  -- 3
  footnote ("            to: " ++ show tids2)
  footnote ("rewritten from: " ++ show tids1)
  efa1 === efa2
```

Let’s break that down:

1. We’re telling Hedgehog that this property should hold for all memory models, all seeds, and all `Bool`-lists. Unlike most Haskell property-testing libraries, Hedgehog takes generator functions rather than using a typeclass. I think this is nicer.
2. We run our program, normalise it, and get all the results just as before.
3. We add some footnotes: messages which Hedgehog will display along with a failure. For some reason these get displayed in reverse order.

Alright, let’s see if Hedgehog finds any bugs for us:

```
> hog example1
✗ <interactive> failed after 3 tests and 1 shrink.

       ┏━━ extra.hs ━━┓
    82 ┃ hog :: (Eq a, Show a) => ConcIO a -> IO Bool
    83 ┃ hog conc = Hedgehog.check . property $ do
    84 ┃   mem <- forAll Gen.enumBounded
       ┃   │ SequentialConsistency
    85 ┃   seed <- forAll $ Gen.int (Range.linear 0 100)
       ┃   │ 0
    86 ┃   fs <- forAll $ Gen.list (Range.linear 0 100) Gen.bool
       ┃   │ [ False , True ]
    87 ┃
    88 ┃   (efa1, tids1, efa2, tids2) <- liftIO
    89 ┃     $ runNorm (shuffle fs) seed mem conc
    90 ┃   footnote ("            to: " ++ show tids2)
    91 ┃   footnote ("rewritten from: " ++ show tids1)
    92 ┃   efa1 === efa2
       ┃   ^^^^^^^^^^^^^
       ┃   │ Failed (- lhs =/= + rhs)
       ┃   │ - Right "hello"
       ┃   │ + Left InternalError

    rewritten from: [main,main,1,main,1,2,main,2,main]
                to: [main,1]

    This failure can be reproduced by running:
    > recheck (Size 2) (Seed 1824012233418733250 (-4876494268681827407)) <property>

False
```

It did! And look at that output! Magical! I must see if I can get Déjà Fu to give annotated source output like that.

Let’s look at `example1` again:

```haskell
do var <- newEmptyMVar
   fork (putMVar var "hello")
   fork (putMVar var "world")
   readMVar var
```

Oh dear, our rewritten trace is trying to execute thread `1` immediately after the first action of the main thread. The first action of the main thread is `newEmptyMVar`: thread `1` doesn’t exist at that point!

Let’s change our `independent` function to say that an action is dependent with the fork which creates its thread:

```haskell
independent ds (tid1, ta1) (tid2, ta2)
  | ta1 == Fork tid2 = False
  | ta2 == Fork tid1 = False
  | otherwise = not (dependent ds tid1 ta1 tid2 ta2)
```

How about now?

```
> hog example1
✗ <interactive> failed after 13 tests and 2 shrinks.

       ┏━━ extra.hs ━━┓
    82 ┃ hog :: (Eq a, Show a) => ConcIO a -> IO Bool
    83 ┃ hog conc = Hedgehog.check . property $ do
    84 ┃   mem <- forAll Gen.enumBounded
       ┃   │ SequentialConsistency
    85 ┃   seed <- forAll $ Gen.int (Range.linear 0 100)
       ┃   │ 0
    86 ┃   fs <- forAll $ Gen.list (Range.linear 0 100) Gen.bool
       ┃   │ [ True , True ]
    87 ┃
    88 ┃   (efa1, tids1, efa2, tids2) <- liftIO
    89 ┃     $ runNorm (shuffle fs) seed mem conc
    90 ┃   footnote ("            to: " ++ show tids2)
    91 ┃   footnote ("rewritten from: " ++ show tids1)
    92 ┃   efa1 === efa2
       ┃   ^^^^^^^^^^^^^
       ┃   │ Failed (- lhs =/= + rhs)
       ┃   │ - Right "hello"
       ┃   │ + Left InternalError

    rewritten from: [main,main,1,main,1,2,main,2,main]
                to: [main,1]

    This failure can be reproduced by running:
    > recheck (Size 12) (Seed 654387260079025817 (-6686572164463137223)) <property>

False
```

Well, that failing trace looks exactly like the previous error. But the parameters are different: the first error happened with the list `[False, True]`; this one requires the list `[True, True]`. So let’s think about what happens to the trace in this case.

1. We start with: `[(main, NewEmptyMVar 0), (main, Fork 1), (1, PutMVar 0)]`.
2. The first two actions are independent, and the flag is `True`, so we swap them. We now have: `[(main, Fork 1), (main, NewEmptyMVar 0), (1, PutMVar 0)]`.
3. The second two actions are independent, and the flag is `True`, so we swap them. We now have: `[(main, Fork 1), (1, PutMVar 0), (main, NewEmptyMVar 0)]`.

We can’t actually re-order actions of the same thread, so we should never have swapped the first two. I suppose there’s another problem here, that no action on an `MVar` commutes with creating that `MVar`, but we should never be in a situation where that could happen. So we need another case in `independent`:

```haskell
independent ds (tid1, ta1) (tid2, ta2)
  | tid1 == tid2 = False
  | ta1 == Fork tid2 = False
  | ta2 == Fork tid1 = False
  | otherwise = not (dependent ds tid1 ta1 tid2 ta2)
```

Our first example program works fine now:

```
> hog example1
✓ <interactive> passed 100 tests.
True
```

The second is a little less happy:

```
> hog example2
✗ <interactive> failed after 48 tests and 9 shrinks.

       ┏━━ extra.hs ━━┓
    82 ┃ hog :: (Eq a, Show a) => ConcIO a -> IO Bool
    83 ┃ hog conc = Hedgehog.check . property $ do
    84 ┃   mem <- forAll Gen.enumBounded
       ┃   │ TotalStoreOrder
    85 ┃   seed <- forAll $ Gen.int (Range.linear 0 100)
       ┃   │ 0
    86 ┃   fs <- forAll $ Gen.list (Range.linear 0 100) Gen.bool
       ┃   │ [ False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , False
       ┃   │ , True
       ┃   │ ]
    87 ┃
    88 ┃   (efa1, tids1, efa2, tids2) <- liftIO
    89 ┃     $ runNorm (shuffle fs) seed mem conc
    90 ┃   footnote ("            to: " ++ show tids2)
    91 ┃   footnote ("rewritten from: " ++ show tids1)
    92 ┃   efa1 === efa2
       ┃   ^^^^^^^^^^^^^
       ┃   │ Failed (- lhs =/= + rhs)
       ┃   │ - Right ( False , True )
       ┃   │ + Left InternalError

    rewritten from: [main,main,main,main,main,1,-1,1,1,main,1,main,main,main,main,2,-1,2,2,main,main]
                to: [main,main,main,main,main,1,-1,1,1,main,1,main,main,main,2,main,-1]

    This failure can be reproduced by running:
    > recheck (Size 47) (Seed 2159662051602767058 (-7857629802164753123)) <property>

False
```

This is a little trickier. Here’s my diagnosis:

- It’s an `InternalError` again, which means we’re trying to execute a thread which isn’t runnable.
- The memory model is `TotalStoreOrder`, and the thread we’re trying to execute is thread `-1`, a “fake” thread used in the relaxed memory implementation. So this is a relaxed memory bug.
- The traces only differ in one place: where `main, 2, -1` is changed to `2, main, -1`. So the issue is caused by re-ordering `main` and thread `2`.
- If the `main` action is a memory barrier, then thread `-1` will not exist after it.
- So the `main` action is probably a memory barrier.

Let’s push along those lines and add a case for memory barriers to `independent`:

```haskell
independent ds (tid1, ta1) (tid2, ta2)
  | tid1 == tid2 = False
  | ta1 == Fork tid2 = False
  | ta2 == Fork tid1 = False
  | otherwise = case (simplifyAction ta1, simplifyAction ta2) of
      (UnsynchronisedWrite _, a) | isBarrier a -> False
      (a, UnsynchronisedWrite _) | isBarrier a -> False
      _ -> not (dependent ds tid1 ta1 tid2 ta2)
```

Did we get it?

```
> hog example2
✓ <interactive> passed 100 tests.
True
```

Great!

So, we explored the dependency relation with Hedgehog, and found three missing cases:

1. Two actions of the same thread are dependent.
2. Any action of a thread is dependent with the `fork` which creates that thread.
3. Unsynchronised writes are dependent with memory barriers.

But are these *bugs*? I’m not so sure:

1. The dependency relation is only ever used to compare different threads.
2. This is technically correct, but it’s not interesting or useful.
3. This could be a bug. The relaxed memory implementation is pretty hairy and I’ve had a lot of problems with it in the past. Honestly, I just need to rewrite it (or campaign for Haskell to become sequentially consistent and rip it out; see **SC-Haskell: Sequential Consistency in Languages That Minimize Mutable Shared Heap**, M. Vollmer, R. G. Scott, M. Musuvathi, and R. R. Newton, 2017).

But even if not bugs, these are definitely *confusing*. The dependency relation is currently just an internal thing, not exposed to users. However, I’m planning to expose a function to normalise traces, in which case providing an `independent` function is entirely reasonable.

So even if these changes don’t make it into `dependent`, they will be handled by `independent`.

**Next steps:** I’m going to get this into the test suite, to get a large number of extra example programs for free. My hacky and cobbled-together testing framework in dejafu-tests is capable of running every test case with a variety of different schedulers, so I just need to add another way it runs everything. I won’t need to touch the actual tests, just the layer of glue which runs them all, which is nice.

The only problem is that this glue is currently based on HUnit and test-framework, whereas the only integration I can find for Hedgehog is tasty-hedgehog, so I might need to switch to tasty first. As usual, the hardest part is getting different libraries to co-operate!

Hopefully I’ll find some bugs! Well, not exactly *hopefully*, but you know what I mean.

There are 100 prisoners in solitary cells. There’s a central living room with one light bulb; this bulb is initially off. No prisoner can see the light bulb from his or her own cell. Every day, the warden picks a prisoner equally at random, and that prisoner visits the living room. While there, the prisoner can toggle the bulb if he or she wishes. Also, the prisoner has the option of asserting that all 100 prisoners have been to the living room by now. If this assertion is false, all 100 prisoners are shot. However, if it is indeed true, all prisoners are set free and inducted into MENSA, since the world could always use more smart people. Thus, the assertion should only be made if the prisoner is 100% certain of its validity. The prisoners are allowed to get together one night in the courtyard, to discuss a plan. What plan should they agree on, so that eventually, someone will make a correct assertion?

We can express this as a concurrency problem: the warden is the scheduler, each prisoner is a thread, and when the program terminates every prisoner should have visited the living room.

Let’s set up some imports:

{-# LANGUAGE RankNTypes #-}

import qualified Control.Concurrent.Classy as C
import Control.Monad (forever, when)
import Data.Foldable (for_)
import Data.List (genericLength)
import Data.Maybe (mapMaybe)
import qualified Data.Set as S
import qualified Test.DejaFu as D
import qualified Test.DejaFu.Common as D
import qualified Test.DejaFu.SCT as D

Before we try to implement a solution, let’s think about how we can check whether an execution corresponds to the prisoners succeeding and entering MENSA, or failing and being shot.

Prisoners are threads, and the warden is the scheduler. So if every thread (prisoner) that is forked is scheduled (taken to the room), then the prisoners are successful:

-- | Check if an execution corresponds to a correct guess.
isCorrect :: D.Trace -> Bool
isCorrect trc = S.fromList (threads trc) == S.fromList (visits trc)

-- | Get all threads created.
threads :: D.Trace -> [D.ThreadId]
threads trc = D.initialThread : mapMaybe go trc
  where
    go (_, _, D.Fork tid) = Just tid
    go _ = Nothing

-- | Get all scheduled threads.
visits :: D.Trace -> [D.ThreadId]
visits = mapMaybe go
  where
    go (D.Start tid, _, _) = Just tid
    go (D.SwitchTo tid, _, _) = Just tid
    go _ = Nothing

So now, given some way of setting up the game and running it to completion, we can test it and print some statistics:

-- | Run the prison game and print statistics.
run :: D.Way -> (forall m. C.MonadConc m => m ()) -> IO ()
run way game = do
  traces <- map snd <$> D.runSCT way D.defaultMemType game
  let successes = filter isCorrect traces
  let failures = filter (not . isCorrect) traces
  putStrLn (show (length traces) ++ " total attempts")
  putStrLn (show (length successes) ++ " successes")
  putStrLn (show (length failures) ++ " failures")
  putStrLn (show (avgvisits successes) ++ " average number of room visits per success")
  putStrLn (show (avgvisits failures) ++ " average number of room visits per failure")
  putStrLn "Sample sequences of visits:"
  for_ (take 5 traces) (print . visits)
  where
    avgvisits ts = sum (map (fromIntegral . numvisits) ts) / genericLength ts

    numvisits = sum . map count
      where
        count (_, _, D.STM _ _) = 1
        count (_, _, D.BlockedSTM _) = 1
        count (_, _, D.Yield) = 1
        count _ = 0

I have decided to assume that a prisoner will either yield (doing nothing) or perform some STM transaction while they’re in the room, to simplify things.

A slow but simple strategy is for the prisoners to nominate a leader. Only the leader can declare to the warden that everyone has visited the room. Whenever a prisoner other than the leader visits the room, if the light is *on*, they do nothing; otherwise, if this is their first time in the room with the light off, they turn it on; otherwise they leave it. Whenever the leader enters the room, they turn the light off. When the leader has turned the light off 99 times (or `num_prisoners - 1` times), they tell the warden that everyone has visited.

Let’s set up those algorithms:

-- | The state of the light bulb.
data Light = IsOn | IsOff

-- | Count how many prisoners have toggled the light and terminate
-- when everyone has.
leader :: C.MonadConc m => Int -> C.TVar (C.STM m) Light -> m ()
leader prisoners light = go 0
  where
    go counter = do
      counter' <- C.atomically $ do
        state <- C.readTVar light
        case state of
          IsOn -> do
            C.writeTVar light IsOff
            pure (counter + 1)
          IsOff -> C.retry
      when (counter' < prisoners - 1) (go counter')

-- | Turn the light on once then do nothing.
notLeader :: C.MonadConc m => C.TVar (C.STM m) Light -> m ()
notLeader light = do
  C.atomically $ do
    state <- C.readTVar light
    case state of
      IsOn -> C.retry
      IsOff -> C.writeTVar light IsOn
  forever C.yield

So now we just need to create a program where the leader is the main thread and everyone else is a separate thread:

-- | Most popular English male and female names, according to
-- Wikipedia.
name :: Int -> String
name i = ns !! (i `mod` length ns)
  where
    ns = ["Oliver", "Olivia", "George", "Amelia", "Harry", "Emily"]

-- | Set up the prison game. The number of prisoners should be at
-- least 1.
prison :: C.MonadConc m => Int -> m ()
prison prisoners = do
  light <- C.atomically (C.newTVar IsOff)
  for_ [1..prisoners-1] (\i -> C.forkN (name i) (notLeader light))
  leader prisoners light

Because these are people, not just threads, I’ve given them names. The leader is just called “main” though, how unfortunate for them.

Now we can try out our system and see if it works:

λ> let runS = run $ D.systematically (D.defaultBounds { D.boundPreemp = Nothing })
λ> runS 1
1 total attempts
1 successes
0 failures
2.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main]
λ> runS 2
5 total attempts
5 successes
0 failures
7.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main,Olivia,main]
[main,Olivia,main,Olivia,main]
[main,Olivia,main,Olivia,main]
[main,Olivia,main,Olivia,main]
[main,Olivia,main]
λ> runS 3
2035 total attempts
2035 successes
0 failures
133.39066339066338 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
(big lists omitted)

This doesn’t scale well. It’s actually a really bad case for concurrency testing: every thread is messing with the same shared state, so dejafu has to try all the orderings. Not good.

Taking another look at our prisoners, we can see two things which a human would use to decide whether some schedules are redundant or not:

1. If we adopt any schedule other than alternating leader / non-leader, threads will block without doing anything. So we should alternate.

2. When a non-leader has completed their task, they will always yield. So we should never schedule a prisoner who will yield.

Unfortunately dejafu can’t really make use of (1). It could be inferred *if* dejafu were able to compare values inside `TVar`s, rather than just seeing that there had been a write. But Haskell doesn’t let us do that without slapping an `Eq` constraint on `writeTVar`, which I definitely don’t want to do (although maybe having a separate `eqwriteTVar`, `eqputMVar`, and so on would be a nice addition).

Fortunately, dejafu *can* do something with (2). It already bounds the maximum number of times a thread can yield, so that we can test constructs like spinlocks. This is called *fair bounding*. The default bound is 5, but if we set it to 0 dejafu will just never schedule a thread which is going to yield. Here we go:

λ> let runS = run $ D.systematically (D.defaultBounds { D.boundPreemp = Nothing, D.boundFair = Just 0 })
λ> runS 1
1 total attempts
1 successes
0 failures
2.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main]
λ> runS 2
1 total attempts
1 successes
0 failures
4.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main]
λ> runS 3
4 total attempts
4 successes
0 failures
7.5 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main,George,main]
[main,Olivia,George,main,George,main]
[main,George,main,Olivia,main]
[main,George,Olivia,main,Olivia,main]

Much better! Although it still doesn’t scale as nicely as we’d like:

λ> runS 4
48 total attempts
48 successes
0 failures
11.5 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main,George,main,Amelia,main]
[main,Olivia,main,George,Amelia,main,Amelia,main]
[main,Olivia,main,Amelia,main,George,main]
[main,Olivia,main,Amelia,George,main,George,main]
[main,Olivia,George,main,George,main,Amelia,main]
λ> runS 5
1536 total attempts
1536 successes
0 failures
16.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main,George,main,Amelia,main,Harry,main]
[main,Olivia,main,George,main,Amelia,Harry,main,Harry,main]
[main,Olivia,main,George,main,Harry,main,Amelia,main]
[main,Olivia,main,George,main,Harry,Amelia,main,Amelia,main]
[main,Olivia,main,George,Amelia,main,Amelia,main,Harry,main]
λ> runS 6
122880 total attempts
122880 successes
0 failures
21.0 average number of room visits per success
NaN average number of room visits per failure
Sample sequences of visits:
[main,Olivia,main,George,main,Amelia,main,Harry,main,Emily,main]
[main,Olivia,main,George,main,Amelia,main,Harry,Emily,main,Emily,main]
[main,Olivia,main,George,main,Amelia,main,Emily,main,Harry,main]
[main,Olivia,main,George,main,Amelia,main,Emily,Harry,main,Harry,main]
[main,Olivia,main,George,main,Amelia,Harry,main,Harry,main,Emily,main]

The prisoners are stepping on each other’s toes and causing needless work. This is probably as good as we can do without adding some extra primitives to dejafu to optimise the case where we have an `Eq` instance available, unfortunately.

In concurrency testing terms, six threads is actually quite a lot.

Empirical studies have found that many concurrency bugs can be exhibited with only two or three threads! Furthermore, most real-world concurrent programs don’t have every single thread operating on the same bit of shared state.

There’s another school of thought which says to just wait for three years, because by then it’s very unlikely that any single prisoner has never visited the room. In fact, we would expect each prisoner to have been to the room ten times by then, assuming the warden is fair.
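That “ten times” figure is easy to sanity-check. Here’s a back-of-envelope sketch (my own arithmetic, not part of the original puzzle solution):

```haskell
-- The warden picks one prisoner uniformly at random each day, so over
-- three years each of the 100 prisoners expects days/prisoners visits.
expectedVisits :: Double
expectedVisits = days / prisoners
  where
    days      = 3 * 365  -- three years of daily room visits
    prisoners = 100
-- expectedVisits = 10.95, in line with "ten times each".
```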

By keeping track of how many days have passed, we can try this out as well:

leader :: C.MonadConc m => Int -> C.TVar (C.STM m) Int -> m ()
leader prisoners days = C.atomically $ do
  numDays <- C.readTVar days
  C.check (numDays >= (prisoners - 1) * 10)

notLeader :: C.MonadConc m => C.TVar (C.STM m) Int -> m ()
notLeader days = forever . C.atomically $ C.modifyTVar days (+1)

prison :: C.MonadConc m => Int -> m ()
prison prisoners = do
  days <- C.atomically (C.newTVar 0)
  for_ [1..prisoners-1] (\i -> C.forkN (name i) (notLeader days))
  leader prisoners days

Now let’s see how these brave prisoners do (sample visit sequences omitted because they’re pretty long):

λ> let runR = run $ D.uniformly (R.mkStdGen 0) 100
λ> runR 1
100 total attempts
100 successes
0 failures
2.0 average number of room visits per success
NaN average number of room visits per failure
λ> runR 2
100 total attempts
100 successes
0 failures
18.35 average number of room visits per success
NaN average number of room visits per failure
λ> runR 3
100 total attempts
100 successes
0 failures
31.92 average number of room visits per success
NaN average number of room visits per failure
λ> runR 4
100 total attempts
100 successes
0 failures
43.52 average number of room visits per success
NaN average number of room visits per failure
λ> runR 5
100 total attempts
100 successes
0 failures
55.88 average number of room visits per success
NaN average number of room visits per failure
λ> runR 6
100 total attempts
100 successes
0 failures
67.37 average number of room visits per success
NaN average number of room visits per failure
λ> runR 7
100 total attempts
100 successes
0 failures
77.05 average number of room visits per success
NaN average number of room visits per failure
λ> runR 8
100 total attempts
99 successes
1 failures
90.4040404040404 average number of room visits per success
81.0 average number of room visits per failure
λ> runR 9
100 total attempts
100 successes
0 failures
101.64 average number of room visits per success
NaN average number of room visits per failure
λ> runR 10
100 total attempts
100 successes
0 failures
114.89 average number of room visits per success
NaN average number of room visits per failure

Not bad at all! Although my puny VPS still can’t manage all 100.

As before, all code is available on GitHub. The code for this post is under the “post-02” tag.

Did you do last time’s homework task? It was to implement this interface:

data CRef m a = -- ...

newCRef :: a -> MiniFu m (CRef m a)

readCRef :: CRef m a -> MiniFu m a

writeCRef :: CRef m a -> a -> MiniFu m ()

atomicModifyCRef :: CRef m a -> (a -> (a, b)) -> MiniFu m b

Here are my solutions, available at the “homework-01” tag:

- (`2070bdf`) Add the `CRef` type, the `PrimOp` constructors, and the wrapper functions
- (`188eec5`) Implement the primops

I also made some changes, available at the “pre-02” tag:

- (`7ce6e41`) Add a helper for primops which don’t create any identifiers
- (`2419796`) Move some definitions into an internal module
- (`9c49f9d`) Change the type of the `block` helper to `MVarId -> Threads m -> Threads m`
- (`dabd84b`) Implement `readMVar`

Now on to the show…

We can’t implement exceptions with what we have already. We’re going to need some new primops. I think you’re getting a feel for how this works now, so I won’t drag this out. Here we go:

import qualified Control.Exception as E

data PrimOp m where
  -- ...
  Throw :: E.Exception e => e -> PrimOp m
  Catch :: E.Exception e => MiniFu m a -> (e -> MiniFu m a) -> (a -> PrimOp m) -> PrimOp m
  PopH  :: PrimOp m -> PrimOp m

throw :: E.Exception e => e -> MiniFu m a
throw e = MiniFu (K.cont (\_ -> Throw e))

catch :: E.Exception e => MiniFu m a -> (e -> MiniFu m a) -> MiniFu m a
catch act h = MiniFu (K.cont (Catch act h))

Throwing an exception with `throw` jumps back to the closest enclosing `catch` with an exception handler of the appropriate type, killing the thread if there is none. The `PopH` primop will pop the top exception handler from the stack. We’ll insert those as appropriate when entering a `catch`.

Before we can actually implement these primops, we need to give threads a place to store their exception handlers. You might have guessed it when I said “stack”: we’ll just give every thread a list of them. This requires changing our `Thread` type and `thread` function:

data Thread m = Thread
  { threadK     :: PrimOp m
  , threadBlock :: Maybe MVarId
  , threadExc   :: [Handler m]  -- <- new
  }

data Handler m where
  Handler :: E.Exception e => (e -> PrimOp m) -> Handler m

thread :: PrimOp m -> Thread m
thread k = Thread
  { threadK     = k
  , threadBlock = Nothing
  , threadExc   = []  -- <- new
  }

As `Exception` is a subclass of `Typeable`, given some exception value we’re able to look for the first matching handler:

raise :: E.Exception e => e -> Thread m -> Maybe (Thread m)
raise exc thrd = go (threadExc thrd)
  where
    go (Handler h:hs) = case h <$> E.fromException exc' of
      Just pop -> Just (thrd { threadK = pop, threadBlock = Nothing, threadExc = hs })
      Nothing  -> go hs
    go [] = Nothing

    exc' = E.toException exc

If `raise` returns a `Just`, then a handler was found and entered. Otherwise, no handler exists and the thread should be removed from the `Threads` collection. This can be expressed rather nicely as `M.update . raise`.
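To see why `M.update` fits so nicely here, this toy example (hypothetical, not MiniFu code) shows its two behaviours: returning `Just` keeps the key with a new value, while returning `Nothing` deletes the key, mirroring the “enter the handler” and “kill the thread” cases of `raise`:

```haskell
import qualified Data.Map as M

-- Stand-in for 'raise': Just means "handler found, keep the thread
-- with its new continuation"; Nothing means "no handler, remove it".
step :: Int -> Maybe Int
step n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

kept, removed :: M.Map String Int
kept    = M.update step "t" (M.fromList [("t", 4)])  -- fromList [("t",2)]
removed = M.update step "t" (M.fromList [("t", 3)])  -- fromList []
```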

Now we have enough support to implement the primops:

stepThread {- ... -}
  where
    -- ...
    go (Throw e) = simple (M.update (raise e) tid)
    go (Catch (MiniFu ma) h k) = simple . adjust $ \thrd -> thrd
      { threadK   = K.runCont ma (PopH . k)
      , threadExc =
          let h' exc = K.runCont (runMiniFu (h exc)) k
          in Handler h' : threadExc thrd
      }
    go (PopH k) = simple . adjust $ \thrd -> thrd
      { threadK   = k
      , threadExc = tail (threadExc thrd)
      }

Let’s break that down:

- `Throw` just re-uses our `raise` function to either jump to the exception handler or kill the thread.
- `Catch` changes the continuation of the thread to run the enclosed action, then do a `PopH` action, then run the outer action. It also adds an exception continuation, which just runs the exception handler, then runs the outer action.
- `PopH` just removes the head exception continuation.

It’s important that the exception continuation *doesn’t* use `PopH` to remove itself: that happens in `raise` when an exception is thrown. When writing this section I realised I’d made that mistake in dejafu (#139)!

So now we can use synchronous exceptions! Here’s an incredibly contrived example:

{-# LANGUAGE ScopedTypeVariables #-}

import Control.Monad (join)

example_sync :: MiniFu m Int
example_sync = do
  a <- newEmptyMVar
  fork (putMVar a (pure 1))
  fork (putMVar a (throw E.NonTermination))
  fork (putMVar a (throw E.AllocationLimitExceeded))
  catch
    (catch
      (join (readMVar a))
      (\(_ :: E.AllocationLimitExceeded) -> pure 2))
    (\(_ :: E.NonTermination) -> pure 3)

demo_sync :: IO ()
demo_sync = do
  g <- R.newStdGen
  print . fst =<< minifu randomSched g example_sync

If we run this a few times in ghci, we can see the different exceptions being thrown and caught (resulting in different outputs):

λ> demo_sync
Just 1
λ> demo_sync
Just 3
λ> demo_sync
Just 3
λ> demo_sync
Just 2

`MonadConc` has a bunch of superclasses, and we can now implement two of them!

import qualified Control.Monad.Catch as EM

instance EM.MonadThrow (MiniFu m) where
  throwM = -- 'throw' from above

instance EM.MonadCatch (MiniFu m) where
  catch = -- 'catch' from above

The exceptions package provides the `MonadThrow`, `MonadCatch`, and `MonadMask` typeclasses, so we can talk about exceptions in a wider context than just `IO`. We’ll get on to `MonadMask` when we look at asynchronous exceptions.

It is with exceptions that we hit the first thing we can’t do in MiniFu.

When in `IO`, we can catch exceptions from pure code:

λ> import Control.Exception
λ> evaluate undefined `catch` \e -> putStrLn ("Got " ++ show (e :: SomeException))
Got Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
  undefined, called at <interactive>:5:10 in interactive:Ghci2

But we can’t do that in `MiniFu`, as there’s no suitable `evaluate` function.

Should there be an `evaluate` in the `MonadConc` class? I’m unconvinced, as it’s not really a *concurrency* operation.

Should we constrain the `m` in `MiniFu m` to be a `MonadIO`, which would let us call `evaluate`? Perhaps: that would certainly be a way to do it, and I’m currently investigating the advantages of an `IO` base monad for dejafu (although originally for a different reason).

Asynchronous exceptions are like synchronous exceptions, except for two details:

- They are thrown to a thread identified by `ThreadId`. We can do this already with `raise`.
- Raising the exception may be blocked due to the target thread’s *masking state*. We need to do some extra work to implement this.

When a thread is masked, attempting to deliver an asynchronous exception to it will block. There are three masking states:

- `Unmasked`: asynchronous exceptions are unmasked.
- `MaskedInterruptible`: asynchronous exceptions are masked, but blocked operations may still be interrupted.
- `MaskedUninterruptible`: asynchronous exceptions are masked, and blocked operations may not be interrupted.
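These are the same three states GHC’s `Control.Exception` exposes, and you can observe them directly in `IO` with `getMaskingState` (a quick sketch using the real GHC API, nothing MiniFu-specific):

```haskell
import Control.Exception

-- Report the masking state outside a mask, inside 'mask', and inside
-- 'uninterruptibleMask'.
states :: IO (MaskingState, MaskingState, MaskingState)
states = do
  s0 <- getMaskingState                       -- Unmasked, normally
  s1 <- mask_ getMaskingState                 -- MaskedInterruptible
  s2 <- uninterruptibleMask_ getMaskingState  -- MaskedUninterruptible
  pure (s0, s1, s2)
```

Run from an unmasked context, `states` returns `(Unmasked, MaskedInterruptible, MaskedUninterruptible)`.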

So we’ll add the current masking state to our `Thread` type, defaulting to `Unmasked`, and also account for blocking on another thread:

data Thread m = Thread
  { threadK     :: PrimOp m
  , threadBlock :: Maybe (Either ThreadId MVarId)  -- <- new
  , threadExc   :: [Handler m]
  , threadMask  :: E.MaskingState  -- <- new
  }

thread :: PrimOp m -> Thread m
thread k = Thread
  { threadK     = k
  , threadBlock = Nothing
  , threadExc   = []
  , threadMask  = E.Unmasked  -- <- new
  }

We’ll also need a primop to set the masking state:

data PrimOp m where
  -- ...
  Mask :: E.MaskingState -> PrimOp m -> PrimOp m

Which has a fairly straightforward implementation:

stepThread {- ... -}
  where
    -- ...
    go (Mask ms k) = simple . adjust $ \thrd -> thrd
      { threadK    = k
      , threadMask = ms
      }

Finally, we need to make sure that if an exception is raised, and we jump into an exception handler, the masking state gets reset to what it was when the handler was created. This means we need a small change to the `Catch` primop:

stepThread {- ... -}
  where
    -- ...
    go (Catch (MiniFu ma) h k) = simple . adjust $ \thrd -> thrd
      { threadK   = K.runCont ma (PopH . k)
      , threadExc =
          let ms0 = threadMask thrd  -- <- new
              h' exc = flip K.runCont k $ do
                K.cont (\c -> Mask ms0 (c ()))  -- <- new
                runMiniFu (h exc)
          in Handler h' : threadExc thrd
      }

Alright, now we have enough background to actually implement the user-facing operations.

To throw an asynchronous exception, we’re going to need a new primop:

data PrimOp m where
  -- ...
  ThrowTo :: E.Exception e => ThreadId -> e -> PrimOp m -> PrimOp m

Which has a corresponding wrapper function:

throwTo :: E.Exception e => ThreadId -> e -> MiniFu m ()
throwTo tid e = MiniFu (K.cont (\k -> ThrowTo tid e (k ())))

Let’s think about the implementation of the `ThrowTo` primop. It first needs to check if the target thread is interruptible: if so, it raises the exception in that thread; if not, it blocks the current thread. A thread is interruptible if its masking state is `Unmasked`, or if it is `MaskedInterruptible` and currently blocked.

Let’s encapsulate that logic:

import Data.Maybe (isJust)

isInterruptible :: Thread m -> Bool
isInterruptible thrd =
  threadMask thrd == E.Unmasked ||
  (threadMask thrd == E.MaskedInterruptible && isJust (threadBlock thrd))

Given that, the implementation of `ThrowTo` is straightforward:

stepThread {- ... -}
  where
    -- ...
    go (ThrowTo threadid e k) = simple $ case M.lookup threadid threads of
      Just t
        | isInterruptible t -> goto k . M.update (raise e) threadid
        | otherwise -> block (Left threadid)
      Nothing -> goto k

First, check if the thread exists. Then check if it’s interruptible: if it is, raise the exception, otherwise block. If the thread doesn’t exist any more, just continue.

Now we just need to handle *unblocking* threads which are blocked in `ThrowTo`. For that, we’ll go back to the `run` function and add a pass to unblock threads if the current one is interruptible after it processes its action:

run :: C.MonadConc m => Scheduler s -> s -> PrimOp m -> m s
run sched s0 = go s0 . initialise
  where
    go s (threads, idsrc)
      | initialThreadId `M.member` threads = case runnable threads of
          Just tids -> do
            let (chosen, s') = sched tids s
            (threads', idsrc') <- stepThread chosen (threads, idsrc)
            let threads'' = if (isInterruptible <$> M.lookup chosen threads') /= Just False
                            then unblock (Left chosen) threads'  -- <- new
                            else threads'
            go s' (threads'', idsrc')
          Nothing -> pure s
      | otherwise = pure s

    runnable = nonEmpty . M.keys . M.filter (isNothing . threadBlock)

    initialThreadId = fst (nextThreadId initialIdSource)

So after stepping a thread, we unblock every thread blocked on it if it either doesn’t exist, or if it does exist and is interruptible. It’s much more robust to do this once here than in every case of `stepThread` which might cause the thread to become interruptible.

There are two operations at the programmer’s disposal to change the masking state of a thread: `mask` and `uninterruptibleMask`. Here’s what the `MiniFu` types will look like:

{-# LANGUAGE RankNTypes #-}

mask :: ((forall x. MiniFu m x -> MiniFu m x) -> MiniFu m a) -> MiniFu m a

uninterruptibleMask :: ((forall x. MiniFu m x -> MiniFu m x) -> MiniFu m a) -> MiniFu m a

Each takes an action to run, and runs it as either `MaskedInterruptible` or `MaskedUninterruptible`. The action is provided with a polymorphic callback to run a subcomputation with the original masking state.

This is going to need, you guessed it, a new primop! We *could* modify the `Mask` primop to do this job as well, but I think it’s a little clearer to have two separate ones:

data PrimOp m where
  -- ...
  InMask :: E.MaskingState
         -> ((forall x. MiniFu m x -> MiniFu m x) -> MiniFu m a)
         -> (a -> PrimOp m)
         -> PrimOp m

And here are the implementations of our masking functions:

mask ma = MiniFu (K.cont (InMask E.MaskedInterruptible ma))

uninterruptibleMask ma = MiniFu (K.cont (InMask E.MaskedUninterruptible ma))

We can now fulfil another requirement of `MonadConc`: a `MonadMask` instance!

instance MonadMask (MiniFu m) where
  mask = -- 'mask' from above
  uninterruptibleMask = -- 'uninterruptibleMask' from above

The very last piece of the puzzle for exception handling in MiniFu is to implement this `InMask` primop. Its type looks quite intense, but the implementation is really not that bad. There are three parts:

stepThread {- ... -}
  where
    -- ...
    go (InMask ms ma k) = simple . adjust $ \thrd -> thrd
      { threadK =
          let ms0 = threadMask thrd

              -- (1) we need to construct the polymorphic argument function
              umask :: MiniFu m x -> MiniFu m x
              umask (MiniFu mx) = MiniFu $ do
                K.cont (\c -> Mask ms0 (c ()))
                x <- mx
                K.cont (\c -> Mask ms (c ()))
                pure x

          -- (2) we need to run the inner continuation, resetting the masking
          -- state when done
          in K.runCont (runMiniFu (ma umask)) (Mask ms0 . k)

      -- (3) we need to change the masking state
      , threadMask = ms
      }

The explicit type signature on `umask` is needed because we’re using `GADTs`, which implies `MonoLocalBinds`, which prevents the polymorphic type from being inferred. We could achieve the same effect by turning on `NoMonoLocalBinds`.

Now we have asynchronous exceptions, check it out:

example_async :: MiniFu m String
example_async = do
  a <- newEmptyMVar
  tid <- fork (putMVar a "hello from the other thread")
  throwTo tid E.ThreadKilled
  readMVar a

demo_async :: IO ()
demo_async = do
  g <- R.newStdGen
  print . fst =<< minifu randomSched g example_async

See:

λ> demo_async
Just "hello from the other thread"
λ> demo_async
Just "hello from the other thread"
λ> demo_async
Nothing

We have come to the end of part 2! Again, I hope you enjoyed this post, any feedback is welcome. This is all on GitHub, and you can see the code we ended up with at the “post-02” tag.

Once again, I have some homework for you. Your task, should you choose to accept it, is to implement:

tryPutMVar :: MVar m a -> a -> MiniFu m Bool

tryTakeMVar :: MVar m a -> MiniFu m (Maybe a)

tryReadMVar :: MVar m a -> MiniFu m (Maybe a)

Solutions will be up in a few days, as before, at the “homework-02” tag.

Stay tuned because next time we’re going to implement STM: all of it in one go. Then we can finally get on to the testing.

Thanks to Will Sewell for reading an earlier draft of this post.
