Here are a pair of constraints from the Solving Scheduling Problems with Integer Linear Programming memo about fairly distributing rota assignments amongst the available people:

\[ \forall t \in \mathcal T \text{, } \forall p \in \mathcal P \text{, } \forall r \in \mathcal R \text{, } X_p \geqslant A_{tpr} \]

\[ \forall p \in \mathcal P \text{, } X_p \leqslant \sum_{t \in \mathcal T} \sum_{r \in \mathcal R} A_{tpr} \]

It’s a bit dense, but it only has the necessary information. Now here’s the corresponding Python:

```python
for slot in range(slots):
    for person in people:
        for role in roles:
            problem += is_assigned[person] >= assignments[slot, person, role]

for person in people:
    problem += is_assigned[person] <= pulp.lpSum(assignments[slot, person, role] for slot in range(slots) for role in roles)
```

That’s a bit more verbose, takes up more space, we’ve got this `pulp.lpSum` thing, this mysterious `problem` variable… I’d prefer to be able to write an ASCII equivalent of the mathematical form, and have the Python generated for me.

The source file for this memo is Literate Haskell: you can load it directly into GHCi. So here’s the necessary ceremony:

```haskell
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE GADTs #-}

import Control.Monad.Trans.Class
import Control.Monad.Trans.Except
import Control.Monad.Trans.Reader
import Control.Monad.Trans.State
import Data.Foldable (for_)
import Data.List (intercalate, sort)
import Data.Maybe (listToMaybe)
```

Let’s begin!

My main motivation when coming up with the concrete syntax was “it should look like the maths, but be ASCII, and not be LaTeX because that would be a pain”. It should also be concise, in particular it shouldn’t be necessary to specify what type quantifiers range over (that should be inferred).

I’m going to use the basic rota generator described in the other memo as a running example.

An ILP language is, necessarily, not very expressive. So I decided on the following types:

- User-defined types (sets of values)
- Predicates
- Integers
- N-dimensional binary arrays
- N-dimensional integer arrays

I’ve not implemented integer arrays because the example didn’t need them.

Predicates take parameters and arrays take indices; the types of these will also be modelled. A predicate which takes two integers is a different type from a predicate which takes an integer and a set-value.

We also want to distinguish between “parameter” variables, which the user of the model will supply, and “model” variables, which the model will solve for.

Here’s an example with some comments:

```
-- Define three new types
type TimeSlot, Person, Role

-- M is an input of type integer
param integer M

-- is_leave is an input of type (TimeSlot, Person) -> bool
param predicate is_leave(TimeSlot, Person)

-- A is a 3D binary array the solver will try to produce
model binary A[TimeSlot, Person, Role]

-- X is a 1D binary array (or "vector" if you like special-case
-- terminology...) the solver will try to produce
model binary X[Person]
```

Now we have our constraints and objective function:

```
-- In every time slot, each role is assigned to exactly one person
forall t, r; sum{p} A[t,p,r] = 1

-- Nobody is assigned multiple roles in the same time slot
forall t, p; sum{r} A[t,p,r] <= 1

-- Nobody is assigned a role in a slot they are on leave for
forall p, t if is_leave(t,p), r; A[t,p,r] = 0

-- Nobody works too many shifts
forall p; sum{t, r} A[t,p,r] <= M

-- Assignments are fairly distributed
forall t, p, r; X[p] >= A[t,p,r]
forall p; X[p] <= sum{t, r} A[t,p,r]

maximise sum{p} X[p]
```

Look how concise they are! They don’t reference any types either!

Even though `forall` and `sum` are conceptually similar (bring a new variable into scope and do some sort of quantification) I picked different syntax for them because `forall` introduces multiple actual constraints: one for each value of the user-defined type being quantified over. The `forall` quantifier is part of the meta-language; the `sum` quantifier is part of the ILP language.
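To make the distinction concrete, here’s a tiny Python sketch (with a hypothetical two-value `TimeSlot` type, not the memo’s implementation): the `forall` is expanded away at generation time, one concrete constraint per value, while the `sum` survives as a single symbolic summation inside each:

```python
# Hypothetical two-value TimeSlot type; names here are illustrative.
time_slots = ["t0", "t1"]

def expand_forall(slots):
    # `forall t; sum{p} A[t,p] = 1` becomes one concrete constraint per
    # slot; the `sum` stays as a single symbolic summation inside each.
    return [f"sum{{p}} A[{t},p] = 1" for t in slots]

print(expand_forall(time_slots))
# → ['sum{p} A[t0,p] = 1', 'sum{p} A[t1,p] = 1']
```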

Let’s talk abstract syntax. I didn’t want to write a parser, so we’ll skip over that. We’re now getting to our first real bit of Haskell code.

In the concrete syntax there’s only one `forall`, and it’s followed by a list of variables to quantify over, and then the constraint. To simplify the implementation, the abstract-syntax `CForall` has exactly one variable, and can either be followed by another `CForall` or a `CCheck` (the bit like `sum{r} A[t,p,r] <= 1`).

A `CForall` also contains an optional predicate restriction, which is expressed as the predicate’s name followed by the list of arguments.

```haskell
type Name = String

data Constraint a
  = CForall (TypedName a) (Maybe (Name, [Name])) (Constraint a)
  | CCheck Op (Expression a) (Expression a)
  deriving Eq

data Op = OEq | OLt | OGt | OLEq | OGEq
  deriving Eq
```

We’ll talk about the `TypedName` bit in the next section, but it’s essentially the name of the quantifier variable. Here are some examples, where `E1` and `E2` are placeholders for expressions:

**concrete:** `E1 <= E2`
**abstract:** `CCheck OLEq E1 E2`

**concrete:** `forall x; E1 = E2`
**abstract:** `CForall (Untyped "x") Nothing (CCheck OEq E1 E2)`

**concrete:** `forall x, y if p(x, y); E1 < E2`
**abstract:** `CForall (Untyped "x") Nothing (CForall (Untyped "y") (Just ("p", ["x", "y"])) (CCheck OLt E1 E2))`
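As the last example shows, a multi-variable `forall` desugars into nested single-variable `CForall`s by folding over the variable list from the right. A sketch of that desugaring in Python (not the memo’s Haskell; `desugar_forall` is a hypothetical helper, with tuples standing in for constructors):

```python
def desugar_forall(variables, body):
    # Fold the comma-separated variable list into nested single-variable
    # quantifiers, outermost variable first (as in the examples above).
    for var in reversed(variables):
        body = ("CForall", var, body)
    return body

# forall x, y; CHECK  →  CForall x (CForall y CHECK)
print(desugar_forall(["x", "y"], "CHECK"))
```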

The expression language is a bit richer, with more forms of expressions than there are constraints:

```haskell
data Expression a
  = ESum (TypedName a) (Expression a)
  | EIndex Name [Name]
  | EVar Name
  | EConst Integer
  | EMul Integer (Expression a)
  | EAdd (Expression a) (Expression a)
  deriving Eq
```

Here are some more examples:

**concrete:** `1 + 2`
**abstract:** `EAdd (EConst 1) (EConst 2)`

**concrete:** `X[y,z]`
**abstract:** `EIndex "X" ["y", "z"]`

**concrete:** `3 * sum{i} X[i]`
**abstract:** `EMul 3 (ESum (Untyped "i") (EIndex "X" ["i"]))`

Reading terms expressed in this abstract syntax would be a bit of a pain, so here’s some pretty-printing:

```haskell
instance Show (Constraint a) where
  show (CForall tyname (Just (rname, rargs)) c) =
    "forall " ++ show tyname ++ " if " ++ rname ++ "(" ++ strings rargs ++ "); " ++ show c
  show (CForall tyname _ c) =
    "forall " ++ show tyname ++ "; " ++ show c
  show (CCheck op expr1 expr2) =
    show expr1 ++ " " ++ show op ++ " " ++ show expr2

instance Show (Expression a) where
  show (ESum tyname expr) = "sum{" ++ show tyname ++ "} " ++ show expr
  show (EIndex name args) = name ++ "[" ++ strings args ++ "]"
  show (EVar name) = name
  show (EConst i) = show i
  show (EMul i expr) = show i ++ " * " ++ show expr
  show (EAdd expr1 expr2) = show expr1 ++ " + " ++ show expr2

instance Show Op where
  show OEq  = "="
  show OLt  = "<"
  show OGt  = ">"
  show OLEq = "<="
  show OGEq = ">="
```

It looks like this:

```
λ> CForall (Untyped "x") Nothing (CForall (Untyped "y") (Just ("p", ["x", "y"])) (CCheck OLt (EMul 3 (ESum (Untyped "i") (EIndex "X" ["i"]))) (EConst 10)))
forall x; forall y if p(x, y); 3 * sum{i} X[i] < 10
```

The `strings` helper function used in `CForall` and `EIndex` just comma-separates a list of strings:

```haskell
strings :: [String] -> String
strings = intercalate ", "
```

Previously, I said we would have these types:

- User-defined types
- Predicates
- Integers
- N-dimensional binary arrays
- N-dimensional integer arrays (not actually implemented)

And we also need to distinguish between “parameter” variables and “model” variables. ILP solvers only operate on matrices, so actually what we have are three parameter types:

- User-defined types
- Predicates
- Integers

And two model types:

- N-dimensional binary arrays
- N-dimensional integer arrays

```haskell
data Ty
  = ParamCustom Name
  | ParamInteger
  | ParamPredicate [Ty]
  | ModelBinary [Ty]
  deriving Eq
```

Remember the `TypedName` in the constraint and expression abstract syntax? It was used wherever a new name was brought into scope: `CForall` and `ESum`. A `TypedName` is either a `Name` by itself or a `Name` associated with a `Ty`:

```haskell
data IsTyped
data IsUntyped

data TypedName a where
  Untyped :: Name -> TypedName IsUntyped
  Typed   :: Name -> Ty -> TypedName IsTyped

instance Eq (TypedName a) where
  Untyped n1   == Untyped n2   = n1 == n2
  Typed n1 ty1 == Typed n2 ty2 = n1 == n2 && ty1 == ty2
```

When generating code, we’ll need to know which types are being quantified over. So the type checker will fill in the types as it goes, turning our *untyped* expressions and constraints into *typed* expressions and constraints.

```haskell
type UntypedConstraint = Constraint IsUntyped
type UntypedExpression = Expression IsUntyped
type TypedConstraint   = Constraint IsTyped
type TypedExpression   = Expression IsTyped
```

And let’s add some pretty-printing for types too:

```haskell
instance Show Ty where
  show (ParamCustom name) = "param<" ++ name ++ ">"
  show ParamInteger = "param<integer>"
  show (ParamPredicate args) = "param<predicate(" ++ strings (map show args) ++ ")>"
  show (ModelBinary args) = "model<binary[" ++ strings (map show args) ++ "]>"

instance Show (TypedName a) where
  show (Untyped name) = name
  show (Typed name ty) = show ty ++ " " ++ name
```

Our running example is the set of basic rota constraints from the other memo. We’ve already seen the concrete syntax; here’s the abstract syntax:

```haskell
type Binding = (Name, Ty)

globals :: [Binding]
globals =
  [ ("M", ParamInteger)
  , ("is_leave", ParamPredicate [ParamCustom "TimeSlot", ParamCustom "Person"])
  , ("A", ModelBinary [ParamCustom "TimeSlot", ParamCustom "Person", ParamCustom "Role"])
  , ("X", ModelBinary [ParamCustom "Person"])
  ]

constraints :: [UntypedConstraint]
constraints =
    -- In every time slot, each role is assigned to exactly one person
  [ CForall (Untyped "t") Nothing (CForall (Untyped "r") Nothing (CCheck OEq (ESum (Untyped "p") (EIndex "A" ["t", "p", "r"])) (EConst 1)))
    -- Nobody is assigned multiple roles in the same time slot
  , CForall (Untyped "t") Nothing (CForall (Untyped "p") Nothing (CCheck OLEq (ESum (Untyped "r") (EIndex "A" ["t", "p", "r"])) (EConst 1)))
    -- Nobody is assigned a role in a slot they are on leave for
  , CForall (Untyped "p") Nothing (CForall (Untyped "t") (Just ("is_leave", ["t", "p"])) (CForall (Untyped "r") Nothing (CCheck OEq (EIndex "A" ["t", "p", "r"]) (EConst 0))))
    -- Nobody works too many shifts
  , CForall (Untyped "p") Nothing (CCheck OLEq (ESum (Untyped "t") (ESum (Untyped "r") (EIndex "A" ["t", "p", "r"]))) (EVar "M"))
    -- Assignments are fairly distributed
  , CForall (Untyped "t") Nothing (CForall (Untyped "p") Nothing (CForall (Untyped "r") Nothing (CCheck OGEq (EIndex "X" ["p"]) (EIndex "A" ["t", "p", "r"]))))
  , CForall (Untyped "p") Nothing (CCheck OLEq (EIndex "X" ["p"]) (ESum (Untyped "t") (ESum (Untyped "r") (EIndex "A" ["t", "p", "r"]))))
  ]
```

That’s pretty verbose, even more than the Python! Good thing I’d write a parser for this if I were doing it for real.

This is the hairy bit of the memo. I’ve not gone for any particular type inference algorithm; I just went for the straightforward way to do it for the syntax and types I had.

We’ll use a monad stack for the type checker:

```haskell
type TcFun = ReaderT [Binding] (StateT [Binding] (Except String))
-- environment         ^^^^^^^^^
-- unresolved free variables     ^^^^^^^^^
-- error message                                   ^^^^^^
```

To get a feel for how `TcFun` is useful, let’s go through some utility functions.

**Type errors:**

```haskell
typeError :: String -> TcFun a
typeError = lift . lift . throwE

eExpected :: String -> Name -> Maybe Ty -> TcFun a
eExpected eTy name (Just aTy) = typeError $
  "Expected " ++ eTy ++ " variable, but '" ++ name ++ "' is " ++ show aTy ++ " variable."
eExpected eTy name Nothing = typeError $
  "Expected " ++ eTy ++ " variable, but could not infer a type for '" ++ name ++ "'."
```

Throwing a type error is pretty important, so we’ll need a function for that, and for one of the more common errors.

**Looking up the type of a name (if it’s bound):**

```haskell
getTy :: Name -> TcFun (Maybe Ty)
getTy name = lookup name <$> ask
```

The bindings in the state are just to keep track of free variables, and are not used when checking something’s type.

**Running a subcomputation with a name removed from the environment:**

```haskell
withoutBinding :: Name -> TcFun a -> TcFun a
withoutBinding name = withReaderT (remove name)
```

For example, if we have a global `x` and a constraint `forall x; A[x]`, the `x` in `A[x]` is not the global `x`; it’s the `x` bound by the `forall`. Don’t worry, it’s only removed while typechecking the body of the `CForall` or `ESum` which introduced the new binding.

**Asserting a variable has a type:**

```haskell
assertType :: Name -> Ty -> TcFun ()
assertType name eTy = getTy name >>= \case
  Just aTy
    | eTy == aTy -> pure ()
    | otherwise  -> eExpected (show eTy) name (Just aTy)
  Nothing -> lift $ modify ((name, eTy):)
```

Takes a name and an expected type, and checks that any pre-existing binding matches. If there is no pre-existing binding, the name is introduced as a free variable.

**Removing a free variable from the state:**

```haskell
delFree :: Name -> TcFun Ty
delFree name = lookup name <$> lift get >>= \case
  Just ty -> do
    lift $ modify (remove name)
    pure ty
  Nothing -> typeError $ "Could not infer a type for '" ++ name ++ "'."
```

Looks up the type of a free variable, removes the variable from the state, and returns the type. If we fail to find a type for the variable, it’s unused, which is a type error (as we can’t infer a concrete type).

The basic idea is to walk through the abstract syntax: unify types when they arise; and for `CForall` and `ESum` check that the inner constraint (or expression) has a free variable with the right name and type.
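Here’s a rough Python sketch of that bookkeeping, with hypothetical names (`check_index`, `close_quantifier`) rather than the memo’s Haskell: indexing an array records each index variable as a free variable at the array’s declared type, and a quantifier later claims (and removes) its variable from the free set:

```python
# Illustrative sketch only: A : model binary[TimeSlot, Person]
globals_env = {"A": ["TimeSlot", "Person"]}

free = {}

def check_index(name, args):
    # Using an index records each argument as a free variable at the
    # array's declared index type, and rejects inconsistent uses.
    for arg, ty in zip(args, globals_env[name]):
        if free.setdefault(arg, ty) != ty:
            raise TypeError(f"{arg} used at {free[arg]} and {ty}")

def close_quantifier(var):
    # A quantifier (forall/sum) must find its variable among the free
    # variables of its body; it removes the variable and learns its type.
    if var not in free:
        raise TypeError(f"could not infer a type for '{var}'")
    return free.pop(var)

check_index("A", ["t", "p"])      # records t : TimeSlot, p : Person
inferred = close_quantifier("t")  # sum{t} ... infers t : TimeSlot
```

After this, `p` is still free, to be claimed by an enclosing quantifier (or reported as unbound at the top level).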

**Typechecking argument lists:**

Let’s start with the simplest case: type checking an argument list, which arises in quantifier predicate constraints and `EIndex`. The function takes the names of the argument variables and their expected types, and checks that the variables do have those types.

```haskell
typecheckArgList :: [Name] -> [Ty] -> TcFun ()
```

The recursive case takes the name of the current argument and its expected type. It then looks up the actual type of the name. If it has a type, check that it’s the same as the expected type and either move on to the next parameter or throw an error. If there is no binding, `assertType` records it as a free variable.

```haskell
typecheckArgList (name:ns) (expectedTy:ts) = do
  assertType name expectedTy
  typecheckArgList ns ts
```

Ultimately the `typecheckExpression` and `typecheckConstraint` functions we’ll get to later will make sure all these free variables are bound by a `forall` or a `sum`.

The base case is when we run out of argument names or types; there should be the same number of each:

```haskell
typecheckArgList [] [] = pure ()
typecheckArgList ns [] = typeError $ "Expected " ++ show (length ns) ++ " fewer arguments."
typecheckArgList [] ts = typeError $ "Expected " ++ show (length ts) ++ " more arguments."
```

**Typechecking expressions:**

Expressions have a few different parts, so let’s go through them one at a time.

```haskell
typecheckExpression :: UntypedExpression -> TcFun TypedExpression
typecheckExpression e0 = decorate e0 (go e0) where
```

The `decorate` function, defined further below, appends the pretty-printed expression to any error message. So by `decorate`-ing every recursive call, we get an increasingly wide view of the error. Like this:

```
Found variable 'x' at incompatible types param<integer> and param<index>.
 in x
 in A[x] = x
 in forall x; A[x] = x
```
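The same wrapping trick can be sketched in a few lines of Python, with exceptions standing in for `Except` (the names and message here are illustrative, not the memo’s code):

```python
def decorate(expr, thunk):
    # Run the computation; re-raise any error with one more
    # " in <expr>" line of context appended.
    try:
        return thunk()
    except ValueError as e:
        raise ValueError(f"{e}\n in {expr}") from None

def check_var():
    raise ValueError("Found variable 'x' at incompatible types.")

# Decorating every recursive call gives an increasingly wide view:
try:
    decorate("forall x; A[x] = x",
             lambda: decorate("A[x] = x",
                              lambda: decorate("x", check_var)))
except ValueError as e:
    message = str(e)

print(message)
```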

`ESum` introduces a new binding. The way I’ve handled this is by *unbinding* the name (in case there was something with the same name from a wider scope), type-checking the inner expression, and then (1) asserting that there is a free variable with the name of the bound variable and (2) storing its type.

```haskell
go (ESum (Untyped name) expr) = do
  expr' <- withoutBinding name $ typecheckExpression expr
  delFree name >>= \case
    ty@(ParamCustom _) -> pure (ESum (Typed name ty) expr')
    aTy -> eExpected "param<$custom>" name (Just aTy)
```

`EIndex` requires checking an argument list. I’m not allowing quantifying over model variables, so in the expression `EIndex name args`, `name` *must* refer to a global. All globals are of known types, so we can look up the type of the argument list from the global environment.

```haskell
go (EIndex name args) = getTy name >>= \case
  Just (ModelBinary argtys) -> do
    typecheckArgList args argtys
    pure (EIndex name args)
  aTy -> eExpected "model<binary(_)>" name aTy
```

`EVar` uses a variable directly, in which case the variable *must* be an integer. This is handled by looking for a binding and, if there isn’t one, introducing a new free variable.

```haskell
go (EVar name) = do
  assertType name ParamInteger
  pure (EVar name)
```

`EConst`, `EMul`, and `EAdd` are pretty simple and just involve recursive calls to `typecheckExpression`.

```haskell
go (EConst k) = pure (EConst k)
go (EMul k expr) = do
  expr' <- typecheckExpression expr
  pure (EMul k expr')
go (EAdd expr1 expr2) = do
  expr1' <- typecheckExpression expr1
  expr2' <- typecheckExpression expr2
  pure (EAdd expr1' expr2')
```

The input to `typecheckExpression` is an `UntypedExpression` and the output is a `TypedExpression`. We get there by rewriting `ESum` constructs to contain the inferred type of the quantifier variable. This will be useful when generating code.

**Typechecking constraints:**

Typechecking a constraint is pretty much the same as typechecking an expression. `CForall` is like `ESum`, `CCheck` is like `EAdd`. The only new thing is that a `CForall` can have a predicate constraint… but that’s typechecked in the same way as an `EIndex`: get the argument types of the predicate from the environment, and check that against the argument variables.

Here it is:

```haskell
typecheckConstraint :: UntypedConstraint -> TcFun TypedConstraint
typecheckConstraint c0 = decorate c0 (go c0) where
  go (CForall (Untyped name) (Just (rname, rargs)) c) = getTy rname >>= \case
    Just (ParamPredicate argtys) -> do
      typecheckArgList rargs argtys
      c' <- withoutBinding name $ typecheckConstraint c
      ty <- delFree name
      pure (CForall (Typed name ty) (Just (rname, rargs)) c')
    aTy -> eExpected "param<predicate(_)>" rname aTy
  go (CForall (Untyped name) Nothing c) = do
    c' <- withoutBinding name $ typecheckConstraint c
    ty <- delFree name
    pure (CForall (Typed name ty) Nothing c')
  go (CCheck op expr1 expr2) = do
    expr1' <- typecheckExpression expr1
    expr2' <- typecheckExpression expr2
    pure (CCheck op expr1' expr2')
```

While `typecheckConstraint` works, it leaves something to be desired. Here’s a slightly nicer interface:

```haskell
typecheckConstraint_ :: [Binding] -> UntypedConstraint -> Either String TypedConstraint
typecheckConstraint_ env0 c0 =
    check =<< runExcept (runStateT (runReaderT (typecheckConstraint c0) env0) [])
  where
    check (c, [])   = Right c
    check (_, free) = Left ("Unbound free variables: " ++ strings (sort (map fst free)) ++ ".")
```

This:

- Takes the global bindings as an argument.
- Does away with the `TcFun`; it returns a plain `Either`.
- Checks that no free variables leak out.

Some utility functions used above are:

```haskell
remove :: Eq a => a -> [(a, b)] -> [(a, b)]
remove a = filter ((/=a) . fst)

decorate :: Show a => a -> TcFun b -> TcFun b
decorate e = goR where
  goR m = ReaderT (goS . runReaderT m)
  goS m = StateT (goE . runStateT m)
  goE = withExcept (\err -> err ++ "\n in " ++ show e)
```

Here’s a little function to print out the inferred type, or type error, for all of our constraints from the running example:

```haskell
demoTypeInference :: IO ()
demoTypeInference = for_ constraints $ \constraint -> do
  case typecheckConstraint_ globals constraint of
    Right c' -> print c'
    Left err -> putStrLn err
  putStrLn ""
```

Behold!

```
λ> demoTypeInference
forall param<TimeSlot> t; forall param<Role> r; sum{param<Person> p} A[t, p, r] = 1

forall param<TimeSlot> t; forall param<Person> p; sum{param<Role> r} A[t, p, r] <= 1

forall param<Person> p; forall param<TimeSlot> t if is_leave(t, p); forall param<Role> r; A[t, p, r] = 0

forall param<Person> p; sum{param<TimeSlot> t} sum{param<Role> r} A[t, p, r] <= M

forall param<TimeSlot> t; forall param<Person> p; forall param<Role> r; X[p] >= A[t, p, r]

forall param<Person> p; X[p] <= sum{param<TimeSlot> t} sum{param<Role> r} A[t, p, r]
```

Looks pretty good, all types are inferred as they should be.

Here’s a broken example, which arose when I mistyped one of the constraints:

```
λ> either putStrLn print $ typecheckConstraint_ globals (CForall (Untyped "t") Nothing (CForall (Untyped "p") Nothing (CForall (Untyped "r") Nothing (CCheck OLEq (EIndex "X" ["p"]) (ESum (Untyped "t") (ESum (Untyped "r") (EIndex "A" ["t", "p", "r"])))))))
Could not infer a type for 'r'.
 in forall r; X[p] <= sum{t} sum{r} A[t, p, r]
 in forall p; forall r; X[p] <= sum{t} sum{r} A[t, p, r]
 in forall t; forall p; forall r; X[p] <= sum{t} sum{r} A[t, p, r]
```

I’d added extra `forall t` and `forall r` quantifiers, which are wrong because those variables are bound by `sum`s. So the types of the `forall`-bound variables can’t be inferred.

I don’t want to write (or learn) bindings to ILP solvers: that sounds like a pain, and I already know PuLP. So what I do want to do is generate the PuLP-using Python code the abstract syntax corresponds to.

Most of `codegenExpression`, which produces a Python expression, is straightforward:

```haskell
codegenExpression :: TypedExpression -> String
codegenExpression (EIndex name args) = name ++ "[" ++ strings args ++ "]"
codegenExpression (EVar name) = name
codegenExpression (EConst i) = show i
codegenExpression (EMul i expr) = show i ++ " * " ++ codegenExpression expr
codegenExpression (EAdd expr1 expr2) =
  "(" ++ codegenExpression expr1 ++ " + " ++ codegenExpression expr2 ++ ")"
```

The complex bit is handling `ESum`, which introduces a generator expression, and multiple `ESum`s are collapsed:

```haskell
codegenExpression (ESum tyname0 expr0) = go [tyname0] expr0 where
  go vars (ESum tyname expr) = go (tyname:vars) expr
  go vars e = "pulp.lpSum(" ++ codegenExpression e ++ " " ++ go' (reverse vars) ++ ")"

  go' [] = ""
  go' (Typed name (ParamCustom ty):vs) =
    let code = "for " ++ name ++ " in " ++ ty
    in if null vs then code else code ++ " " ++ go' vs
```

I’m making some assumptions about how variables and types are represented in Python:

1. I assume all names are valid in Python, eg:

   **abstract:** `EMul 3 (EIndex "A" ["i"])`
   **code:** `3 * A[i]`

2. I assume user-defined types correspond to Python iterables, eg:

   **abstract:** `ESum (Typed "x" (ParamCustom "X")) expr`
   **code:** `pulp.lpSum(expr for x in X)`

These aren’t checked. Assumption (1) could be handled by restricting the characters in names (eg, to alphanumeric only). Assumption (2) would be handled if I were to implement the full abstract syntax, as generated code would be put in a function which takes all the parameter variables as arguments, and which creates the model variables. But this memo only implements expressions and constraints.
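For instance, assumption (1) could be checked with a small validator before code generation; `is_valid_python_name` is a hypothetical helper, not part of the memo’s code:

```python
import keyword
import re

def is_valid_python_name(name):
    # One way to realise assumption (1): allow only alphanumeric/underscore
    # names that don't start with a digit and aren't Python keywords.
    return (re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name) is not None
            and not keyword.iskeyword(name))
```

So `is_leave` would pass, while `for` (a keyword) and `t-1` (not an identifier) would be rejected.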

Generating code for constraints is nothing surprising; the only slight complication is needing to make sure the indentation works out when there are nested `CForall`s:

```haskell
codegenConstraint :: TypedConstraint -> String
codegenConstraint = unlines . go where
  go (CForall (Typed name (ParamCustom ty)) (Just (rname, rargs)) c) =
    [ "for " ++ name ++ " in " ++ ty ++ ":"
    , "    if not " ++ rname ++ "(" ++ strings rargs ++ "):"
    , "        continue"
    ] ++ indent (go c)
  go (CForall (Typed name (ParamCustom ty)) _ c) =
    [ "for " ++ name ++ " in " ++ ty ++ ":"
    ] ++ indent (go c)
  go (CCheck op expr1 expr2) =
    let e1 = codegenExpression expr1
        e2 = codegenExpression expr2
    in ["problem += " ++ e1 ++ " " ++ cgOp op ++ " " ++ e2]

  cgOp OEq = "=="
  cgOp op  = show op

  indent = map ("    "++)
```

Here’s a little function to print out the generated code, or type error, for all of our constraints from the running example:

```haskell
demoCodeGen :: IO ()
demoCodeGen = for_ constraints $ \constraint -> do
  case typecheckConstraint_ globals constraint of
    Right c' -> do
      putStrLn ("# " ++ show c')
      putStrLn (codegenConstraint c')
    Left err -> do
      putStrLn err
      putStrLn ""
```

Behold, again!

```
λ> demoCodeGen
# forall param<TimeSlot> t; forall param<Role> r; sum{param<Person> p} A[t, p, r] = 1
for t in TimeSlot:
    for r in Role:
        problem += pulp.lpSum(A[t, p, r] for p in Person) == 1

# forall param<TimeSlot> t; forall param<Person> p; sum{param<Role> r} A[t, p, r] <= 1
for t in TimeSlot:
    for p in Person:
        problem += pulp.lpSum(A[t, p, r] for r in Role) <= 1

# forall param<Person> p; forall param<TimeSlot> t if is_leave(t, p); forall param<Role> r; A[t, p, r] = 0
for p in Person:
    for t in TimeSlot:
        if not is_leave(t, p):
            continue
        for r in Role:
            problem += A[t, p, r] == 0

# forall param<Person> p; sum{param<TimeSlot> t} sum{param<Role> r} A[t, p, r] <= M
for p in Person:
    problem += pulp.lpSum(A[t, p, r] for t in TimeSlot for r in Role) <= M

# forall param<TimeSlot> t; forall param<Person> p; forall param<Role> r; X[p] >= A[t, p, r]
for t in TimeSlot:
    for p in Person:
        for r in Role:
            problem += X[p] >= A[t, p, r]

# forall param<Person> p; X[p] <= sum{param<TimeSlot> t} sum{param<Role> r} A[t, p, r]
for p in Person:
    problem += X[p] <= pulp.lpSum(A[t, p, r] for t in TimeSlot for r in Role)
```

We’ve come to the end of my little language for defining ILP problems, but there is still more to be done if this were to become a fully-fledged language people could use. Here are some missing bits:

- A parser for the concrete syntax.
- More integer operations:
  - Integer ranges, in addition to user-defined set types, for `forall` and `sum`.
  - Arithmetic on integer indices.
  - Comparisons, in addition to predicate functions, in `forall` guards.
- The rest of the abstract syntax (along with typechecking and code generation): integer matrices, objective functions, and type and variable declarations.

For example, in the GOV.UK support rota, one of the constraints is that someone can’t be on support in two adjacent weeks. With integer ranges and arithmetic on integer indices, that could be expressed like so:

```
forall t in [1, N), p; (sum{r} A[t, p, r]) + (sum{r} A[t - 1, p, r]) <= 1
```

There’s also a small problem with the current abstract syntax: it’s a bit too flexible. This is not a valid ILP expression:

```
sum{foo} (sum{bar} A[foo, bar] + sum{baz} B[foo, baz])
```

Only direct `sum` nesting is permitted. There are two ways to solve this. One is to change the abstract syntax to preclude it, maybe something like this:

```haskell
data Void

data TaggedExpression tag a
  = TESum !tag (SumExpression a)
  | TEIndex Name [Name]
  | TEVar Name
  | TEConst Integer
  | TEMul Integer (TaggedExpression tag a)
  | TEAdd (TaggedExpression tag a) (TaggedExpression tag a)

data SumExpression a
  = SENest (TypedName a) (SumExpression a)
  | SEBreak (TaggedExpression Void a)
```

A `TaggedExpression Void` can’t contain any more `TESum` constructors, because the `Void` type is uninhabited. Another option is to add a check, between parsing and typechecking, that there are no invalidly nested `ESum`s.

You can express a bunch of interesting problems in terms of ILP, and there are solvers which do a pretty good job of finding good solutions quickly. One of those interesting problems is scheduling, and there’s a nice write-up of how PyCon uses an ILP solver to generate schedules.

Another problem is rota generation, which is after all just a sort of scheduling. I have implemented a rota generator for GOV.UK’s technical support, and this memo is about how it works.

What is a rota?

Well, there are a bunch of time slots \(\mathcal T\), roles \(\mathcal R\), and people \(\mathcal P\). We can represent the assignments as a 3D binary matrix:

\[ \begin{split} A_{tpr} = \begin{cases} 1,&\text{ if, in time }t\text{, person }p\text{ is scheduled in role }r\\ 0,&\text{otherwise} \end{cases} \end{split} \]

Next we need some constraints on what a valid rota looks like.

For every pair of slots and roles, the sum of the assignments should be 1:

\[ \forall t \in \mathcal T \text{, } \forall r \in \mathcal R \text{, } \sum_{p \in \mathcal P} A_{tpr} = 1 \]

For every pair of slots and people, the sum of the assignments should be 0 (if they’re not assigned anything) or 1 (if they are):

\[ \forall t \in \mathcal T \text{, } \forall p \in \mathcal P \text{, } \sum_{r \in \mathcal R} A_{tpr} \in \{0, 1\} \]

We might give our people time off (how generous!), so there’s no point in generating a rota where someone gets scheduled during their time off.

Given a function \(leave : \mathcal P \mapsto 2^{\mathcal T}\), which gives the set of slots someone is on leave, then: for every pair of slots and people, all roles should be unassigned if the slot is in \(leave(p)\):

\[ \forall p \in \mathcal P \text{, } \forall t \in leave(p) \text{, } \forall r \in \mathcal R \text{, } A_{tpr} = 0 \]

We might also have a maximum number of shifts any one person can be assigned to in a rota.

Given such a limit \(M\), then: for every person, the sum of the assignments across *all* slots should be less than or equal to \(M\):

\[ \forall p \in \mathcal P \text{, } \sum_{t \in \mathcal T} \sum_{r \in \mathcal R} A_{tpr} \leqslant M \]

If all we wanted was constraints, then we could use a SAT solver, and it would probably do a better job than an ILP solver as a SAT solver is *built* for solving boolean constraints! But there’s one thing which is more easily expressible to an ILP solver than a SAT solver: objective functions to optimise.

Given our above constraints, we will get *a* rota, but it might not be very fair. One person might be scheduled ten times, and another not at all. We can encourage the solver to be more fair by providing it with an objective which results in more people being assigned.

First we’ll need an auxiliary variable to check whether someone has been assigned at all:

\[ \begin{split} X_p = \begin{cases} 1,&\text{ if person }p\text{ has any assignments}\\ 0,&\text{otherwise} \end{cases} \end{split} \]

We can use two new constraints to set the value of these \(X\) variables:

\[ \forall t \in \mathcal T \text{, } \forall p \in \mathcal P \text{, } \forall r \in \mathcal R \text{, } X_p \geqslant A_{tpr} \]

\[ \forall p \in \mathcal P \text{, } X_p \leqslant \sum_{t \in \mathcal T} \sum_{r \in \mathcal R} A_{tpr} \]

As both \(A_{tpr}\) and \(X_p\) are binary variables, this means \(X_p\) will be 1 if (first constraint) and only if (second constraint) person \(p\) has any assignments at all.
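You can convince yourself of the if-and-only-if claim by brute force: for a person with a handful of binary assignment variables, enumerate every case and see which values of \(X_p\) satisfy both constraints. A quick check in plain Python (standing in for the solver):

```python
from itertools import product

def feasible_X(assignments):
    # Values of the binary X satisfying both constraints:
    # X >= each A (first constraint) and X <= sum of the A (second).
    return [X for X in (0, 1)
            if all(X >= a for a in assignments) and X <= sum(assignments)]

# For every combination of three binary assignment variables, the only
# feasible X is 1 exactly when some assignment is 1.
for A in product((0, 1), repeat=3):
    assert feasible_X(A) == ([1] if any(A) else [0])
```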

We then give an objective to the solver:

\[ \textbf{maximise } \sum_{p \in \mathcal P} X_p \]

The only way to increase the value of the sum is by assigning roles to more people, so that is what the solver will do.

PuLP is a Python library for interfacing with ILP solvers. It provides a somewhat nicer interface than directly dealing with the matrices and vectors on which ILP solvers operate, letting us express constraints as equations much like I have here.

Here’s how to express the above with PuLP:

```python
import pulp

# Parameters
slots = 0
people = []
roles = []
leave = {}
max_assignments_per_person = 0

# Create the 'problem'
problem = pulp.LpProblem("rota generator", sense=pulp.LpMaximize)

# Create variables
assignments = pulp.LpVariable.dicts("A", ((slot, person, role) for slot in range(slots) for person in people for role in roles), cat="Binary")
is_assigned = pulp.LpVariable.dicts("X", people, cat="Binary")

# Add constraints
for slot in range(slots):
    for role in roles:
        # In every time slot, each role is assigned to exactly one person
        problem += pulp.lpSum(assignments[slot, person, role] for person in people) == 1
    for person in people:
        # Nobody is assigned multiple roles in the same time slot
        problem += pulp.lpSum(assignments[slot, person, role] for role in roles) <= 1

for person, bad_slots in leave.items():
    for slot in bad_slots:
        for role in roles:
            # Nobody is assigned a role in a slot they are on leave for
            problem += assignments[slot, person, role] == 0

for person in people:
    # Nobody works too many shifts
    problem += pulp.lpSum(assignments[slot, person, role] for slot in range(slots) for role in roles) <= max_assignments_per_person

# Constrain 'is_assigned' auxiliary variable
for slot in range(slots):
    for person in people:
        for role in roles:
            # If
            problem += is_assigned[person] >= assignments[slot, person, role]

for person in people:
    # Only if
    problem += is_assigned[person] <= pulp.lpSum(assignments[slot, person, role] for slot in range(slots) for role in roles)

# Add objective
problem += pulp.lpSum(is_assigned[person] for person in people)

# Solve with the Coin/Cbc solver
problem.solve(pulp.solvers.COIN_CMD())

# Print the solution!
for slot in range(slots):
    print(f"Slot {slot}:")
    for role in roles:
        for person in people:
            if pulp.value(assignments[slot, person, role]) == 1:
                print(f"    {role}: {person}")
```

The quantifiers have become `for...in` loops and the summations have become calls to `pulp.lpSum` with a generator expression iterating over the values of interest, but other than that it’s fairly straightforward.

With the parameters:

```python
slots = 5
people = ["Spongebob", "Squidward", "Mr. Crabs", "Pearl"]
roles = ["Fry Cook", "Cashier", "Money Fondler"]
leave = {"Mr. Crabs": [0, 2, 3, 4]}
max_assignments_per_person = 5
```

We get the output:

```
Slot 0:
  Fry Cook: Pearl
  Cashier: Squidward
  Money Fondler: Spongebob
Slot 1:
  Fry Cook: Spongebob
  Cashier: Mr. Crabs
  Money Fondler: Pearl
Slot 2:
  Fry Cook: Spongebob
  Cashier: Squidward
  Money Fondler: Pearl
Slot 3:
  Fry Cook: Spongebob
  Cashier: Pearl
  Money Fondler: Squidward
Slot 4:
  Fry Cook: Squidward
  Cashier: Spongebob
  Money Fondler: Pearl
```

If you play around with this you might notice two things:

- The rota you get is always the same.
- If there is no rota which meets the constraints, you get rubbish out!

This is due to how Cbc works. If you try GLPK, a different solver, you’ll still get a deterministic rota, but if there isn’t one meeting the constraints you’ll (probably) get back an empty rota. Solving ILP in the general case is NP-complete, so solvers use heuristics. Both Cbc and GLPK are deterministic, but they differ in heuristics.

You can check the `problem.status` to see if it’s solved or not:

```python
if problem.status != pulp.constants.LpStatusOptimal:
    raise Exception("Unable to solve problem.")
```

Another way to make the solver go wrong is to have a wide range of coefficient values in your problem. I’m not sure why this causes trouble, but it does.

A simple way to introduce randomisation is to give the solver a randomly generated objective to maximise. For example, we can assign a score to every possible allocation, and try to maximise the overall score:

```python
import random

randomise = pulp.lpSum(
    random.randint(0, 1) * assignments[slot, person, role]
    for slot in range(slots)
    for person in people
    for role in roles)
```

As we want the actual objective function to take priority, scale it up:

```python
# Add objective
problem += pulp.lpSum(is_assigned[person] for person in people) * 100 + randomise
```
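To see why a factor of 100 is enough with the example parameters: the random term is a sum of at most slots × people × roles binary contributions, so it can never reach the 100 points the scaled objective awards per extra person on the rota. A quick sanity check of that arithmetic:

```python
# Sanity check: with the example parameters, the random tie-breaker
# can never outweigh one unit of the scaled-up real objective.
slots, n_people, n_roles = 5, 4, 3

# Worst case: every random coefficient is 1 and every assignment variable is 1.
max_random_bonus = slots * n_people * n_roles  # 60

# One extra person on the rota is worth 100, so it always dominates.
assert max_random_bonus < 100
print(max_random_bonus)  # 60
```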

Now if we run the tool multiple times, we get different rotas:

```
$ python3 rota.py
Slot 0:
  Fry Cook: Spongebob
  Cashier: Squidward
  Money Fondler: Pearl
Slot 1:
  Fry Cook: Pearl
  Cashier: Spongebob
  Money Fondler: Mr. Crabs
Slot 2:
  Fry Cook: Spongebob
  Cashier: Squidward
  Money Fondler: Pearl
Slot 3:
  Fry Cook: Squidward
  Cashier: Spongebob
  Money Fondler: Pearl
Slot 4:
  Fry Cook: Squidward
  Cashier: Pearl
  Money Fondler: Spongebob
$ python3 rota.py
Slot 0:
  Fry Cook: Spongebob
  Cashier: Squidward
  Money Fondler: Pearl
Slot 1:
  Fry Cook: Spongebob
  Cashier: Squidward
  Money Fondler: Mr. Crabs
Slot 2:
  Fry Cook: Spongebob
  Cashier: Squidward
  Money Fondler: Pearl
Slot 3:
  Fry Cook: Pearl
  Cashier: Squidward
  Money Fondler: Spongebob
Slot 4:
  Fry Cook: Squidward
  Cashier: Spongebob
  Money Fondler: Pearl
```

The downside to this approach is that we might accidentally generate a random objective which is really hard to maximise, making the solver do a lot of work when all we really want is an arbitrary solution.

The GOV.UK support rota is a bit more complex than the example above. A typical rota runs for 12 weeks, with 1 week being 1 slot, in the above parlance. There are two types of roles, and constraints about who can occupy which roles:

**In-hours support roles:**

- *Primary in-hours*, must have been secondary in-hours at least three times.
- *Secondary in-hours*, must have been shadow at least two times.
- *Shadow*, must not have shadowed twice before. This role is optional.

**Out-of-hours support roles:**

- *Primary on-call*, no special requirements.
- *Secondary on-call*, must have been primary on-call at least three times.

There’s an asymmetry there: the primary in-hours needs to be experienced, but the opposite is the case for on-call roles. This is intentional! If the primary on-call were more experienced, they would resolve every issue themselves and the less experienced one would never get to learn anything.

There are separate pools for each type: there are some people who can do in-hours support, some people who can do out-of-hours support, and some people who can do both.

To ensure individuals and teams aren’t over-burdened with support roles, there are some constraints about when people can be scheduled:

- Someone can’t be on in-hours support in two adjacent weeks.
- Two people on in-hours support in the same week (or adjacent weeks) can’t be on the same team.

And there is also a limit on the number of in-hours and out-of-hours roles someone can do across the entire rota.
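A candidate rota can be checked against the adjacent-weeks rule with a short pure-Python sketch; the `rota` structure and role names here are illustrative, not the real generator’s:

```python
# Illustrative check of "nobody is on in-hours support in two adjacent weeks".
# A rota here is a dict mapping (week, role) -> person.
IN_HOURS_ROLES = ["Primary", "Secondary", "Shadow"]

def violates_adjacent_weeks(rota, weeks):
    """Return True if someone does in-hours support in two adjacent weeks."""
    for week in range(weeks - 1):
        this_week = {rota.get((week, role)) for role in IN_HOURS_ROLES}
        next_week = {rota.get((week + 1, role)) for role in IN_HOURS_ROLES}
        # Discard None (unfilled roles) before comparing.
        if (this_week & next_week) - {None}:
            return True
    return False

print(violates_adjacent_weeks({(0, "Primary"): "Alice", (1, "Primary"): "Bob"}, 2))      # False
print(violates_adjacent_weeks({(0, "Primary"): "Alice", (1, "Secondary"): "Alice"}, 2))  # True
```

In the ILP itself this is expressed as a family of linear constraints rather than a check; the sketch just shows the rule being enforced.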

The objective function is a bit more complex too:

- As above, we want to maximise the number of people on the rota.
- We want to maximise the number of weeks when the secondary in-hours has done it fewer than three times.
- We want to maximise the number of weeks when the primary out-of-hours has done it fewer than three times.
- We want to maximise the number of weeks with a shadow.

I won’t go through all of the constraints, as they’re mostly more of the same, but one is particularly interesting because it’s pretty hard to implement: the requirement that the primary in-hours must have been secondary in-hours at least three times.

The logic here is simple, but the language of ILP is very limited: you can’t directly express `if...then`-style constraints between variables. Now, this is fine if we want to limit the primary in-hours role to people who have been secondary in-hours at least three times *before this rota period*, as we can statically determine that:

\[ \forall t \in \mathcal T \text{, } \forall p \in \{ p \in \mathcal P \mid p \text{ has been a secondary fewer than three times} \} \text{, } A_{tp,\text{primary}} = 0 \]

But that’s too restrictive. If someone has been secondary in-hours two times before the start of the rota, and is secondary in-hours in one week, they should be able to be primary in-hours in subsequent weeks.

To work around this we’ll need some auxiliary variables.

Firstly, let’s record how many times someone has been a secondary at the start of each slot:

\[ \begin{split} S_{tp} = \begin{cases} \text{the number of times person }p\text{ has been a secondary before the start of this rota},&\text{ if }t = 0\\ S_{t-1,p} + A_{t-1,p,\text{secondary}},&\text{otherwise} \end{cases} \end{split} \]

Unlike previous variables we’ve seen, this is not a binary variable. But it is still an integral variable. Translating the above into ILP constraints is straightforward:

\[ \forall p \in \mathcal P \text{, } S_{0,p} = \text{the number of times person }p\text{ (etc)} \] \[ \forall t \geqslant 1 \in \mathcal T \text{, } \forall p \in \mathcal P \text{, } S_{tp} = S_{t-1,p} + A_{t-1,p,\text{secondary}} \]

Now we can use a trick I found to encode conditionals in ILP. The trick is to introduce an auxiliary variable, \(D \in \{0,1\}\), and use constraints to ensure that \(D = 0\) when the condition goes one way, and \(D = 1\) when it goes the other.

Here is how we encode `if X > k then Y >= 0 else Y <= 0`, where `k` is constant:

\[ \begin{align} 0 &\lt X - k + m \times D \\ 0 &\leqslant Y + m \times D \\ X - k &\leqslant m \times (1 - D) \\ Y &\leqslant m \times (1 - D) \end{align} \]

Here \(X\) and \(Y\) are the ILP variables from our conditional, \(D\) is the auxiliary variable we introduced, and \(m\) is some large constant, way bigger than the possible maximum values of \(X\) or \(Y\). Let’s walk through this, firstly here’s the case where \(D = 0\):

\[ \begin{align} 0 &\lt X - k \\ 0 &\leqslant Y \\ X - k &\leqslant m \\ Y &\leqslant m \end{align} \]

Because \(m\) is a large constant, the bottom two constraints are trivially true, so they can be removed. With a little rearranging, we have:

\[ \begin{align} k &\lt X \\ 0 &\leqslant Y \end{align} \]

So if \(D = 0\), \(X\) is strictly greater than \(k\) (the condition is true), and \(Y \geqslant 0\). That’s the true branch sorted!

Now let’s look at the \(D = 1\) branch:

\[ \begin{align} 0 &\lt X - k + m \\ 0 &\leqslant Y + m \\ X - k &\leqslant 0 \\ Y &\leqslant 0 \end{align} \]

Because \(m\) is a large constant, this time we can get rid of the first two constraints. With a little rearranging, we get:

\[ \begin{align} X &\leqslant k \\ Y &\leqslant 0 \end{align} \]

So if \(D = 1\), \(X\) is not strictly greater than \(k\) (the condition is false) and \(Y\) is at most zero. Remember, the real “\(Y\)” we’re using is an \(A_{tpr}\) value, which is a binary value, so the overall effect is to specify that it must be zero. An equality constraint \(Y = 0\) would do the same job.
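This case analysis can be checked mechanically. Here’s a pure-Python brute force over small illustrative values (\(k = 2\), \(m = 999\), binary \(Y\)) confirming that the four constraints, with some choice of \(D\), admit exactly the \((X, Y)\) pairs the conditional allows:

```python
# Brute-force check of the big-M encoding of
# `if X > k then Y >= 0 else Y <= 0`, with illustrative k and m.
k, m = 2, 999

def encoding_satisfied(x, y, d):
    return (0 < x - k + m * d
            and 0 <= y + m * d
            and x - k <= m * (1 - d)
            and y <= m * (1 - d))

for x in range(6):
    for y in [0, 1]:  # the real Y is a binary A variable
        # The conditional allows (x, y) iff: y >= 0 when x > k, y <= 0 otherwise.
        allowed = (y >= 0) if x > k else (y <= 0)
        assert any(encoding_satisfied(x, y, d) for d in [0, 1]) == allowed

print("encoding matches the conditional")
```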

Each conditional needs a fresh \(D\) variable. So adding these conditionals in results in a lot of extra variables and constraints:

\[ \begin{alignat*}{4} &\forall t \in \mathcal T \text{, } \forall p \in \mathcal P \text{, } & 0 &\lt S_{tp} - 2 + 999 \times D_{tp} \\ &\forall t \in \mathcal T \text{, } \forall p \in \mathcal P \text{, } &0 &\leqslant A_{tp,\text{primary}} + 999 \times D_{tp} \\ &\forall t \in \mathcal T \text{, } \forall p \in \mathcal P \text{, } &S_{tp} - 2 &\leqslant 999 \times (1 - D_{tp}) \\ &\forall t \in \mathcal T \text{, } \forall p \in \mathcal P \text{, } &A_{tp,\text{primary}} &\leqslant 999 \times (1 - D_{tp}) \end{alignat*} \]

Here 2 has been substituted for \(k\), as someone needs to have been a secondary at least three times to be a primary; and 999 has been substituted for \(m\), which is larger than the number of secondary shifts someone could actually have done.
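The \(S\) recurrence itself is easy to sanity-check outside the ILP. A pure-Python sketch (the function name is illustrative) of the earlier scenario, where someone with two prior secondary stints does one more in slot 0:

```python
# S_{tp}: the number of times a person has been secondary before slot t.
def secondary_counts(prior_count, was_secondary):
    """prior_count: secondary stints before this rota;
    was_secondary: one 0/1 entry per slot."""
    s = [prior_count]
    for t in range(len(was_secondary) - 1):
        # S_t = S_{t-1} + A_{t-1, secondary}
        s.append(s[t] + was_secondary[t])
    return s

s = secondary_counts(2, [1, 0, 0])
print(s)  # [2, 3, 3]

# Eligible for primary once S > 2: not in slot 0, but in slots 1 and 2.
print([count > 2 for count in s])  # [False, True, True]
```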

Let’s cover one more type of constraint: not over-burdening teams by taking all of their members away to be on support at once. This one is pretty simple, but does require a bit more information about the people, specifically, what team they’re on.

Given a function \(team : \mathcal P \to 2^{\mathcal P}\), which gives the set of people on the same team as a given person, then: for every slot and every pair of distinct people on the same team, there should be no overlap in the in-hours assignments:

\[ \forall t \in \mathcal T \text{, } \forall p_1 \in \mathcal P \text{, } \forall p_2 \neq p_1 \in team(p_1) \text{, } \\ \forall r_1 \in \{\text{primary}, \text{secondary}, \text{shadow}\} \text{, } \\ \forall r_2 \in \{\text{primary}, \text{secondary}, \text{shadow}\} \text{, } \\ A_{t,p_1,r_1} + A_{t,p_2,r_2} \leqslant 1 \]

My GOV.UK rota generator is on GitHub, and also on Heroku as The Incredible Rota Machine.
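That pairwise constraint translates mechanically into nested loops. A pure-Python sketch of just the index enumeration (the `teams` mapping and names are illustrative):

```python
from itertools import product

# Illustrative team mapping: person -> set of people on their team.
teams = {
    "Alice": {"Alice", "Bob"},
    "Bob": {"Alice", "Bob"},
    "Carol": {"Carol"},
}

IN_HOURS_ROLES = ["primary", "secondary", "shadow"]

def same_team_index_tuples(slots):
    """Yield every (t, p1, p2, r1, r2) the constraint ranges over."""
    for t, p1 in product(range(slots), teams):
        for p2 in teams[p1] - {p1}:
            for r1, r2 in product(IN_HOURS_ROLES, IN_HOURS_ROLES):
                yield (t, p1, p2, r1, r2)

# For each tuple, the ILP constraint would be:
#     A[t, p1, r1] + A[t, p2, r2] <= 1
pairs = list(same_team_index_tuples(2))
print(len(pairs))  # 2 slots x 2 ordered pairs x 9 role combinations = 36
```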

I’ve timed it on my laptop by running it repeatedly overnight, and found that the time to generate a rota varies between about 10s and 15m, but the median is about 30s. I expect it’ll be slower on Heroku, though.

It’s already paying off: I saved the person who usually puts together the rota an hour and a half! A new rota is needed every quarter, and it took me three and a half days to make, so it’ll pay for itself in a mere four and a half years!

It was a fun project, and a neat thing to do in firebreak—the one-week “do whatever you want as long as it’s useful” gap we have between quarters—but probably not worth it if you’re looking to save a bit of time.
