We can check if two expressions have equivalent behaviours, or if one has *fewer* behaviours than the other. Such properties can serve both as documentation and as regression tests.

Let’s dive straight into an example:

Is

`readMVar`

equivalent to a`takeMVar`

followed by a`putMVar`

?

We might phrase this property like so:

prop_read_equiv_take_put = sig readMVar `equivalentTo` sig (\v -> takeMVar v >>= putMVar v)

The property-testing uses *signatures*, where a signature tells dejafu how to (1) create a *new* state; (2) make some *observation* of the state; (3) concurrently *interfere* with the state in some way; and (4) the expression to evaluate.

Properties are monomorphic, so we can’t directly express a property about *any* `MVar`

, we need to pick a concrete type for its contents. Let’s just pick `Int`

:

type State = MVar ConcIO Int

Properties operate in the `ConcIO`

monad. There is no option to use `ConcST`

yet, as I couldn’t get a nice interface working which didn’t break type inference in GHCi.

The state is constructed using a pure *seed value* the property-checker generates. We want to consider both *full* and *empty* `MVar`

s, so we’ll ask it to supply a `Maybe Int`

:

type Seed = Maybe Int

The initialisation function we will include in the signature then just calls `newMVar`

or `newEmptyMVar`

as appropriate:

makeMVar :: Seed -> ConcIO State makeMVar (Just x) = newMVar x makeMVar Nothing = newEmptyMVar

Seed values are generated using LeanCheck, an enumerative property-based testing library.

We want to know if the `MVar`

contains a value when we observe it, and we want to know what that value is; another `Maybe`

:

type Observation = Maybe Int

It is important that the observation function does not block, so we use `tryReadMVar`

here rather than `readMVar`

or `takeMVar`

:

observeMVar :: State -> Seed -> ConcIO Observation observeMVar v _ = tryReadMVar v

It does not matter if making the observation has side-effects, so `tryTakeMVar`

would have been equally valid.

Our interference function will just mess with the value in the `MVar`

:

interfereMVar :: State -> Seed -> ConcIO () interfereMVar mvar mx = do tryTakeMVar mvar void . tryPutMVar mvar $ case mx of Just x -> (x+1) * 3000 Nothing -> 7000

As LeanCheck is enumerative, large values like 3000 and 7000 will stand out if the tool reports a failure.

Now we package these operations up into a signature:

sig :: (State -> ConcIO a) -> Sig State Observation Seed sig e = Sig { initialise = makeMVar , observe = observeMVar , interfere = interfereMVar , expression = void . e }

We could, of course, have defined all this inside `sig`

without the top-level functions and type synonyms.

Now we can test the property:

> check $ sig readMVar `equivalentTo` sig (\v -> takeMVar v >>= putMVar v) *** Failure: (seed Just 0) left: [(Nothing,Just 3000)] right: [(Nothing,Just 0),(Nothing,Just 3000),(Just Deadlock,Just 3000)]

We get a failure! This is because the left term is atomic, whereas the right is not: another thread writing to the `MVar`

has the opportunity to swoop in and insert a new value after the `takeMVar`

but before the `putMVar`

. The right has strictly more behaviours than the left.

We can capture this, by using a different comparison:

> check $ sig readMVar `strictlyRefines` sig (\v -> takeMVar v >>= putMVar v) +++ OK

To “strictly refine” something is to have a proper subset of the behaviour. There is also a “refines” comparison, which does not require the subset to be proper.

Doesn’t

`readMVar v`

return a different thing to`takeMVar v >>= putMVar v`

?*Yes!*If they return at all, the former returns the value in the

`MVar`

, whereas the latter returns unit. Properties do not care about the return value of an expression, only the effects.You can see this by looking at the definition of

`sig`

again: it throws away the result of the expression using`void`

.Both of our properties are of the form

`sig f `cmp` sig g`

, can’t that redundancy be removed?*No!*You can use

*different*signatures with*different*state types! As long as the seed and observation types are the same,`check`

can compare them.You can use this to compare different implementations of a similar concurrent data structure.

Properties can also have arguments, using LeanCheck to generate their values. This doesn’t work in any mysterious way, here’s a property about the `QSemN`

functions:

> check $ \x y -> sig' (\v -> signalQSemN v (x + y)) `equivalentTo` sig' (\v -> signalQSemN v x >> signalQSemN v y) *** Failure: -1 1 (seed 0) left: [(Nothing,0)] right: [(Nothing,0),(Just Deadlock,0)]

You can even use your own types, as long as they have a `Listable`

(the typeclass LeanCheck uses) instance:

> :{ newtype Nat n = Nat n deriving Show instance (Num n, Ord n, Listable n) => Listable (Nat n) where list = [Nat n | n <- list, n >= 0] :} > check $ \(Nat x) (Nat y) -> sig' (\v -> signalQSemN v (x + y)) `equivalentTo` sig' (\v -> signalQSemN v x >> signalQSemN v y) +++ OK!

Currently it’s a bit slow as I need to fiddle with the implementation and work out what a good number of tests to run is. `check`

uses 10 seed values with 100 variable assignments each (1000 tests total), you can use `checkFor`

to reduce that.

So there you have it, property-testing for side-effects of stateful operations.

This has come out of my work on CoCo, a tool for automatically *discovering* these properties (paper here). In the CoCo repository are a few more examples. CoCo is still a work-in-progress, but one of the goals is to be able to generate dejafu-compatible output, so that CoCo can discover properties which dejafu can immediately begin using for regression testing.

As with the other CoCo memos, this is a Literate Haskell file. In the rendered output, highlighted source is literate source, non-highlighted source is just commentary.

We’re going to re-use the `Typeable`

machinery as much as possible, as implementing it all ourself is nasty.

{-# LANGUAGE DataKinds #-} {-# LANGUAGE PolyKinds #-} import Data.Function (on) import Data.List (nub) import Data.Maybe (fromMaybe, isJust) import Data.Proxy (Proxy(..)) import Data.Typeable

Firstly, we need a representation of type variables. We can’t use actual type variables, because `Typeable`

only works with monomorphic types. So we’ll need to introduce some specific types that we shall treat as variables during type-checking. We can get an arbitrary number of these by using a polymorphic type:

data TyVar t = TyVar deriving (Bounded, Enum, Eq, Ord, Read, Show)

For convenience, we shall also have four named type variables:

type A = TyVar 0 type B = TyVar 1 type C = TyVar 2 type D = TyVar 3

If any type needs more than four distinct type variables, it can always introduce its own.

Finally, we can check if a type is a type variable:

isTypeVar :: TypeRep -> Bool isTypeVar = ((==) `on` (fst . splitTyConApp)) (typeRep (Proxy :: Proxy A))

The Data.Typeable module has a handy little function for checking that a function application is well-typed:

funResultTy :: TypeRep -> TypeRep -> Maybe TypeRep

which takes the function type, the argument type, and returns the result type if the application is well-typed. We want something similar, but supporting our type variables. We don’t just want equality, want *unification*. We want a function like so:

-- | Applies a type to a function type. If type-correct, returns an environment binding type -- variables to types, and the result type (with bindings applied). polyFunResultTy :: TypeRep -> TypeRep -> Maybe (TypeEnv, TypeRep) polyFunResultTy fty aty = do -- get the function and argument types let funTyCon = typeRepTyCon (typeRep (Proxy :: Proxy (() -> ()))) (argTy, resultTy) <- case splitTyConApp fty of (con, [argTy, resultTy]) | con == funTyCon -> Just (argTy, resultTy) _ -> Nothing -- produce a (possibly empty) type environment env <- unify aty argTy -- apply the type environment to the result pure (env, applyBindings env resultTy)

**Type Environments:** Firstly, let’s define what a type environment actually is. We’ll keep it very simple: just a map from types to types:

-- | An environment of type bindings. type TypeEnv = [(TypeRep, TypeRep)]

The first in every tuple will be a `TyVar`

, but there isn’t really a good way to statically enforce that.

Now that we have type environments, let’s apply them to a type. This need not make a type fully monomorphic, the bindings may not cover every type variable used, or may define some type variables in terms of others (as long as a variable isn’t defined in terms of itself). This is fine.

-- | Apply type environment bindings to a type. applyBindings :: TypeEnv -> TypeRep -> TypeRep applyBindings env = go where go ty -- look up type variables in the environment, but fall-back to the naked variable if not found | isTypeVar ty = fromMaybe ty (lookup ty env) -- otherwise continue recursively through type constructors | otherwise = let (con, args) = splitTyConApp ty in mkTyConApp con (map go args)

**Unification:** For reasons that will become apparent later, we’re going to define a few variants of this.

Firstly, our standard unification function. It’ll take two types (the order doesn’t matter) and attempt to unify them. Two types unify if:

- They’re structurally equal; OR
- At least one is a type variable; OR
- They have the same constructor with the same number of arguments, and all the arguments unify, with compatible environments.

Hey, that sounds like a recursive function!

-- | Attempt to unify two types. unify :: TypeRep -> TypeRep -> Maybe [(TypeRep, TypeRep)] unify = unify' True

This next function is the actual workhorse. It implements the recursive decision procedure described above and constructs the environment. It takes a flag to determine if unifying with a naked type variable is allowed here. It always is in the recursive case. This will turn out to be useful in the next section, when we’re talking about polymorphic uses of the state type.

-- | Attempt to unify two types. unify' :: Bool -- ^ Whether to allow either type to be a naked type variable at this level (always true in -- lower levels). -> TypeRep -> TypeRep -> Maybe TypeEnv unify' b tyA tyB -- check equality | tyA == tyB = Just [] -- check if one is a naked type variable | isTypeVar tyA = if not b || occurs tyA tyB then Nothing else Just [(tyA, tyB)] | isTypeVar tyB = if not b || occurs tyB tyA then Nothing else Just [(tyB, tyA)] -- deconstruct each and attempt to unify subcomponents | otherwise = let (conA, argsA) = splitTyConApp tyA (conB, argsB) = splitTyConApp tyB in if conA == conB && length argsA == length argsB then unifyAccum True id argsA argsB else Nothing where -- check if a type occurs in another occurs needle haystack = needle == haystack || any (occurs needle) (snd (splitTyConApp haystack))

The recursive listy case is handled by this `unifyAccum`

function, which is mutually recursive with `unify'`

:

-- | An accumulating unify: attempts to unify two lists of types pairwise and checks that the -- resulting assignments do not conflict with the current type environment. unifyAccum :: Bool -> (Maybe TypeEnv -> Maybe TypeEnv) -> [TypeRep] -> [TypeRep] -> Maybe TypeEnv unifyAccum b f as bs = foldr go (Just []) (zip as bs) where go (tyA, tyB) (Just env) = unifyTypeEnvs b env =<< f (unify' b tyA tyB) go _ Nothing = Nothing

The final piece of the unification puzzle is how to combine type environments. This is necessary to be able to unify types like `T A A`

and `T Int B`

. One option is to enforce equality of bindings, but that is too restrictive (it won’t work in the `T`

example, as `Int`

is not `B`

, yet both *do* unify). The correct solution is to unify the bindings. This is yet another mutually recursive function:

-- | Unify two type environments, if possible. unifyTypeEnvs :: Bool -> TypeEnv -> TypeEnv -> Maybe TypeEnv unifyTypeEnvs b env1 env2 = foldr go (Just []) (nub $ map fst env1 ++ map fst env2) where go tyvar acc@(Just env) = case (lookup tyvar env, lookup tyvar env1, lookup tyvar env2) of (_, Just ty1, Just ty2) -> unifyTypeEnvs b env . ((tyvar, ty1):) =<< unify' b ty1 ty2 (x, Just ty1, _) | isJust x -> unifyTypeEnvs b env [(tyvar, ty1)] | otherwise -> Just ((tyvar, ty1):env) (x, _, Just ty2) | isJust x -> unifyTypeEnvs b env [(tyvar, ty2)] | otherwise -> Just ((tyvar, ty2):env) _ -> acc go _ Nothing = Nothing

And now, an example:

λ> data T a b = T λ> unify (typeOf (undefined :: T A A)) (typeOf (undefined :: T Int B)) Just [(TyVar Nat 0,TyVar Nat 1),(TyVar Nat 1,Int)] λ> let funTy = typeOf (undefined :: A -> B -> Bool -> Either A B) λ> polyFunResultTy funTy (typeOf (undefined::Int)) Just ([(TyVar Nat 0,Int)],TyVar Nat 1 -> Bool -> Either Int (TyVar Nat 1))

In a CoCo signature, the state type is monomorphic, but it can be nice to treat it as polymorphic for two reasons:

- Needing to change the types inside the signature because the type
*of the signature*changed is a pain. - It helps avoid repetition.

Here are a couple of function types that may appear in a signature:

MVar Concurrency Int -> Concurrency Int MVar Concurrency Int -> Int -> Concurrency ()

If the state type is `MVar Concurrency Int`

, then (1) is saying that if we change it to `MVar Concurrency Bool`

we now need to change those two types above, and (2) is saying that we’re needlessly repeating the `Int`

. It would be much nicer to have these types in the signature:

MVar Concurrency A -> Concurrency A MVar Concurrency A -> A -> Concurrency ()

Now that we have implemented type unification, we can do this! For a function type in the signature:

- Try to unify every argument,
*excluding naked type variables*, against the state type. - Check that the environments are compatible.
- Apply the combined environment to the function type.

This effect of this is to monomorphise polymorphic uses of the state type. This is good because the state type often determines other argument types, and once we know the concrete types of function arguments we can infer what hole types we need. If functions have totally polymorphic types, hole inference doesn’t work so well.

-- | Monomorphise polymorphic uses of the state type in a function type. monomorphise :: Typeable s => Proxy s -- ^ The state type. -> TypeRep -- ^ The function type. -> TypeRep monomorphise s ty0 = fromMaybe ty0 $ do let stateTy = typeRep s argTys <- funArgTys ty0 env <- unifyAccum False (maybe (Just []) Just) (repeat stateTy) argTys pure (applyBindings env ty0)

Now we see the purpose of the boolean argument to `unify'`

. If we have the type `A -> MVar Concurrency A -> Concurrency ()`

, we don’t want to unify `MVar Concurrency Int`

with `A`

, we want to skip that over! More generally, we want to avoid unifying a fully-polymorphic type with our state type but, as all our type variables have kind `*`

, just preventing the top-level unification suffices.

Oh, we’ll also need this helper function to get all the argument types of a function:

-- | Get all of a function's argument types. Returns @Nothing@ if not a function type. funArgTys :: TypeRep -> Maybe [TypeRep] funArgTys ty = case splitTyConApp ty of (con, [argTy, resultTy]) | con == funTyCon -> Just $ argTy : fromMaybe [] (funArgTys resultTy) _ -> Nothing where funTyCon = typeRepTyCon (typeRep (Proxy :: Proxy (() -> ())))

And we’re done!

λ> let s = Proxy :: Proxy (Either Int Bool) λ> let f = typeOf (undefined :: A -> Either A B -> B) λ> monomorphise s f Int -> Either Int Bool -> Bool]]>

The problem is: given two terms (from different schemas) and their results, how do we compare them? For instance, consider we have two terms:

term1 = f (x :: Int) (y :: Int) (z :: Bool) term2 = g (x :: Int) (y :: Bool) (z :: Bool)

Which fulfil the property that when the `z`

in `term1`

is equal to the `y`

in `term2`

, the terms have the same result. Each term introduces its own environment variable namespace, so what we want is a projection into a shared namespace, in which we can reason about the equalities we desire.

We can simplify the problem a little by not changing the number of unique variables inside a term, only restricting ourselves to identifying (or not) variables across terms. Here are some possible projections of those terms:

f x1 y1 z1 =?= g x1 y2 z2 f x1 y1 z1 =?= g x2 z1 z2 f x1 y1 z1 =?= g x2 y2 z1 f x1 y1 z1 =?= g x2 y1 z2 and so on

So we want to produce a *set* of all such type-correct projections.

This memo is a literate Haskell file (though confusingly I mix source with examples, which are NOT syntax-highlighted), and we’ll need this:

```
{-# LANGUAGE LambdaCase #-}
```

`These`

typeI recently read a blog post entitled These, Align, and Crosswalk about safely zipping (or merging) data structures. The core of the blog post is the `These`

type, defined as so:

-- | The @These@ type is like 'Either', but also has the case for when we have both values. data These a b = This a | That b | These a b deriving Show

I don’t *think* I need anything else from the these package, so I’ll just inline that definition there. It turns out to be exactly the thing we need!

Firstly, we need a representation of terms, from which we can extract variables. We don’t actually need anything other than variable names and types, so let’s just go for that:

type Name = String type Type = Int type Term = [(Name, Type)]

Our challenge is to find two functions:

projections :: Term -> Term -> [[(These Name Name, Type)]] renaming :: (Type -> Name) -> [(These Name Name, Type)] -> ([(Name, Name)], [(Name, Name)])

Where a `These String String`

value represents a type-correct renaming of variables:

`This s`

means a variable from the left term is kept distinct from all variables in the right term.`That s`

means a variable from the right term is kept distinct from all variables in the left term.`These s1 s2`

means a variable from the left term is identified with a variable from the right term.

-- | Find all type-correct ways of associating variables. projections :: Term -> Term -> [[(These Name Name, Type)]] projections t1 [] = [[(This v, ty) | (v, ty) <- t1]] projections [] t2 = [[(That v, ty) | (v, ty) <- t2]] projections ((vL, tyL):t1) t2 = concat [map ((These vL vR, tyL) :) (projections t1 (filter (/=x) t2)) | x@(vR, tyR) <- t2, tyL == tyR] ++ map ((This vL, tyL) :) (projections t1 t2)

By appending the `concat`

to the `map`

we can instead generate in order from most general (most `This`

/`That`

usage) to least general (most `These`

usage).

Now that we have projections, we can produce consistent renamings:

-- | Given a projection into a common namespace, produce a consistent variable renaming. Variables -- of the same type, after the first, will have a number appended starting from 1. renaming :: (Type -> Name) -> [(These Name Name, Type)] -> ([(Name, Name)], [(Name, Name)]) renaming varf = go [] ([], []) where go e x ((these, ty):rest) = let name = varf ty in rename e x name (maybe 0 (+1) $ lookup name e) these rest go e x [] = x rename e ~(l, r) name n = let name' = if n == 0 then name else name ++ show n in \case This vL -> go ((name, n):e) ((vL, name'):l, r) That vR -> go ((name, n):e) (l, (vR, name'):r) These vL vR -> go ((name, n):e) ((vL, name'):l, (vR, name'):r)

The two steps can be combined:

-- | Find all consistent renamings of a pair of terms. renamings :: (Type -> Name) -> Term -> Term -> [([(Name, Name)], [(Name, Name)])] renamings varf t1 t2 = map (renaming varf) (projections t1 t2)

My only regret is that I found no use for the fancier functions in the these package.

]]>This memo is about exhaustively generating schemas. Let’s go!

{-# LANGUAGE KindSignatures #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE StandaloneDeriving #-} import Control.Monad (filterM) import Data.Function (on) import Data.IntMap (IntMap) import qualified Data.IntMap as M import Data.Maybe (maybeToList) import Data.Proxy (Proxy(..)) import Data.Semigroup (Semigroup, (<>)) import Data.Set (Set) import qualified Data.Set as S import Data.Typeable (Typeable, TypeRep, funResultTy, splitTyConApp, typeRep) import GHC.Exts (Any) import Unsafe.Coerce (unsafeCoerce)

To concretely tie down what we’re doing, we’re going to generate expressions in *size* order. Expression size here corresponds roughly to the number of nodes in the tree:

sizeOf :: Exp m h -> Int sizeOf (Lit _ _) = 1 sizeOf (Var _ _) = 1 sizeOf (Bind _ b e) = 1 + sizeOf b + sizeOf e sizeOf (Let _ b e) = 1 + sizeOf b + sizeOf e sizeOf (Ap _ f e) = sizeOf f + sizeOf e

In order to avoid duplicates, we’re going to want sets of expressions. We’re going to cheat a little because we can’t actually compare `BDynamic`

values, so we’ll just compare their types, and hope that that plus the string representations in `Lit`

will be enough to disambiguate.

instance Eq BDynamic where (==) = (==) `on` bdynTypeRep instance Ord BDynamic where compare = compare `on` bdynTypeRep deriving instance Eq h => Eq (Var h) deriving instance Ord h => Ord (Var h) deriving instance Eq h => Eq (Exp m h) deriving instance Ord h => Ord (Exp m h)

We’re going to keep quite a simple interface for our schema generator, we shall have:

- A type of generators, which is a map from size to generated schemas of that size.
- A function to get all the schemas of a size.
- A function to make a new generator from a collection of “primitive” schemas.
- A function to generate a size, assuming all smaller sizes have been generated.

-- | A generator of schemas, in size order. newtype Generator1 m = Generator1 { tiers1 :: IntMap (Set (Schema m)) } -- | Get all schemas of the given size, if generated. schemas1 :: Generator1 m -> Int -> Set (Schema m) schemas1 g i = M.findWithDefault (S.empty) i (tiers1 g)

**Creation**: to make a new generator, we’ll just plug the provided schemas into the appropriate tiers.

-- | Create a new generator from a set of initial schemas. create1 :: [Schema m] -> Generator1 m create1 initial = Generator1 $ M.unionsWith S.union [M.singleton (sizeOf s) (S.singleton s) | s <- initial]

**Generation**: now we see the benefit of all the smart `Maybe`

-returning constructors: we can do the incredibly naive thing of just trying all correctly-sized combinations of already-known schemas. The ones which return a `Just`

are good and shall be kept.

-- | Generate schemas of the given size, assuming all smaller tiers have been generated. generate1 :: (Applicative m, Typeable m) => Int -> Generator1 m -> Generator1 m generate1 i g = Generator1 $ M.unionsWith S.union [ tiers1 g , M.singleton i aps , M.singleton i binds , M.singleton i lets ] where -- sizeOf (ap f e) = 0 + sizeOf f + sizeOf e aps = makeTerms 0 $ \terms candidates -> [ new | f <- terms , e <- candidates , new <- maybeToList (ap f e) ] -- sizeOf (bind is b e) = 1 + sizeOf b + sizeOf e binds = makeTerms 1 $ \terms candidates -> [ new | b <- terms , e <- candidates , holeset <- powerset . map fst $ holes e , new <- maybeToList (bind holeset b e) ] -- sizeOf (let_ is b e) = 1 + sizeOf b + sizeOf e lets = makeTerms 1 $ \terms candidates -> [ new | b <- terms , e <- candidates , holeset <- powerset . map fst $ holes e , new <- maybeToList (let_ holeset b e) ] makeTerms n f = M.foldMapWithKey go (tiers1 g) where go tier terms = S.fromList $ let candidates = schemas1 g (i - tier - n) in f (S.toList terms) (S.toList candidates) powerset = filterM (const [False,True])

Here’s a small demo:

demo1 :: Generator1 IO demo1 = create1 [ hole $ typeRep (Proxy :: Proxy Int) , lit "0" . toBDynamic $ (0 :: Int) , lit "1" . toBDynamic $ (1 :: Int) , lit "+" . toBDynamic $ ((+) :: Int -> Int -> Int) , lit "*" . toBDynamic $ ((*) :: Int -> Int -> Int) , lit "pure" . toBDynamic $ (pure :: Int -> IO Int) ] λ> let upto n = mapM_ print . S.toList $ schemas1 (foldl (flip generate1) demo1 [1..n]) n λ> upto 1 * + 0 1 pure (_ :: Int) λ> upto 2 (pure) (0) (pure) (1) (pure) ((_ :: Int)) (*) (0) (*) (1) (*) ((_ :: Int)) (+) (0) (+) (1) (+) ((_ :: Int)) λ> upto 3 let <*> in <0> let <*> in <1> let <*> in <(_ :: Int)> let <+> in <0> let <+> in <1> let <+> in <(_ :: Int)> let <0> in <0> let <0> in <1> let <0> in <(_ :: Int)> let <0> in <(b0 :: Int)> let <1> in <0> let <1> in <1> let <1> in <(_ :: Int)> let <1> in <(b0 :: Int)> let <pure> in <0> let <pure> in <1> let <pure> in <(_ :: Int)> let <(_ :: Int)> in <0> let <(_ :: Int)> in <1> let <(_ :: Int)> in <(_ :: Int)> let <(_ :: Int)> in <(b0 :: Int)> let <*> in <*> let <*> in <+> let <+> in <*> let <+> in <+> let <0> in <*> let <0> in <+> let <1> in <*> let <1> in <+> let <pure> in <*> let <pure> in <+> let <(_ :: Int)> in <*> let <(_ :: Int)> in <+> let <*> in <pure> let <+> in <pure> let <0> in <pure> let <1> in <pure> let <pure> in <pure> let <(_ :: Int)> in <pure> ((*) (0)) (0) ((*) (0)) (1) ((*) (0)) ((_ :: Int)) ((*) (1)) (0) ((*) (1)) (1) ((*) (1)) ((_ :: Int)) ((*) ((_ :: Int))) (0) ((*) ((_ :: Int))) (1) ((*) ((_ :: Int))) ((_ :: Int)) ((+) (0)) (0) ((+) (0)) (1) ((+) (0)) ((_ :: Int)) ((+) (1)) (0) ((+) (1)) (1) ((+) (1)) ((_ :: Int)) ((+) ((_ :: Int))) (0) ((+) ((_ :: Int))) (1) ((+) ((_ :: Int))) ((_ :: Int))

The current generator is nice and simple, but produces some uninteresting terms:

- Let-bindings where the body has no holes bound:
`let <1> in <pure>`

- Let- and monadic- bindings where the binder is a hole:
`let <(_ :: Int)> in <(b0 :: Int)>`

Note that we do want to keep monadic bindings where the body has no holes bound, as the monadic bind may cause an interesting effect.

Furthermore, we may want to store additional information with the generated schemas, which we can use to prune generation further, and record information for further use.

-- | A generator of schemas with metadata, in size order. newtype Generator2 m ann = Generator2 { tiers2 :: IntMap (Set (Schema m, ann)) }

Generation is now a bit more involved:

-- | Generate schemas of the given size, assuming all smaller tiers have been generated. generate2 :: (Applicative m, Typeable m, Semigroup ann, Ord ann) => (ann -> ann -> Schema m -> Bool) -- ^ A predicate to filter generated schemas. -> Int -> Generator2 m ann -> Generator2 m ann generate2 annp i g = Generator2 $ M.unionsWith S.union [ tiers2 g , M.singleton i aps , M.singleton i binds , M.singleton i lets ] where aps = makeTerms 0 $ \terms candidates -> [ (new, fAnn <> eAnn) -- produce a new annotation by combining the old | (f, fAnn) <- terms , (e, eAnn) <- candidates , new <- maybeToList (ap f e) -- check the new expression and old annotations against the predicate , annp fAnn eAnn new ] binds = makeTerms 1 $ \terms candidates -> [ (new, bAnn <> eAnn) -- produce a new annotation by combining the old | (b, bAnn) <- terms -- don't allow a binder which is a hole , case b of Var _ (Hole _) -> False; _ -> True , (e, eAnn) <- candidates , holeset <- powerset . map fst $ holes e , new <- maybeToList (bind holeset b e) -- check the new expression and old annotations against the predicate , annp bAnn eAnn new ] lets = makeTerms 1 $ \terms candidates -> [ (new, bAnn <> eAnn) -- produce a new annotation by combining the old | (b, bAnn) <- terms -- don't allow a binder which is a hole , case b of Var _ (Hole _) -> False; _ -> True , (e, eAnn) <- candidates , holeset <- powerset . map fst $ holes e -- don't allow an empty holeset , not (null holeset) , new <- maybeToList (let_ holeset b e) -- check the new expression and old annotations against the predicate , annp bAnn eAnn new ] makeTerms n f = M.foldMapWithKey go (tiers2 g) where go tier terms = S.fromList $ let candidates = schemas2 g (i - tier - n) in f (S.toList terms) (S.toList candidates) powerset = filterM (const [False,True])

The `schemas`

and `create`

code are basically the same:

-- | Get all schemas of the given size, if generated. schemas2 :: Generator2 m ann -> Int -> Set (Schema m, ann) schemas2 g i = M.findWithDefault (S.empty) i (tiers2 g) -- | Create a new generator from a set of initial schemas. create2 :: Ord ann => [(Schema m, ann)] -> Generator2 m ann create2 initial = Generator2 $ M.unionsWith S.union [M.singleton (sizeOf e) (S.singleton s) | s@(e,_) <- initial]

Our demo now looks much better:

demo2 :: Generator2 IO () demo2 = create2 $ map (\e -> (e, ())) [ hole $ typeRep (Proxy :: Proxy Int) , lit "0" . toBDynamic $ (0 :: Int) , lit "1" . toBDynamic $ (1 :: Int) , lit "+" . toBDynamic $ ((+) :: Int -> Int -> Int) , lit "*" . toBDynamic $ ((*) :: Int -> Int -> Int) , lit "pure" . toBDynamic $ (pure :: Int -> IO Int) ] λ> let upto n = mapM_ print . S.toList $ schemas2 (foldl (flip $ generate2 \_ _ _ -> True) demo2 [1..n]) n λ> upto 1 ((*,()) (+,()) (0,()) (1,()) (pure,()) ((_ :: Int),()) λ> upto 2 ((pure) (0),()) ((pure) (1),()) ((pure) ((_ :: Int)),()) ((*) (0),()) ((*) (1),()) ((*) ((_ :: Int)),()) ((+) (0),()) ((+) (1),()) ((+) ((_ :: Int)),()) λ> upto 3 (let <0> in <(b0 :: Int)>,()) (let <1> in <(b0 :: Int)>,()) (((*) (0)) (0),()) (((*) (0)) (1),()) (((*) (0)) ((_ :: Int)),()) (((*) (1)) (0),()) (((*) (1)) (1),()) (((*) (1)) ((_ :: Int)),()) (((*) ((_ :: Int))) (0),()) (((*) ((_ :: Int))) (1),()) (((*) ((_ :: Int))) ((_ :: Int)),()) (((+) (0)) (0),()) (((+) (0)) (1),()) (((+) (0)) ((_ :: Int)),()) (((+) (1)) (0),()) (((+) (1)) (1),()) (((+) (1)) ((_ :: Int)),()) (((+) ((_ :: Int))) (0),()) (((+) ((_ :: Int))) (1),()) (((+) ((_ :: Int))) ((_ :: Int)),())

The Mark 3-ig expression types and smart constructors:

data BDynamic = BDynamic { bdynTypeRep :: TypeRep, bdynAny :: Any } toBDynamic :: forall a. Typeable a => a -> BDynamic toBDynamic a = BDynamic (typeRep (Proxy :: Proxy a)) (unsafeCoerce a) data Exp (m :: * -> *) (h :: *) = Lit String BDynamic | Var TypeRep (Var h) | Bind TypeRep (Exp m h) (Exp m h) | Let TypeRep (Exp m h) (Exp m h) | Ap TypeRep (Exp m h) (Exp m h) instance Show (Exp m h) where show (Lit s _) = s show (Var ty v) = "(" ++ show v ++ " :: " ++ show ty ++ ")" show (Bind _ b e) = "bind <" ++ show b ++ "> in <" ++ show e ++ ">" show (Let _ b e) = "let <" ++ show b ++ "> in <" ++ show e ++ ">" show (Ap _ f e) = "(" ++ show f ++ ") (" ++ show e ++ ")" data Var h = Hole h | Named String | Bound Int instance Show (Var h) where show (Hole _) = "_" show (Named s) = s show (Bound i) = 'b' : show i type Schema m = Exp m () data Ignore = Ignore deriving (Bounded, Enum, Eq, Ord, Read, Show) typeOf :: Exp m h -> TypeRep typeOf (Lit _ dyn) = bdynTypeRep dyn typeOf (Var ty _) = ty typeOf (Bind ty _ _) = ty typeOf (Let ty _ _) = ty typeOf (Ap ty _ _) = ty lit :: String -> BDynamic -> Exp m h lit = Lit hole :: TypeRep -> Schema m hole ty = Var ty (Hole ()) let_ :: [Int] -> Schema m -> Schema m -> Maybe (Schema m) let_ is b e0 = Let (typeOf e0) b <$> letOrBind is (typeOf b) e0 bind :: forall m. Typeable m => [Int] -> Schema m -> Schema m -> Maybe (Schema m) bind is b e0 = case (splitTyConApp (typeOf b), splitTyConApp (typeOf e0)) of ((btyCon, btyArgs), (etyCon, etyArgs)) | btyCon == mtyCon && btyCon == etyCon && not (null btyArgs) && not (null etyArgs) && mtyArgs == init btyArgs && init btyArgs == init etyArgs -> Bind (typeOf e0) b <$> letOrBind is (last btyArgs) e0 _ -> Nothing where (mtyCon, mtyArgs) = splitTyConApp (typeRep (Proxy :: Proxy m)) ap :: forall m h. (Applicative m, Typeable m) => Exp m h -> Exp m h -> Maybe (Exp m h) ap f e = case (splitTyConApp (typeOf f), splitTyConApp (typeOf e)) of ((_, [fargTy,fresTy]), (etyCon, etyArgs)) | fargTy == ignoreTy && etyCon == mtyCon && not (null etyArgs) && mtyArgs == init etyArgs -> Just (Ap fresTy f e) | otherwise -> (\ty -> Ap ty f e) <$> typeOf f `funResultTy` typeOf e _ -> Nothing where ignoreTy = typeRep (Proxy :: Proxy (m Ignore)) (mtyCon, mtyArgs) = splitTyConApp (typeRep (Proxy :: Proxy m)) letOrBind :: [Int] -> TypeRep -> Exp m h -> Maybe (Exp m h) letOrBind is boundTy e0 = fst <$> go 0 0 e0 where go n i (Var ty (Hole h)) | i `elem` is = if boundTy == ty then Just (Var ty (Bound n), i + 1) else Nothing | otherwise = Just (Var ty (Hole h), i + 1) go n i (Bind ty b e) = do (b', i') <- go n i b (e', i'') <- go (n+1) i' e Just (Bind ty b' e', i'') go n i (Let ty b e) = do (b', i') <- go n i b (e', i'') <- go (n+1) i' e Just (Let ty b' e', i'') go n i (Ap ty f e) = do (f', i') <- go n i f (e', i'') <- go n i' e Just (Ap ty f' e', i'') go _ i e = Just (e, i) holes :: Schema m -> [(Int, TypeRep)] holes = fst . go 0 where go i (Var ty (Hole _)) = ([(i, ty)], i + 1) go i (Let _ b e) = let (bhs, i') = go i b (ehs, i'') = go i' e in (bhs ++ ehs, i'') go i (Ap _ f e) = let (fhs, i') = go i f (ehs, i'') = go i' e in (fhs ++ ehs, i'') go i _ = ([], i)]]>

This is a simplified version of a problem I’ve been having in CoCo. The current handling of variables in CoCo is very poor, the programmer has to specify exactly which variables may be introduced, binds and lets cause shadowing, and no care is taken to avoid generating alpha equivalent terms.

Here are two representative problems:

If you have the variable

`x`

, and no others, a term like this will not be generated:`f >>= \x1 -> g x1 x`

If you have the variables

`x`

and`y`

of the same type, these equivalent terms will be generated:`f x`

and`f y`

Let’s get started…

{-# LANGUAGE KindSignatures #-} {-# LANGUAGE ScopedTypeVariables #-} import Data.Dynamic (Dynamic, dynApply, dynTypeRep) import Data.Function (on) import Data.List (groupBy, nub, sortOn) import Data.Maybe (mapMaybe, maybeToList) import Data.Ord (Down(..)) import Data.Proxy (Proxy(..)) import Data.Typeable (Typeable, TypeRep, funResultTy, splitTyConApp, typeOf, typeRep) import Data.Void (Void, absurd) import GHC.Exts (Any) import Unsafe.Coerce (unsafeCoerce)

Our expressions are representations of Haskell code, which makes this a bit unlike most toy expression evaluators you see in tutorials. We want everything to be very typed, and not expose the constructors of the expression data type, using smart constructors to ensure that only well-typed expressions can be created.

It’s this typing that makes it really difficult to use the bound library here, as bound doesn’t play so nicely with typed variables. You can infer types for variables later, but I don’t really want to implement that.

-- | A type for expressions. This would not be exported by actual library code, to prevent the user -- from mucking around with the types. data Exp1 = Lit1 String Dynamic -- ^ Literal values are just dynamically-wrapped Haskell values. | Hole1 TypeRep -- ^ "Holes" are free variables, identified by their position in the tree. | Named1 TypeRep String -- ^ Variables which are bound by a provided environment. More on this later. | Bound1 TypeRep Int -- ^ Variables which are bound by a let, with de Bruijn indices. All variables bound to the same -- let must have the correct type. | Let1 TypeRep Exp1 Exp1 -- ^ The binder is assumed to be of the correct type for all the variables bound to it. | Ap1 TypeRep Exp1 Exp1 -- ^ The parameter is assumed to be of the correct type for the function. instance Show Exp1 where show (Lit1 s _) = s show (Hole1 ty) = "(_ :: " ++ show ty ++ ")" show (Named1 ty s) = "(" ++ s ++ " :: " ++ show ty ++ ")" show (Bound1 ty i) = "(" ++ show i ++ " :: " ++ show ty ++ ")" show (Let1 _ b e) = "let <" ++ show b ++ "> in <" ++ show e ++ ">" show (Ap1 _ f e) = "(" ++ show f ++ ") (" ++ show e ++ ")" -- | Get the type of an expression. typeOf1 :: Exp1 -> TypeRep typeOf1 (Lit1 _ dyn) = dynTypeRep dyn typeOf1 (Hole1 ty) = ty typeOf1 (Named1 ty _) = ty typeOf1 (Bound1 ty _) = ty typeOf1 (Let1 ty _ _) = ty typeOf1 (Ap1 ty _ _) = ty -- | Get all the holes in an expression, identified by position. holes1 :: Exp1 -> [(Int, TypeRep)] holes1 = fst . go 0 where go i (Hole1 ty) = ([(i, ty)], i + 1) go i (Let1 _ b e) = let (bhs, i') = go i b (ehs, i'') = go i' e in (bhs ++ ehs, i'') go i (Ap1 _ f e) = let (fhs, i') = go i f (ehs, i'') = go i' e in (fhs ++ ehs, i'') go i _ = ([], i) -- | Get all the named variables in an expression. names1 :: Exp1 -> [(String, TypeRep)] names1 = nub . go where go (Named1 ty s) = [(s, ty)] go (Let1 _ b e) = go b ++ go e go (Ap1 _ f e) = go f ++ go e go _ = []

**Smart constructors**: we only want to be able to produce type-correct expressions, for two reasons: the evaluator in CoCo is more complex and does a lot of unsafe coercion, and being able to just call `error`

when the types don’t work out is nicer than needing to actually handle it; and it makes it easier to generate terms programmatically, as you simply try all possibilities and keep the ones which succeed.

-- | Construct a literal value. lit1 :: String -> Dynamic -> Exp1 lit1 = Lit1 -- | Construct a typed hole. hole1 :: TypeRep -> Exp1 hole1 = Hole1 -- | Perform a function application, if type-correct. ap1 :: Exp1 -> Exp1 -> Maybe Exp1 ap1 f e = (\ty -> Ap1 ty f e) <$> typeOf1 f `funResultTy` typeOf1 e -- | Bind a collection of holes, if type-correct. -- -- The binding is applied "atomically", in that you don't need to worry about holes disappearing and -- so changing their position-indices while this operation happens; however the position of unbound -- holes may be altered in the result of this function. let1 :: [Int] -> Exp1 -> Exp1 -> Maybe Exp1 let1 is b e0 = Let1 (typeOf1 e0) b . fst <$> go 0 0 e0 where go n i (Hole1 ty) | i `elem` is = if typeOf1 b == ty then Just (Bound1 ty n, i + 1) else Nothing | otherwise = Just (Hole1 ty, i + 1) go n i (Let1 ty b e) = do (b', i') <- go n i b (e', i'') <- go (n+1) i' e Just (Let1 ty b' e', i'') go n i (Ap1 ty f e) = do (f', i') <- go n i f (e', i'') <- go n i' e Just (Ap1 ty f' e', i'') go _ i e = Just (e, i) -- | Give names to holes, if type-correct. -- -- This has the same indexing behaviour as 'let1'. name1 :: [(Int, String)] -> Exp1 -> Maybe Exp1 name1 is e0 = (\(e,_,_) -> e) <$> go [] 0 e0 where go env i n@(Named1 ty s) = case lookup s env of -- if a name gets re-used it had better be at the same type! Just sty | ty == sty -> Just (n, env, i) | otherwise -> Nothing Nothing -> Just (n, env, i) go env i (Hole1 ty) = case lookup i is of Just s -> case lookup s env of Just sty | ty == sty -> Just (Named1 ty s, env, i + 1) | otherwise -> Nothing Nothing -> Just (Named1 ty s, (s,ty):env, i + 1) Nothing -> Just (Hole1 ty, env, i + 1) go env i (Let1 ty b e) = do (b', env', i') <- go env i b (e', env'', i'') <- go env' i' e Just (Let1 ty b' e', env'', i'') go env i (Ap1 ty f e) = do (f', env', i') <- go env i f (e', env'', i'') <- go env' i' e Just (Ap1 ty f' e', env'', i'') go env i e = Just (e, env, i)

**Evaluation**: now we have everything in place to evaluate expressions with no unbound holes. Everything is type-correct-by-construction, so if there are no holes (and the global environment has everything we need) we can get out a value.

-- | Evaluate an expression if it has no holes. eval1 :: [(String, Dynamic)] -> Exp1 -> Maybe Dynamic eval1 globals = go [] where -- the local environment is a list of values, with each new scope prepending a value; this means -- that the de Bruijn indices correspond with list indices go locals (Let1 _ b e) = (\dyn -> go (dyn:locals) e) =<< go locals b go locals (Bound1 _ n) = Just (locals !! n) -- named variables come from the global environment go _ (Named1 ty s) = case lookup s globals of Just dyn | dynTypeRep dyn == ty -> Just dyn _ -> Nothing -- the other operations don't care about either environment go locals (Ap1 _ f e) = do dynf <- go locals f dyne <- go locals e dynf `dynApply` dyne go _ (Lit1 _ dyn) = Just dyn go _ (Hole1 _) = Nothing

**Removing Holes**: we still have one more problem, it would be nice for holes to be given names automatically, and not just individual holes, but groups of holes too.

For example, if we have an expression like so:

f (_ :: Int) (_ :: Bool) (_ :: Bool) (_ :: Int)

It would be nice to be able to generate these expressions automatically:

f (w :: Int) (x :: Bool) (y :: Bool) (z :: Int) f (w :: Int) (x :: Bool) (y :: Bool) (w :: Int) f (w :: Int) (x :: Bool) (x :: Bool) (z :: Int) f (w :: Int) (x :: Bool) (x :: Bool) (w :: Int)

It would be particularly nice if they were generated in a list in that order, from most free to most constrained.

-- | From an expression that may have holes, generate a list of expressions with named variables -- substituted instead, ordered from most free (one hole per variable) to most constrained (multiple -- holes per variable). -- -- This takes a function to assign a letter to each type, subsequent variables of the same type have -- digits appended. terms1 :: (TypeRep -> Char) -> Exp1 -> [Exp1] terms1 nf = sortOn (Down . length . names1) . go where go e0 = case hs e0 of [] -> [e0] (chosen:_) -> concatMap go [ e | ps <- partitions chosen , let (((_,tyc):_):_) = ps , let vname i = if i == 0 then [nf tyc] else nf tyc : show i , let naming = concat $ zipWith (\i vs -> [(v, vname i) | (v,_) <- vs]) [0..] ps , e <- maybeToList (name1 naming e0) ] -- holes grouped by type hs = groupBy ((==) `on` snd) . sortOn snd . holes1 -- all partitions of a list partitions (x:xs) = [[x]:p | p <- partitions xs] ++ [(x:ys):yss | (ys:yss) <- partitions xs] partitions [] = [[]]

Here’s an example from ghci:

λ> let intHole = hole1 $ T.typeOf (5::Int) λ> let boolHole = hole1 $ T.typeOf True λ> let ibf = lit1 "f" (D.toDyn ((\_ _ _ a -> a) :: Int -> Bool -> Bool -> Int -> Int)) λ> let ibfExp = fromJust $ do { x <- ibf `ap1` intHole; y <- x `ap1` boolHole; z <- y `ap1` boolHole; z `ap1` intHole } λ> mapM_ print $ terms1 (head.show) ibfExp ((((f) ((I :: Int))) ((B :: Bool))) ((B1 :: Bool))) ((I1 :: Int)) ((((f) ((I :: Int))) ((B :: Bool))) ((B1 :: Bool))) ((I :: Int)) ((((f) ((I :: Int))) ((B :: Bool))) ((B :: Bool))) ((I1 :: Int)) ((((f) ((I :: Int))) ((B :: Bool))) ((B :: Bool))) ((I :: Int))

Pretty sweet!

What we have now is pretty good, but it leaves a little to be desired: it would be nice to be able to statically forbid passing expressions with holes to `eval`

. As always in Haskell, the solution is to add another type parameter.

data Exp2 h = Lit2 String Dynamic | Var2 TypeRep (Var2 h) -- ^ One constructor for holes, named, and bound variables. | Let2 TypeRep (Exp2 h) (Exp2 h) | Ap2 TypeRep (Exp2 h) (Exp2 h) instance Show (Exp2 h) where show (Lit2 s _) = s show (Var2 ty v) = "(" ++ show v ++ " :: " ++ show ty ++ ")" show (Let2 _ b e) = "let <" ++ show b ++ "> in <" ++ show e ++ ">" show (Ap2 _ f e) = "(" ++ show f ++ ") (" ++ show e ++ ")" data Var2 h = Hole2 h -- ^ Holes get a typed tag. | Named2 String -- ^ Environment variables. | Bound2 Int -- ^ Let-bound variables. instance Show (Var2 h) where show (Hole2 _) = "_" show (Named2 s) = s show (Bound2 i) = show i

**Schemas and Terms**: what does this hole tag buy us? Well, actually, it lets us very simply forbid the presence of holes! Constructing an `h`

value is required to construct a hole, so if we set it to `Void`

, then no holes can be made at all! If the tag is some inhabited type, then an expression may contain holes (but may not).

Let’s introduce two type synonyms to talk about these:

-- | A schema is an expression which may contain holes. A single schema may correspond to many -- terms. type Schema2 = Exp2 () -- | A term is an expression with no holes. Many terms may correspond to a single schema. type Term2 = Exp2 Void -- | Convert a Schema into a Term if there are no holes. toTerm2 :: Schema2 -> Maybe Term2 toTerm2 (Lit2 s dyn) = Just (Lit2 s dyn) toTerm2 (Var2 ty v) = case v of Hole2 _ -> Nothing Named2 s -> Just (Var2 ty (Named2 s)) Bound2 i -> Just (Var2 ty (Bound2 i)) toTerm2 (Let2 ty b e) = Let2 ty <$> toTerm2 b <*> toTerm2 e toTerm2 (Ap2 ty f e) = Ap2 ty <$> toTerm2 f <*> toTerm2 e

**Evaluation & Hole Removal**: now we can evaluate *terms* after removing holes from *schemas*. Statically-checked guarantees that we’re dealing with all of our holes properly, nice!

-- | Evaluate a term eval2 :: [(String, Dynamic)] -> Term2 -> Maybe Dynamic eval2 globals = go [] where go locals (Let2 _ b e) = (\dyn -> go (dyn:locals) e) =<< go locals b go locals v@(Var2 _ _) = env locals v go locals (Ap2 _ f e) = do dynf <- go locals f dyne <- go locals e dynf `dynApply` dyne go _ (Lit2 _ dyn) = Just dyn env locals (Var2 _ (Bound2 n)) | length locals > n = Just (locals !! n) | otherwise = Nothing env _ (Var2 ty (Named2 s)) = case lookup s globals of Just dyn | dynTypeRep dyn == ty -> Just dyn _ -> Nothing env _ (Var2 _ (Hole2 v)) = absurd v -- this is actually unreachable now -- | From a schema that may have holes, generate a list of terms with named variables -- substituted instead. terms2 :: (TypeRep -> Char) -> Schema2 -> [Term2] terms2 nf = mapMaybe toTerm2 . sortOn (Down . length . names2) . go where go e0 = case hs e0 of [] -> [e0] (chosen:_) -> concatMap go [ e | ps <- partitions chosen , let (((_,tyc):_):_) = ps , let vname i = if i == 0 then [nf tyc] else nf tyc : show i , let naming = concat $ zipWith (\i vs -> [(v, vname i) | (v,_) <- vs]) [0..] ps , e <- maybeToList (name2 naming e0) ] -- holes grouped by type hs = groupBy ((==) `on` snd) . sortOn snd . holes2 -- all partitions of a list partitions (x:xs) = [[x]:p | p <- partitions xs] ++ [(x:ys):yss | (ys:yss) <- partitions xs] partitions [] = [[]]

The rest of the code hasn’t changed much, but is included for completeness:

-- | Get the type of an expression. typeOf2 :: Exp2 h -> TypeRep typeOf2 (Lit2 _ dyn) = dynTypeRep dyn typeOf2 (Var2 ty _) = ty typeOf2 (Let2 ty _ _) = ty typeOf2 (Ap2 ty _ _) = ty -- | Get all the holes in an expression, identified by position. holes2 :: Schema2 -> [(Int, TypeRep)] holes2 = fst . go 0 where go i (Var2 ty (Hole2 _)) = ([(i, ty)], i + 1) -- tag is ignored go i (Let2 _ b e) = let (bhs, i') = go i b (ehs, i'') = go i' e in (bhs ++ ehs, i'') go i (Ap2 _ f e) = let (fhs, i') = go i f (ehs, i'') = go i' e in (fhs ++ ehs, i'') go i _ = ([], i) -- | Get all the named variables in an expression. names2 :: Exp2 h -> [(String, TypeRep)] names2 = nub . go where go (Var2 ty (Named2 s)) = [(s, ty)] go (Let2 _ b e) = go b ++ go e go (Ap2 _ f e) = go f ++ go e go _ = [] -- | Construct a literal value. lit2 :: String -> Dynamic -> Exp2 h lit2 = Lit2 -- | Construct a typed hole. hole2 :: TypeRep -> Schema2 hole2 ty = Var2 ty (Hole2 ()) -- holes get tagged with unit -- | Perform a function application, if type-correct. ap2 :: Exp2 h -> Exp2 h -> Maybe (Exp2 h) ap2 f e = (\ty -> Ap2 ty f e) <$> typeOf2 f `funResultTy` typeOf2 e -- | Bind a collection of holes, if type-correct. let2 :: [Int] -> Schema2 -> Schema2 -> Maybe Schema2 let2 is b e0 = Let2 (typeOf2 e0) b . fst <$> go 0 0 e0 where go n i (Var2 ty (Hole2 h)) | i `elem` is = if typeOf2 b == ty then Just (Var2 ty (Bound2 n), i + 1) else Nothing -- tag is ignored | otherwise = Just (Var2 ty (Hole2 h), i + 1) -- tag is preserved go n i (Let2 ty b e) = do (b', i') <- go n i b (e', i'') <- go (n+1) i' e Just (Let2 ty b' e', i'') go n i (Ap2 ty f e) = do (f', i') <- go n i f (e', i'') <- go n i' e Just (Ap2 ty f' e', i'') go _ i e = Just (e, i) -- | Give names to holes, if type-correct. name2 :: [(Int, String)] -> Schema2 -> Maybe Schema2 name2 is e0 = (\(e,_,_) -> e) <$> go [] 0 e0 where go env i n@(Var2 ty (Named2 s)) = case lookup s env of Just sty | ty == sty -> Just (n, env, i) | otherwise -> Nothing Nothing -> Just (n, env, i) go env i (Var2 ty (Hole2 h)) = case lookup i is of Just s -> case lookup s env of Just sty | ty == sty -> Just (Var2 ty (Named2 s), env, i + 1) -- tag is ignored | otherwise -> Nothing Nothing -> Just (Var2 ty (Named2 s), (s,ty):env, i + 1) -- tag is ignored Nothing -> Just (Var2 ty (Hole2 h), env, i + 1) -- tag is preserved go env i (Let2 ty b e) = do (b', env', i') <- go env i b (e', env'', i'') <- go env' i' e Just (Let2 ty b' e', env'', i'') go env i (Ap2 ty f e) = do (f', env', i') <- go env i f (e', env'', i'') <- go env' i' e Just (Ap2 ty f' e', env'', i'') go env i e = Just (e, env, i)

In the Mark 3 evaluator, we’re going to need a function of type `Monad m => m Dynamic -> Dynamic`

, which “pushes” the `m`

inside the `Dynamic`

, and of type `Monad m => Dynamic -> Maybe (m Dynamic)`

. Unfortunately, Data.Dynamic doesn’t provide a way to do this, for good reason: it’s impossible in general! There’s no way to know what the type of the dynamic value inside the monad is, so there’s no way to do this safely.

Fortunately, implementing a Data.Dynamic-lite is pretty simple.

-- | A dynamic value is a pair of its type and 'Any'. Any is a magical type which is guaranteed to -- | work with 'unsafeCoerce'. data BDynamic = BDynamic { bdynTypeRep :: TypeRep, bdynAny :: Any } instance Show BDynamic where show = show . bdynTypeRep

(`BDynamic`

for “barrucadu’s dynamic”)

We need to be able to construct and deconstruct dynamic values, these operations do use `unsafeCoerce`

, but are safe:

-- | Convert an arbitrary value into a dynamic one. toBDynamic :: Typeable a => a -> BDynamic toBDynamic a = BDynamic (typeOf a) (unsafeCoerce a) -- | Convert a dynamic value into an ordinary value, if the types match. fromBDynamic :: Typeable a => BDynamic -> Maybe a fromBDynamic (BDynamic ty v) = case unsafeCoerce v of -- this is a bit mind-bending, but the 'typeOf r' here is the type of the 'a', as 'unsafeCoerce -- v :: a' (regardless of whether it actually is an 'a' value or not). The same result could be -- achieved using ScopedTypeVariables and 'typeRep'. r | ty == typeOf r -> Just r | otherwise -> Nothing

The final operation needed for the Marks 1 and 2 implementation is function application:

-- | Dynamically-typed function application. bdynApply :: BDynamic -> BDynamic -> Maybe BDynamic bdynApply (BDynamic ty1 f) (BDynamic ty2 x) = case funResultTy ty1 ty2 of Just ty3 -> Just (BDynamic ty3 ((unsafeCoerce f) x)) Nothing -> Nothing

Now we can construct our strange monad-shuffling operations:

-- | "Push" a functor inside a dynamic value, given the type of the resultant value. -- -- This is unsafe because if the type is incorrect and the value is later used as that type, good -- luck. unsafeWrapFunctor :: Functor f => TypeRep -> f BDynamic -> BDynamic unsafeWrapFunctor ty fdyn = BDynamic ty (unsafeCoerce $ fmap bdynAny fdyn) -- | "Extract" a functor from a dynamic value. unwrapFunctor :: forall f. (Functor f, Typeable f) => BDynamic -> Maybe (f BDynamic) unwrapFunctor (BDynamic ty v) = case splitTyConApp ty of (tyCon, tyArgs) | tyCon == ftyCon && not (null tyArgs) && init tyArgs == ftyArgs -> Just $ BDynamic (last tyArgs) <$> unsafeCoerce v _ -> Nothing where (ftyCon, ftyArgs) = splitTyConApp (typeRep (Proxy :: Proxy f))

It’s almost a shame that Data.Dynamic doesn’t expose enough to implement this. It has gone for a safe but limited API. A common Haskell “design” pattern is to have safe public APIs and unsafe but publically-exposed “internal” APIs, but base doesn’t seem to follow that.

This expression representation is pretty nice, but it’s rather cumbersome to express monadic operations for a couple of reasons:

Everything is monomorphic, so there would need to be a separate

`lit`

for`>>=`

at every desired type.Due to function application having a

`Maybe`

result, even at a single type writing`ap2 (lit2 ((>>=) :: Type)) e1 >>= \f -> ap2 f e2`

is not nice.The original need for this expression representation was for generating Haskell terms, and generating lambda terms is tricky; it would be nice to be able to bind holes directly.

This calls for a third representation of expressions. For reasons that will become apparent when looking at the evaluator, we’ll specalise this to only working in one monad, and track the monad type as another parameter of `Exp`

:

data Exp3 (m :: * -> *) (h :: *) = Lit3 String BDynamic | Var3 TypeRep (Var3 h) | Bind3 TypeRep (Exp3 m h) (Exp3 m h) | Let3 TypeRep (Exp3 m h) (Exp3 m h) | Ap3 TypeRep (Exp3 m h) (Exp3 m h) instance Show (Exp3 m h) where show (Lit3 s _) = s show (Var3 ty v) = "(" ++ show v ++ " :: " ++ show ty ++ ")" show (Bind3 _ b e) = "bind <" ++ show b ++ "> in <" ++ show e ++ ">" show (Let3 _ b e) = "let <" ++ show b ++ "> in <" ++ show e ++ ">" show (Ap3 _ f e) = "(" ++ show f ++ ") (" ++ show e ++ ")"

**Construction**: bind is going to be treated just as a let with special evaluation rules. This means that de Bruijn indices will be able to refer to a bind or a let. Rather than have two separate counters for those, we’ll just put everything in the same namespace (index-space?).

-- | Bind a collection of holes, if type-correct. let3 :: [Int] -> Schema3 m -> Schema3 m -> Maybe (Schema3 m) let3 is b e0 = Let3 (typeOf3 e0) b <$> letOrBind3 is (typeOf3 b) e0 -- | Monadically bind a collection of holes, if type-correct. -- -- This has the same indexing behaviour as 'let3'. bind3 :: forall m. Typeable m => [Int] -> Schema3 m -> Schema3 m -> Maybe (Schema3 m) bind3 is b e0 = case (splitTyConApp (typeOf3 b), splitTyConApp (typeOf3 e0)) of ((btyCon, btyArgs), (etyCon, etyArgs)) | btyCon == mtyCon && btyCon == etyCon && not (null btyArgs) && not (null etyArgs) && mtyArgs == init btyArgs && init btyArgs == init etyArgs -> Bind3 (typeOf3 e0) b <$> letOrBind3 is (last btyArgs) e0 _ -> Nothing where (mtyCon, mtyArgs) = splitTyConApp (typeRep (Proxy :: Proxy m)) -- | A helper for 'bind3' and 'let3': bind holes to the top of the expression. letOrBind3 :: [Int] -> TypeRep -> Exp3 m h -> Maybe (Exp3 m h) letOrBind3 is boundTy e0 = fst <$> go 0 0 e0 where go n i (Var3 ty (Hole3 h)) | i `elem` is = if boundTy == ty then Just (Var3 ty (Bound3 n), i + 1) else Nothing | otherwise = Just (Var3 ty (Hole3 h), i + 1) go n i (Bind3 ty b e) = do -- a new case for Bind3, the same as the case for Let3 (b', i') <- go n i b (e', i'') <- go (n+1) i' e Just (Bind3 ty b' e', i'') go n i (Let3 ty b e) = do (b', i') <- go n i b (e', i'') <- go (n+1) i' e Just (Let3 ty b' e', i'') go n i (Ap3 ty f e) = do (f', i') <- go n i f (e', i'') <- go n i' e Just (Ap3 ty f' e', i'') go _ i e = Just (e, i)

We could make `letOrBind3`

also work for `name3`

by carrying around a third state token and letting the `Var3`

case apply an arbitrary function.

**Evaluation**: the new bind case, unfortunately, complicates things somewhat here. It’s *much* more awkward to deal with errors during evaluation, but fortunately the only errors that can actually arise are unbound named variables: the smart constructors ensure expressions are well-typed and have valid de Bruijn indices. This means we can just check the named variables for validity up front and then use `error`

once we’re sure there actually are no errors.

-- | Evaluate a term eval3 :: forall m. (Monad m, Typeable m) => [(String, BDynamic)] -> Term3 m -> Maybe BDynamic eval3 globals e0 | all check (names3 e0) = Just (go [] e0) | otherwise = Nothing where go locals (Bind3 ty b e) = case (unwrapFunctor :: BDynamic -> Maybe (m BDynamic)) (go locals b) of Just mdyn -> unsafeWrapFunctor ty $ mdyn >>= \dyn -> case unwrapFunctor (go (dyn:locals) e) of Just dyn -> dyn Nothing -> error "type error I can't deal with here!" -- this is unreachable Nothing -> error "type error I can't deal with here!" -- this is unreachable go locals (Let3 _ b e) = go (go locals b : locals) e go locals v@(Var3 _ _) = case env locals v of Just dyn -> dyn Nothing -> error "environment error I can't deal with here!" -- this is unreachable go locals (Ap3 _ f e) = case go locals f `bdynApply` go locals e of Just dyn -> dyn Nothing -> error "type error I can't deal with here!" -- this is unreachable go _ (Lit3 _ dyn) = dyn env locals (Var3 _ (Bound3 n)) | length locals > n = Just (locals !! n) | otherwise = Nothing env _ (Var3 ty (Named3 s)) = case lookup s globals of Just dyn | bdynTypeRep dyn == ty -> Just dyn _ -> Nothing env _ (Var3 _ (Hole3 v)) = absurd v check (s, ty) = case lookup s globals of Just dyn -> bdynTypeRep dyn == ty Nothing -> False

Now it becomes apparent why the monad type parameter is needed in the expression type, the evaluator uses `>>=`

, and so it needs to know which monad to bind it as. An alternative would be to use a type like this, but this still restricts you to using a single monad and so doesn’t gain anything:

eval3alt :: (Monad m, Typeable m) => proxy m -> [(String, BDynamic)] -> Term3 -> Maybe BDynamic

Here’s a little example showing that side-effects do work (when I first did this, they didn’t, so it’s not quite trivial to get right):

λ> r <- newIORef (5::Int) λ> let intHole = hole3 $ T.typeOf (5::Int) λ> let plusLit = lit3 "+" . toBDynamic $ ((+) :: Int -> Int -> Int) λ> let plusTwo = fromJust $ (fromJust $ plusLit `ap3` intHole) `ap3` intHole λ> let pureInt = lit3 "pure" . toBDynamic $ (pure :: Int -> IO Int) λ> let plusTwoIO = fromJust $ pureInt `ap3` plusTwo λ> let intAndTimes = (lit3 "*2" . toBDynamic $ (modifyIORef r (*7) >> pure (7::Int))) :: Exp3 IO h λ> let eval = fromJust $ (fromBDynamic :: BDynamic -> Maybe (IO Int)) =<< eval3 [] =<< toTerm3 =<< bind3 [0,1] intAndTimes plusTwoIO λ> eval 14 λ> readIORef r 7 λ> eval 14 λ> readIORef r 49

Again, the rest of the code hasn’t changed much, but is included for completeness.

data Var3 h = Hole3 h | Named3 String | Bound3 Int instance Show (Var3 h) where show (Hole3 _) = "_" show (Named3 s) = s show (Bound3 i) = show i -- | A schema is an expression which may contain holes. type Schema3 m = Exp3 m () -- | A term is an expression with no holes. type Term3 m = Exp3 m Void -- | Convert a Schema into a Term if there are no holes. toTerm3 :: Schema3 m -> Maybe (Term3 m) toTerm3 (Lit3 s dyn) = Just (Lit3 s dyn) toTerm3 (Var3 ty v) = case v of Hole3 _ -> Nothing Named3 s -> Just (Var3 ty (Named3 s)) Bound3 i -> Just (Var3 ty (Bound3 i)) toTerm3 (Bind3 ty b e) = Bind3 ty <$> toTerm3 b <*> toTerm3 e toTerm3 (Let3 ty b e) = Let3 ty <$> toTerm3 b <*> toTerm3 e toTerm3 (Ap3 ty f e) = Ap3 ty <$> toTerm3 f <*> toTerm3 e -- | Get the type of an expression. typeOf3 :: Exp3 m h -> TypeRep typeOf3 (Lit3 _ dyn) = bdynTypeRep dyn typeOf3 (Var3 ty _) = ty typeOf3 (Bind3 ty _ _) = ty typeOf3 (Let3 ty _ _) = ty typeOf3 (Ap3 ty _ _) = ty -- | Get all the holes in an expression, identified by position. holes3 :: Schema3 m -> [(Int, TypeRep)] holes3 = fst . go 0 where go i (Var3 ty (Hole3 _)) = ([(i, ty)], i + 1) -- tag is ignored go i (Let3 _ b e) = let (bhs, i') = go i b (ehs, i'') = go i' e in (bhs ++ ehs, i'') go i (Ap3 _ f e) = let (fhs, i') = go i f (ehs, i'') = go i' e in (fhs ++ ehs, i'') go i _ = ([], i) -- | Get all the named variables in an expression. names3 :: Exp3 m h -> [(String, TypeRep)] names3 = nub . go where go (Var3 ty (Named3 s)) = [(s, ty)] go (Let3 _ b e) = go b ++ go e go (Ap3 _ f e) = go f ++ go e go _ = [] -- | Construct a literal value. lit3 :: String -> BDynamic -> Exp3 m h lit3 = Lit3 -- | Construct a typed hole. hole3 :: TypeRep -> Schema3 m hole3 ty = Var3 ty (Hole3 ()) -- | Perform a function application, if type-correct. ap3 :: Exp3 m h -> Exp3 m h -> Maybe (Exp3 m h) ap3 f e = (\ty -> Ap3 ty f e) <$> typeOf3 f `funResultTy` typeOf3 e -- | Give names to holes, if type-correct. name3 :: [(Int, String)] -> Schema3 m -> Maybe (Schema3 m) name3 is e0 = (\(e,_,_) -> e) <$> go [] 0 e0 where go env i n@(Var3 ty (Named3 s)) = case lookup s env of Just sty | ty == sty -> Just (n, env, i) | otherwise -> Nothing Nothing -> Just (n, env, i) go env i (Var3 ty (Hole3 h)) = case lookup i is of Just s -> case lookup s env of Just sty | ty == sty -> Just (Var3 ty (Named3 s), env, i + 1) | otherwise -> Nothing Nothing -> Just (Var3 ty (Named3 s), (s,ty):env, i + 1) Nothing -> Just (Var3 ty (Hole3 h), env, i + 1) go env i (Bind3 ty b e) = do -- a new case for Bind3, the same as the case for Let3 (b', env', i') <- go env i b (e', env'', i'') <- go env' i' e Just (Bind3 ty b' e', env'', i'') go env i (Let3 ty b e) = do (b', env', i') <- go env i b (e', env'', i'') <- go env' i' e Just (Let3 ty b' e', env'', i'') go env i (Ap3 ty f e) = do (f', env', i') <- go env i f (e', env'', i'') <- go env' i' e Just (Ap3 ty f' e', env'', i'') go env i e = Just (e, env, i) -- | From a schema that may have holes, generate a list of terms with named variables -- substituted instead. terms3 :: (TypeRep -> Char) -> Schema3 m -> [Term3 m] terms3 nf = mapMaybe toTerm3 . sortOn (Down . length . names3) . go where go e0 = case hs e0 of [] -> [e0] (chosen:_) -> concatMap go [ e | ps <- partitions chosen , let (((_,tyc):_):_) = ps , let vname i = if i == 0 then [nf tyc] else nf tyc : show i , let naming = concat $ zipWith (\i vs -> [(v, vname i) | (v,_) <- vs]) [0..] ps , e <- maybeToList (name3 naming e0) ] -- holes grouped by type hs = groupBy ((==) `on` snd) . sortOn snd . holes3 -- all partitions of a list partitions (x:xs) = [[x]:p | p <- partitions xs] ++ [(x:ys):yss | (ys:yss) <- partitions xs] partitions [] = [[]]

The representation so far is good, and lets us express everything we want, but it’s still not very friendly to use in one common case: polymorphic monadic functions.

There are many monadic operations of the type `Monad m => m a -> m ()`

: the actual type of the first argument is ignored. At the moment, dealing with such terms requires either specialising that `a`

to each concrete type used, or using something like `void`

and specialising *that*.

Implementing full-blown Haskell polymorphism would be a pain, but this is a small and irritating enough case that it’s worth dealing with.

Presenting (trumpets please), the “ignore” type:

-- | A special type for enabling basic polymorphism. -- -- A function parameter of type @m Ignore@ unifies with values of any type @m a@, where @fmap -- (const Ignore)@ is applied to the parameter automatically. This avoids the need to clutter -- expressions with calls to 'void', or some other such function. data Ignore = Ignore deriving (Bounded, Enum, Eq, Ord, Read, Show)

`Ignore`

is going to give us our limited polymorphism, by changing the typing rules for `ap3`

and evaluation rules for `Ap3`

slightly.

**Application**: function application is as normal, with the exception that if the formal parameter has type `m Ignore`

and the actual parameter has type `m a`

, for any `a`

, then the application also succeeds:

-- | Perform a function application, if type-correct. -- -- There is a special case, see the comment of the 'Ignore' type. ap3ig :: forall m h. (Applicative m, Typeable m) => Exp3 m h -> Exp3 m h -> Maybe (Exp3 m h) ap3ig f e = case (splitTyConApp (typeOf3 f), splitTyConApp (typeOf3 e)) of ((_, [fargTy,fresTy]), (etyCon, etyArgs)) -- check if the formal parameter is of type @m Ignore@ and the actual parameter is of type @m a@ | fargTy == ignoreTy && etyCon == mtyCon && not (null etyArgs) && mtyArgs == init etyArgs -> Just (Ap3 fresTy f e) -- otherwise try normal function application | otherwise -> (\ty -> Ap3 ty f e) <$> typeOf3 f `funResultTy` typeOf3 e _ -> Nothing where ignoreTy = typeOf (pure Ignore :: m Ignore) (mtyCon, mtyArgs) = splitTyConApp (typeRep (Proxy :: Proxy m))

**Evaluation**: evaluation of applications has an analogous case. When applying a function, the type of the formal parametr is checked and, if it’s `m Ignore`

, the argument gets `fmap (const Ignore)`

applied:

-- | Evaluate a term eval3ig :: forall m. (Monad m, Typeable m) => [(String, BDynamic)] -> Term3 m -> Maybe BDynamic eval3ig globals e0 | all check (names3 e0) = Just (go [] e0) | otherwise = Nothing where go locals (Bind3 ty b e) = case (unwrapFunctor :: BDynamic -> Maybe (m BDynamic)) (go locals b) of Just mdyn -> unsafeWrapFunctor ty $ mdyn >>= \dyn -> case unwrapFunctor (go (dyn:locals) e) of Just dyn -> dyn Nothing -> error "type error I can't deal with here!" Nothing -> error "type error I can't deal with here!" go locals (Let3 _ b e) = go (go locals b : locals) e go locals v@(Var3 _ _) = case env locals v of Just dyn -> dyn Nothing -> error "environment error I can't deal with here!" go locals (Ap3 _ f e) = let f' = go locals f e' = go locals e in case f' `bdynApply` (if hasIgnoreArg f' then ignore e' else e') of Just dyn -> dyn Nothing -> error "type error I can't deal with here!" go _ (Lit3 _ dyn) = dyn env locals (Var3 _ (Bound3 n)) | length locals > n = Just (locals !! n) | otherwise = Nothing env _ (Var3 ty (Named3 s)) = case lookup s globals of Just dyn | bdynTypeRep dyn == ty -> Just dyn _ -> Nothing env _ (Var3 _ (Hole3 v)) = absurd v hasIgnoreArg fdyn = let (_, [fargTy,_]) = splitTyConApp (bdynTypeRep fdyn) in fargTy == ignoreTy ignore dyn = case (unwrapFunctor :: BDynamic -> Maybe (m BDynamic)) dyn of Just ma -> unsafeToBDynamic ignoreTy (const Ignore <$> ma) Nothing -> error "non-monadic value I can't deal with here!" -- this is unreachable ignoreTy = typeOf (pure Ignore :: m Ignore) check (s, ty) = case lookup s globals of Just dyn -> bdynTypeRep dyn == ty Nothing -> False

The final piece of the puzzle is this:

-- | Convert an arbitrary value into a dynamic value, given its type. -- -- This is unsafe because if the type is incorrect and the value is later used as that type, good -- luck. unsafeToBDynamic :: TypeRep -> a -> BDynamic unsafeToBDynamic ty = BDynamic ty . unsafeCoerce

And a demo:

λ> r <- newIORef (5::Int) λ> let double = lit3 $ toBDynamic ((\x -> x >> x >> pure ()) :: IO Ignore -> IO ()) :: Exp3 IO h λ> let addOne = lit3 $ toBDynamic (modifyIORef r (+1)) :: Exp3 IO h λ> let addTwo = fromJust $ double `ap3ig` addOne λ> let eval = fromJust $ (fromBDynamic :: BDynamic -> Maybe (IO ())) =<< eval3ig [] =<< toTerm3 addTwo λ> eval λ> readIORef r 7 λ> eval λ> readIORef r 9]]>