Generating Typed Expressions

Rep­res­ent­a­tion & Eval­u­ation of Typed Ex­pres­sions: Episode II - At­tack of the Terms

This memo is about ex­haust­ively gen­er­ating schemas. Let’s go!

{-# LAN­GUAGE Kind­Sig­na­tures #-}
{-# LAN­GUAGE Scoped­TypeVari­ables #-}
{-# LAN­GUAGE Stan­dalone­De­riving #-}

im­port Con­trol.­Monad (fil­terM)
im­port Data.­Func­tion (on)
im­port Data.IntMap (IntMap)
im­port qual­i­fied Data.IntMap as M
im­port Data.Maybe (may­beToList)
im­port Data.­Proxy (Proxy(..))
im­port Data.Semig­roup (Semig­roup, (<>))
im­port Data.Set (Set)
im­port qual­i­fied Data.Set as S
im­port Data.­Type­able (Type­able, Ty­peRep, fun­Res­ultTy, splitTy­ConApp, ty­peRep)
im­port GH­C.Exts (Any)
im­port Un­safe.­Co­erce (un­sa­fe­Co­erce)

Aside: Ex­pres­sion Size

To con­cretely tie down what we’re do­ing, we’re going to gen­erate ex­pres­sions in size or­der. Ex­pres­sion size here cor­res­ponds roughly to the number of nodes in the tree:

sizeOf :: Exp m h -> Int
sizeOf (Lit _ _) = 1
sizeOf (Var _ _) = 1
sizeOf (Bind _ b e) = 1 + sizeOf b + sizeOf e
sizeOf (Let  _ b e) = 1 + sizeOf b + sizeOf e
sizeOf (Ap   _ f e) =     sizeOf f + sizeOf e

Aside: Ex­pres­sion Order

In order to avoid du­plic­ates, we’re going to want sets of ex­pres­sions. We’re going to cheat a little be­cause we can’t ac­tu­ally com­pare BDy­namic val­ues, so we’ll just com­pare their types, and hope that that plus the string rep­res­ent­a­tions in Lit will be enough to dis­am­big­u­ate.

in­stance Eq BDy­namic  where (==) = (==) `on` bdyn­Ty­peRep
in­stance Ord BDy­namic where com­pare = com­pare `on` bdyn­Ty­peRep

de­riving in­stance Eq  h => Eq  (Var h)
de­riving in­stance Ord h => Ord (Var h)

de­riving in­stance Eq  h => Eq  (Exp m h)
de­riving in­stance Ord h => Ord (Exp m h)

Mark 1: Gen­er­ating Terms

We’re going to keep quite a simple in­ter­face for our schema gen­er­ator, we shall have:

  1. A type of gen­er­at­ors, which is a map from size to gen­er­ated schemas of that size.
  2. A func­tion to get all the schemas of a size.
  3. A func­tion to make a new gen­er­ator from a col­lec­tion of “prim­it­ive” schemas.
  4. A func­tion to gen­erate a size, as­suming all smaller sizes have been gen­er­ated.
-- | A gen­er­ator of schemas, in size or­der.
new­type Gen­er­ator1 m = Gen­er­ator1 { tiers1 :: IntMap (Set (Schema m)) }

-- | Get all schemas of the given size, if gen­er­ated.
schemas1 :: Gen­er­ator1 m -> Int -> Set (Schema m)
schemas1 g i = M.find­With­De­fault (S.empty) i (tiers1 g)

Cre­ation: to make a new gen­er­ator, we’ll just plug the provided schemas into the ap­pro­priate tiers.

-- | Create a new gen­er­ator from a set of ini­tial schemas.
cre­ate1 :: [Schema m] -> Gen­er­ator1 m
cre­ate1 ini­tial = Gen­er­ator1 $ M.uni­on­sWith S.union
  [M.singleton (sizeOf s) (S.singleton s) | s <- ini­tial]

Gen­er­a­tion: now we see the be­nefit of all the smart Maybe-re­turning con­struct­ors: we can do the in­cred­ibly naive thing of just trying all cor­rectly-s­ized com­bin­a­tions of already-­known schemas. The ones which re­turn a Just are good and shall be kept.

-- | Gen­erate schemas of the given size, as­suming all smaller tiers have been gen­er­ated.
gen­er­ate1 :: (Ap­plic­ative m, Type­able m) => Int -> Gen­er­ator1 m -> Gen­er­ator1 m
gen­er­ate1 i g = Gen­er­ator1 $ M.uni­on­sWith S.union
    [ tiers1 g
    , M.singleton i aps
    , M.singleton i binds
    , M.singleton i lets
    ]
  where
    -- sizeOf (ap f e) = 0 + sizeOf f + sizeOf e
    aps = makeTerms 0 $ \terms can­did­ates ->
      [ new | f <- terms
            , e <- can­did­ates
            , new <- may­beToList (ap f e)
      ]
    -- sizeOf (bind is b e) = 1 + sizeOf b + sizeOf e
    binds = makeTerms 1 $ \terms can­did­ates ->
      [ new | b <- terms
            , e <- can­did­ates
            , holeset <- powerset . map fst $ holes e
            , new <- may­beToList (bind holeset b e)
      ]
    -- sizeOf (let_ is b e) = 1 + sizeOf b + sizeOf e
    lets = makeTerms 1 $ \terms can­did­ates ->
      [ new | b <- terms
            , e <- can­did­ates
            , holeset <- powerset . map fst $ holes e
            , new <- may­beToList (let_ holeset b e)
      ]

    makeTerms n f = M.foldMap­WithKey go (tiers1 g) where
      go tier terms = S.from­List $
        let can­did­ates = schemas1 g (i - tier - n)
        in f (S.toList terms) (S.toList can­did­ates)

    powerset = fil­terM (const [False,True])

Here’s a small demo:

demo1 :: Gen­er­ator1 IO
demo1 = cre­ate1
  [ hole $ ty­peRep (Proxy :: Proxy Int)
  , lit "0"    . toB­Dy­namic $ (0 :: Int)
  , lit "1"    . toB­Dy­namic $ (1 :: Int)
  , lit "+"    . toB­Dy­namic $ ((+) :: Int -> Int -> Int)
  , lit "*"    . toB­Dy­namic $ ((*) :: Int -> Int -> Int)
  , lit "pure" . toB­Dy­namic $ (pure :: Int -> IO Int)
  ]

λ> let upto n = mapM_ print . S.toList $ schemas1 (foldl (flip gen­er­ate1) demo1 [1..n]) n

λ> upto 1
*
+
0
1
pure
(_ :: Int)

λ> upto 2
(pure) (0)
(pure) (1)
(pure) ((_ :: Int))
(*) (0)
(*) (1)
(*) ((_ :: Int))
(+) (0)
(+) (1)
(+) ((_ :: Int))

λ> upto 3
let <*> in <0>
let <*> in <1>
let <*> in <(_ :: Int)>
let <+> in <0>
let <+> in <1>
let <+> in <(_ :: Int)>
let <0> in <0>
let <0> in <1>
let <0> in <(_ :: Int)>
let <0> in <(b0 :: Int)>
let <1> in <0>
let <1> in <1>
let <1> in <(_ :: Int)>
let <1> in <(b0 :: Int)>
let <pure> in <0>
let <pure> in <1>
let <pure> in <(_ :: Int)>
let <(_ :: Int)> in <0>
let <(_ :: Int)> in <1>
let <(_ :: Int)> in <(_ :: Int)>
let <(_ :: Int)> in <(b0 :: Int)>
let <*> in <*>
let <*> in <+>
let <+> in <*>
let <+> in <+>
let <0> in <*>
let <0> in <+>
let <1> in <*>
let <1> in <+>
let <pure> in <*>
let <pure> in <+>
let <(_ :: Int)> in <*>
let <(_ :: Int)> in <+>
let <*> in <pure>
let <+> in <pure>
let <0> in <pure>
let <1> in <pure>
let <pure> in <pure>
let <(_ :: Int)> in <pure>
((*) (0)) (0)
((*) (0)) (1)
((*) (0)) ((_ :: Int))
((*) (1)) (0)
((*) (1)) (1)
((*) (1)) ((_ :: Int))
((*) ((_ :: In­t))) (0)
((*) ((_ :: In­t))) (1)
((*) ((_ :: In­t))) ((_ :: Int))
((+) (0)) (0)
((+) (0)) (1)
((+) (0)) ((_ :: Int))
((+) (1)) (0)
((+) (1)) (1)
((+) (1)) ((_ :: Int))
((+) ((_ :: In­t))) (0)
((+) ((_ :: In­t))) (1)
((+) ((_ :: In­t))) ((_ :: Int))

Mark 2: An­nota­tions & Pruning

The cur­rent gen­er­ator is nice and sim­ple, but pro­duces some un­in­ter­esting terms:

  • Let-­bind­ings where the body has no holes bound: let <1> in <pure>
  • Let- and mon­adic- bind­ings where the binder is a hole: let <(_ :: Int)> in <(b0 :: Int)>

Note that we do want to keep mon­adic bind­ings where the body has no holes bound, as the mon­adic bind may cause an in­ter­esting ef­fect.

Fur­ther­more, we may want to store ad­di­tional in­form­a­tion with the gen­er­ated schemas, which we can use to prune gen­er­a­tion fur­ther, and re­cord in­form­a­tion for fur­ther use.

-- | A gen­er­ator of schemas with metadata, in size or­der.
new­type Gen­er­ator2 m ann = Gen­er­ator2 { tiers2 :: IntMap (Set (Schema m, ann)) }

Gen­er­a­tion is now a bit more in­volved:

-- | Gen­erate schemas of the given size, as­suming all smaller tiers have been gen­er­ated.
gen­er­ate2 :: (Ap­plic­ative m, Type­able m, Semig­roup ann, Ord ann)
  => (ann -> ann -> Schema m -> Bool)
  -- ^ A pre­dicate to filter gen­er­ated schemas.
  -> Int
  -> Gen­er­ator2 m ann
  -> Gen­er­ator2 m ann
gen­er­ate2 annp i g = Gen­er­ator2 $ M.uni­on­sWith S.union
    [ tiers2 g
    , M.singleton i aps
    , M.singleton i binds
    , M.singleton i lets
    ]
  where
    aps = makeTerms 0 $ \terms can­did­ates ->
      [ (new, fAnn <> eAnn) -- pro­duce a new an­nota­tion by com­bining the old
        | (f, fAnn) <- terms
        , (e, eAnn) <- can­did­ates
        , new <- may­beToList (ap f e)
        -- check the new ex­pres­sion and old an­nota­tions against the pre­dicate
        , annp fAnn eAnn new
      ]

    binds = makeTerms 1 $ \terms can­did­ates ->
      [ (new, bAnn <> eAnn) -- pro­duce a new an­nota­tion by com­bining the old
        | (b, bAnn) <- terms
        -- don't allow a binder which is a hole
        , case b of Var _ (Hole _) -> False; _ -> True
        , (e, eAnn) <- can­did­ates
        , holeset <- powerset . map fst $ holes e
        , new <- may­beToList (bind holeset b e)
        -- check the new ex­pres­sion and old an­nota­tions against the pre­dicate
        , annp bAnn eAnn new
      ]

    lets = makeTerms 1 $ \terms can­did­ates ->
      [ (new, bAnn <> eAnn) -- pro­duce a new an­nota­tion by com­bining the old
        | (b, bAnn) <- terms
        -- don't allow a binder which is a hole
        , case b of Var _ (Hole _) -> False; _ -> True
        , (e, eAnn) <- can­did­ates
        , holeset <- powerset . map fst $ holes e
        -- don't allow an empty holeset
        , not (null holeset)
        , new <- may­beToList (let_ holeset b e)
        -- check the new ex­pres­sion and old an­nota­tions against the pre­dicate
        , annp bAnn eAnn new
      ]

    makeTerms n f = M.foldMap­WithKey go (tiers2 g) where
      go tier terms = S.from­List $
        let can­did­ates = schemas2 g (i - tier - n)
        in f (S.toList terms) (S.toList can­did­ates)

    powerset = fil­terM (const [False,True])

The schemas and create code are ba­sic­ally the same:

-- | Get all schemas of the given size, if gen­er­ated.
schemas2 :: Gen­er­ator2 m ann -> Int -> Set (Schema m, ann)
schemas2 g i = M.find­With­De­fault (S.empty) i (tiers2 g)

-- | Create a new gen­er­ator from a set of ini­tial schemas.
cre­ate2 :: Ord ann => [(Schema m, ann)] -> Gen­er­ator2 m ann
cre­ate2 ini­tial = Gen­er­ator2 $ M.uni­on­sWith S.union
  [M.singleton (sizeOf e) (S.singleton s) | s@(e,_) <- ini­tial]

Our demo now looks much bet­ter:

demo2 :: Gen­er­ator2 IO ()
demo2 = cre­ate2 $ map (\e -> (e, ()))
  [ hole $ ty­peRep (Proxy :: Proxy Int)
  , lit "0"    . toB­Dy­namic $ (0 :: Int)
  , lit "1"    . toB­Dy­namic $ (1 :: Int)
  , lit "+"    . toB­Dy­namic $ ((+) :: Int -> Int -> Int)
  , lit "*"    . toB­Dy­namic $ ((*) :: Int -> Int -> Int)
  , lit "pure" . toB­Dy­namic $ (pure :: Int -> IO Int)
  ]

λ> let upto n = mapM_ print . S.toList $ schemas2 (foldl (flip $ gen­er­ate2 \_ _ _ -> True) demo2 [1..n]) n

λ> upto 1
((*,())
(+,())
(0,())
(1,())
(pure,())
((_ :: In­t),())

λ> upto 2
((pure) (0),())
((pure) (1),())
((pure) ((_ :: In­t)),())
((*) (0),())
((*) (1),())
((*) ((_ :: In­t)),())
((+) (0),())
((+) (1),())
((+) ((_ :: In­t)),())

λ> upto 3
(let <0> in <(b0 :: In­t)>,())
(let <1> in <(b0 :: In­t)>,())
(((*) (0)) (0),())
(((*) (0)) (1),())
(((*) (0)) ((_ :: In­t)),())
(((*) (1)) (0),())
(((*) (1)) (1),())
(((*) (1)) ((_ :: In­t)),())
(((*) ((_ :: In­t))) (0),())
(((*) ((_ :: In­t))) (1),())
(((*) ((_ :: In­t))) ((_ :: In­t)),())
(((+) (0)) (0),())
(((+) (0)) (1),())
(((+) (0)) ((_ :: In­t)),())
(((+) (1)) (0),())
(((+) (1)) (1),())
(((+) (1)) ((_ :: In­t)),())
(((+) ((_ :: In­t))) (0),())
(((+) ((_ :: In­t))) (1),())
(((+) ((_ :: In­t))) ((_ :: In­t)),())

Ap­pendix: Ex­pres­sions

The Mark 3-ig ex­pres­sion types and smart con­struct­ors:

data BDy­namic = BDy­namic { bdyn­Ty­peRep :: Ty­peRep, bdynAny :: Any }

toB­Dy­namic :: forall a. Type­able a => a -> BDy­namic
toB­Dy­namic a = BDy­namic (ty­peRep (Proxy :: Proxy a)) (un­sa­fe­Co­erce a)

data Exp (m :: * -> *) (h :: *)
  = Lit String BDy­namic
  | Var Ty­peRep (Var h)
  | Bind Ty­peRep (Exp m h) (Exp m h)
  | Let  Ty­peRep (Exp m h) (Exp m h)
  | Ap   Ty­peRep (Exp m h) (Exp m h)

in­stance Show (Exp m h) where
  show (Lit s _) = s
  show (Var ty v) = "(" ++ show v ++ " :: " ++ show ty ++ ")"
  show (Bind _ b e) = "bind <" ++ show b ++ "> in <" ++ show e ++ ">"
  show (Let  _ b e) = "let <" ++ show b ++ "> in <" ++ show e ++ ">"
  show (Ap   _ f e) = "(" ++ show f ++ ") (" ++ show e ++ ")"

data Var h
  = Hole  h
  | Named String
  | Bound Int

in­stance Show (Var h) where
  show (Hole  _) = "_"
  show (Named s) = s
  show (Bound i) = 'b' : show i

type Schema m = Exp m ()

data Ig­nore = Ig­nore de­riving (Bounded, Enum, Eq, Ord, Read, Show)

typeOf :: Exp m h -> Ty­peRep
typeOf (Lit  _ dyn)  = bdyn­Ty­peRep dyn
typeOf (Var  ty _)   = ty
typeOf (Bind ty _ _) = ty
typeOf (Let  ty _ _) = ty
typeOf (Ap   ty _ _) = ty

lit :: String -> BDy­namic -> Exp m h
lit = Lit

hole :: Ty­peRep -> Schema m
hole ty = Var ty (Hole ())

let_ :: [Int] -> Schema m -> Schema m -> Maybe (Schema m)
let_ is b e0 = Let (typeOf e0) b <$> let­Or­Bind is (typeOf b) e0

bind :: forall m. Type­able m => [Int] -> Schema m -> Schema m -> Maybe (Schema m)
bind is b e0 = case (splitTy­ConApp (typeOf b), splitTy­ConApp (typeOf e0)) of
    ((btyCon, bt­yArgs), (etyCon, et­yArgs))
      | btyCon == mtyCon && btyCon == etyCon && not (null bt­yArgs) && not (null et­yArgs) && mt­yArgs == init bt­yArgs && init bt­yArgs == init et­yArgs
        -> Bind (typeOf e0) b <$> let­Or­Bind is (last bt­yArgs) e0
    _ -> Nothing
  where
    (mtyCon, mt­yArgs) = splitTy­ConApp (ty­peRep (Proxy :: Proxy m))

ap :: forall m h. (Ap­plic­ative m, Type­able m) => Exp m h -> Exp m h -> Maybe (Exp m h)
ap f e = case (splitTy­ConApp (typeOf f), splitTy­ConApp (typeOf e)) of
    ((_, [fargTy,fresTy]), (etyCon, et­yArgs))
      | fargTy == ig­noreTy && etyCon == mtyCon && not (null et­yArgs) && mt­yArgs == init et­yArgs -> Just (Ap fresTy f e)
      | oth­er­wise -> (\ty -> Ap ty f e) <$> typeOf f `fun­Res­ultTy` typeOf e
    _ -> Nothing
  where
    ig­noreTy = ty­peRep (Proxy :: Proxy (m Ig­nore))
    (mtyCon, mt­yArgs) = splitTy­ConApp (ty­peRep (Proxy :: Proxy m))

let­Or­Bind :: [Int] -> Ty­peRep -> Exp m h -> Maybe (Exp m h)
let­Or­Bind is boundTy e0 = fst <$> go 0 0 e0 where
  go n i (Var ty (Hole h))
    | i `elem` is = if boundTy == ty then Just (Var ty (Bound n), i + 1) else Nothing
    | oth­er­wise   = Just (Var ty (Hole h), i + 1)
  go n i (Bind ty b e) = do
    (b', i')  <- go n     i  b
    (e', i'') <- go (n+1) i' e
    Just (Bind ty b' e', i'')
  go n i (Let ty b e) = do
    (b', i')  <- go n     i  b
    (e', i'') <- go (n+1) i' e
    Just (Let ty b' e', i'')
  go n i (Ap ty f e) = do
    (f', i')  <- go n i  f
    (e', i'') <- go n i' e
    Just (Ap ty f' e', i'')
  go _ i e = Just (e, i)

holes :: Schema m -> [(Int, Ty­peRep)]
holes = fst . go 0 where
  go i (Var ty (Hole _)) = ([(i, ty)], i + 1)
  go i (Let _ b e) =
    let (bhs, i')  = go i  b
        (ehs, i'') = go i' e
    in (bhs ++ ehs, i'')
  go i (Ap _ f e) =
    let (fhs, i')  = go i  f
        (ehs, i'') = go i' e
    in (fhs ++ ehs, i'')
  go i _ = ([], i)
Date
Tags
coco, code generation, haskell, programming, research
Target Audience
Haskell programmers.
Epistemic Status
I wrote this memo to work out how to implement CoCo. So this all works and is, mostly, still implemented like this.