Implementing Polymorphism

Date
Tags coco, haskell, programming, research, type checking
Target Audience Haskell programmers.
Epistemic Status I wrote this memo to work out how to implement CoCo. So this all works and is, mostly, still implemented like this.

Poly­morphism is great, it means that we don’t need to re­peat the same con­cep­tu­ally-­poly­morphic defin­i­tion at mul­tiple mono­morphic types (I’m looking at you, Go). It un­for­tu­nately means that we need some­thing a little smarter than equality for de­term­ining if a func­tion ap­plic­a­tion is well-­typed.

As with the other CoCo memos, this is a Lit­erate Haskell file. In the rendered out­put, high­lighted source is lit­erate source, non-high­lighted source is just com­ment­ary.

We’re going to re-use the Type­able ma­chinery as much as pos­sible, as im­ple­menting it all our­self is nasty.

{-# LAN­GUAGE Data­Kinds #-}
{-# LAN­GUAGE Poly­Kinds #-}

im­port Data.­Func­tion (on)
im­port Data.List (nub)
im­port Data.Maybe (from­Maybe, is­Just)
im­port Data.­Proxy (Proxy(..))
im­port Data.­Type­able

Type Vari­ables

Firstly, we need a rep­res­ent­a­tion of type vari­ables. We can’t use ac­tual type vari­ables, be­cause Type­able only works with mono­morphic types. So we’ll need to in­tro­duce some spe­cific types that we shall treat as vari­ables during type-check­ing. We can get an ar­bit­rary number of these by using a poly­morphic type:

data TyVar t = TyVar de­riving (Bounded, Enum, Eq, Ord, Read, Show)

For con­veni­ence, we shall also have four named type vari­ables:

type A = TyVar 0
type B = TyVar 1
type C = TyVar 2
type D = TyVar 3

If any type needs more than four dis­tinct type vari­ables, it can al­ways in­tro­duce its own.

Fi­nally, we can check if a type is a type vari­able:

isTypeVar :: Ty­peRep -> Bool
isTypeVar = ((==) `on` (fst . splitTy­ConApp)) (ty­peRep (Proxy :: Proxy A))

Typing Func­tion Ap­plic­a­tions

The Data.­Type­able module has a handy little func­tion for checking that a func­tion ap­plic­a­tion is well-­typed:

fun­Res­ultTy :: Ty­peRep -> Ty­peRep -> Maybe Ty­peRep

which takes the func­tion type, the ar­gu­ment type, and re­turns the result type if the ap­plic­a­tion is well-­typed. We want some­thing sim­ilar, but sup­porting our type vari­ables. We don’t just want equal­ity, want uni­fic­a­tion. We want a func­tion like so:

-- | Ap­plies a type to a func­tion type.  If type-­cor­rect, re­turns an en­vir­on­ment binding type
-- vari­ables to types, and the result type (with bind­ings ap­plied).
poly­Fun­Res­ultTy :: Ty­peRep -> Ty­peRep -> Maybe (TypeEnv, Ty­peRep)
poly­Fun­Res­ultTy fty aty = do
  -- get the func­tion and ar­gu­ment types
  let funTyCon = ty­peR­epTyCon (ty­peRep (Proxy :: Proxy (() -> ())))
  (argTy, res­ultTy) <- case splitTy­ConApp fty of
    (con, [argTy, res­ultTy]) | con == funTyCon -> Just (argTy, res­ultTy)
    _ -> Nothing
  -- pro­duce a (possibly empty) type en­vir­on­ment
  env <- unify aty argTy
  -- apply the type en­vir­on­ment to the result
  pure (env, ap­ply­Bind­ings env res­ultTy)

Type En­vir­on­ments: Firstly, let’s define what a type en­vir­on­ment ac­tu­ally is. We’ll keep it very sim­ple: just a map from types to types:

-- | An en­vir­on­ment of type bind­ings.
type TypeEnv = [(Ty­peRep, Ty­peRep)]

The first in every tuple will be a TyVar, but there isn’t really a good way to stat­ic­ally en­force that.

Now that we have type en­vir­on­ments, let’s apply them to a type. This need not make a type fully mono­morphic, the bind­ings may not cover every type vari­able used, or may define some type vari­ables in terms of others (as long as a vari­able isn’t defined in terms of it­self). This is fine.

-- | Apply type en­vir­on­ment bind­ings to a type.
ap­ply­Bind­ings :: TypeEnv -> Ty­peRep -> Ty­peRep
ap­ply­Bind­ings env = go where
  go ty
    -- look up type vari­ables in the en­vir­on­ment, but fall-­back to the naked vari­able if not found
    | isTypeVar ty = from­Maybe ty (lookup ty env)
    -- oth­er­wise con­tinue re­curs­ively through type con­structors
    | oth­er­wise = let (con, args) = splitTy­ConApp ty in mk­Ty­ConApp con (map go args)

Uni­fic­a­tion: For reasons that will be­come ap­parent later, we’re going to define a few vari­ants of this.

Firstly, our standard uni­fic­a­tion func­tion. It’ll take two types (the order doesn’t mat­ter) and at­tempt to unify them. Two types unify if:

  1. They’re struc­tur­ally equal; OR
  2. At least one is a type vari­able; OR
  3. They have the same con­structor with the same number of ar­gu­ments, and all the ar­gu­ments unify, with com­pat­ible en­vir­on­ments.

Hey, that sounds like a re­cursive func­tion!

-- | At­tempt to unify two types.
unify :: Ty­peRep -> Ty­peRep -> Maybe [(Ty­peRep, Ty­peRep)]
unify = unify' True

This next func­tion is the ac­tual work­horse. It im­ple­ments the re­cursive de­cision pro­cedure de­scribed above and con­structs the en­vir­on­ment. It takes a flag to de­termine if uni­fying with a naked type vari­able is al­lowed here. It al­ways is in the re­cursive case. This will turn out to be useful in the next sec­tion, when we’re talking about poly­morphic uses of the state type.

-- | At­tempt to unify two types.
unify'
  :: Bool
  -- ^ Whether to allow either type to be a naked type vari­able at this level (al­ways true in
  -- lower levels).
  -> Ty­peRep -> Ty­peRep -> Maybe TypeEnv
unify' b tyA tyB
    -- check equality
    | tyA == tyB = Just []
    -- check if one is a naked type vari­able
    | isTypeVar tyA = if not b || oc­curs tyA tyB then Nothing else Just [(tyA, tyB)]
    | isTypeVar tyB = if not b || oc­curs tyB tyA then Nothing else Just [(tyB, tyA)]
    -- de­con­struct each and at­tempt to unify sub­com­pon­ents
    | oth­er­wise =
      let (conA, argsA) = splitTy­ConApp tyA
          (conB, argsB) = splitTy­ConApp tyB
      in if conA == conB && length argsA == length argsB
         then uni­fy­Accum True id argsA argsB
         else Nothing
  where
    -- check if a type oc­curs in an­other
    oc­curs needle hay­stack = needle == hay­stack || any (oc­curs needle) (snd (splitTy­ConApp hay­stack))

The re­cursive listy case is handled by this uni­fy­Accum func­tion, which is mu­tu­ally re­cursive with unify':

-- | An ac­cu­mu­lating unify: at­tempts to unify two lists of types pair­wise and checks that the
-- res­ulting as­sign­ments do not con­flict with the cur­rent type en­vir­on­ment.
uni­fy­Accum :: Bool -> (Maybe TypeEnv -> Maybe TypeEnv) -> [Ty­peRep] -> [Ty­peRep] -> Maybe TypeEnv
uni­fy­Accum b f as bs = foldr go (Just []) (zip as bs) where
  go (tyA, tyB) (Just env) =
    uni­fy­TypeEnvs b env =<< f (unify' b tyA tyB)
  go _ Nothing = Nothing

The final piece of the uni­fic­a­tion puzzle is how to com­bine type en­vir­on­ments. This is ne­ces­sary to be able to unify types like T A A and T Int B. One op­tion is to en­force equality of bind­ings, but that is too re­strictive (it won’t work in the T ex­ample, as Int is not B, yet both do uni­fy). The cor­rect solu­tion is to unify the bind­ings. This is yet an­other mu­tu­ally re­cursive func­tion:

-- | Unify two type en­vir­on­ments, if pos­sible.
uni­fy­TypeEnvs :: Bool -> TypeEnv -> TypeEnv -> Maybe TypeEnv
uni­fy­TypeEnvs b env1 env2 = foldr go (Just []) (nub $ map fst env1 ++ map fst env2) where
  go tyvar acc@(Just env) = case (lookup tyvar env, lookup tyvar env1, lookup tyvar env2) of
    (_, Just ty1, Just ty2) -> uni­fy­TypeEnvs b env . ((tyvar, ty1):) =<< unify' b ty1 ty2
    (x, Just ty1, _)
      | is­Just x  -> uni­fy­TypeEnvs b env [(tyvar, ty1)]
      | oth­er­wise -> Just ((tyvar, ty1):env)
    (x, _, Just ty2)
      | is­Just x  -> uni­fy­TypeEnvs b env [(tyvar, ty2)]
      | oth­er­wise -> Just ((tyvar, ty2):env)
    _ -> acc
  go _ Nothing = Nothing

And now, an ex­ample:

λ> data T a b = T
λ> unify (typeOf (un­defined :: T A A)) (typeOf (un­defined :: T Int B))
Just [(TyVar Nat 0,TyVar Nat 1),(TyVar Nat 1,Int)]

λ> let funTy = typeOf (un­defined :: A -> B -> Bool -> Either A B)
λ> poly­Fun­Res­ultTy funTy (typeOf (un­defined::Int))
Just ([(TyVar Nat 0,Int)],TyVar Nat 1 -> Bool -> Either Int (TyVar Nat 1))

Mono­morph­ising the State Type

In a CoCo sig­na­ture, the state type is mono­morphic, but it can be nice to treat it as poly­morphic for two reas­ons:

  1. Needing to change the types in­side the sig­na­ture be­cause the type of the sig­na­ture changed is a pain.
  2. It helps avoid re­pe­ti­tion.

Here are a couple of func­tion types that may ap­pear in a sig­na­ture:

MVar Con­cur­rency Int ->        Con­cur­rency Int
MVar Con­cur­rency Int -> Int -> Con­cur­rency ()

If the state type is MVar Con­cur­rency Int, then (1) is saying that if we change it to MVar Con­cur­rency Bool we now need to change those two types above, and (2) is saying that we’re need­lessly re­peating the Int. It would be much nicer to have these types in the sig­na­ture:

MVar Con­cur­rency A ->      Con­cur­rency A
MVar Con­cur­rency A -> A -> Con­cur­rency ()

Now that we have im­ple­mented type uni­fic­a­tion, we can do this! For a func­tion type in the sig­na­ture:

  1. Try to unify every ar­gu­ment, ex­cluding naked type vari­ables, against the state type.
  2. Check that the en­vir­on­ments are com­pat­ible.
  3. Apply the com­bined en­vir­on­ment to the func­tion type.

This ef­fect of this is to mono­morphise poly­morphic uses of the state type. This is good be­cause the state type often de­term­ines other ar­gu­ment types, and once we know the con­crete types of func­tion ar­gu­ments we can infer what hole types we need. If func­tions have totally poly­morphic types, hole in­fer­ence doesn’t work so well.

-- | Mono­morphise poly­morphic uses of the state type in a func­tion type.
mono­morphise :: Type­able s
  => Proxy s -- ^ The state type.
  -> Ty­peRep -- ^ The func­tion type.
  -> Ty­peRep
mono­morphise s ty0 = from­Maybe ty0 $ do
  let stateTy = ty­peRep s
  ar­gTys <- fun­Ar­gTys ty0
  env    <- uni­fy­Accum False (maybe (Just []) Just) (re­peat stateTy) ar­gTys
  pure (ap­ply­Bind­ings env ty0)

Now we see the pur­pose of the boolean ar­gu­ment to unify'. If we have the type A -> MVar Con­cur­rency A -> Con­cur­rency (), we don’t want to unify MVar Con­cur­rency Int with A, we want to skip that over! More gen­er­ally, we want to avoid uni­fying a fully-­poly­morphic type with our state type but, as all our type vari­ables have kind *, just pre­venting the top-­level uni­fic­a­tion suf­fices.

Oh, we’ll also need this helper func­tion to get all the ar­gu­ment types of a func­tion:

-- | Get all of a func­tion's ar­gu­ment types.  Re­turns @Noth­ing@ if not a func­tion type.
fun­Ar­gTys :: Ty­peRep -> Maybe [Ty­peRep]
fun­Ar­gTys ty = case splitTy­ConApp ty of
    (con, [argTy, res­ultTy]) | con == funTyCon -> Just $
      argTy : from­Maybe [] (fun­Ar­gTys res­ultTy)
    _ -> Nothing
  where
    funTyCon = ty­peR­epTyCon (ty­peRep (Proxy :: Proxy (() -> ())))

And we’re done!

λ> let s = Proxy :: Proxy (Either Int Bool)
λ> let f = typeOf (un­defined :: A -> Either A B -> B)
λ> mono­morphise s f
Int -> Either Int Bool -> Bool