Weakening constraints on rank-2 types

Question

{-# LANGUAGE RankNTypes #-}

Continuing from a previous series of questions, I have a function with a universally quantified function as argument, like so:

emap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b

Which can be used just fine for functions that need no additional constraints, such as:

postmap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b
postmap f = f . emap (postmap f)

reduce = postmap step

However, I now want to use this function with a function that takes an additional constraint, but the following does not type-check.

substitute :: Pack a => Identifier -> a -> Expression a -> Expression a
substitute i v (Var x) | x == i = pack v
substitute _ _ x = x


bind :: Pack a => Identifier -> a -> Expression a -> Expression a
bind ident value = postmap (subsitute ident value)

So it seems that I somehow need to 'weaken' or 'specialize' the forall a constraint to a forall a. Pack a constraint. It seems unnecessary to add the constraint to the signature of emap itself, but I can't see any other approach to solving this.

I'm not sure how to work around this, but I can see a few options available.

Add the constraint around the signature for bind such that it can be happily type-checked.
Create another function emap' with the constraint, and implement emap' in terms of emap or vice-versa (emap is a boring mechanical definition and it doesn't make sense to have two identical function bodies differing only by signature).
Recognise what I'm trying to do as a code smell and proceed with an alternative solution. (???)

Can you show the code for the `substitute` function? There might be a much easier solution. — bennofs, Aug 20 '14 at 13:03

score 5 · Answer 1 · answered Aug 20 '14 at 09:19

Suppose you rewrote everything such that type T = forall a . Expression a -> Expression a and postmap' :: T -> T. I think it should be clear that the type of postmap is the same. (You can try >:t [postmap', postmap]). Given that substitute :: Pack a => Identifier -> a -> Expression a -> Expression a, consider your definition of bind: you pass an argument to postmap which expects a value of type T; but you have given it a value of type Expression a -> Expression a. Be careful! These types are not the same, since in the latter case, the forall a is somewhere to the left; postmap is expecting a function where it can pick the a, but it has been given a function were the a is already chosen by someone else. Suppose your example were to typecheck, and then you call bind with a ~ (). The type of the function given to postmap must be Expression () -> Expression (), but this is clearly nonsense.

If I am not very clear, consider the "working" version:

type T = forall a . Expression a -> Expression a 

postmap :: T -> T 
postmap = undefined

-- These two substitutes have different types!
substitute :: Pack a => Identifier -> a -> Expression a -> Expression a
substitute = undefined

substitute' :: Pack a => Identifier -> a -> T
substitute' = undefined

-- Doesn't compile 
bind :: Pack a => Identifier -> a -> Expression a -> Expression a
bind ident value = postmap (substitute ident value)

-- Does compile!
bind' :: Pack a => Identifier -> a -> T
bind' ident value = postmap (substitute' ident value)

The important thing to note: whether you define as above or type T = forall a . Pack a => ..., the error is the same. So your problem is at a different point than you think it is.

The easy way to debug these kinds of problems is to use

newtype T = T (forall a . Expression a -> Expression a)

the errors are often much clearer (though not in this case). I apologize that I can't give a real solution, as I don't know exactly what these functions are actually doing. But I suspect #3 is your answer.

Thanks, although this isn't a 'solution' to the problem, it gave me a much better understanding of what my problem is! — guhou, Aug 21 '14 at 04:28

score 3 · Answer 2 · edited May 23 '17 at 11:43

To start with, let's flesh out your example with enough code to work with. LiteralToken is just made up to represent whatever you are replacing a variable with.

{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}

type Identifier = String

data LiteralToken = LiteralToken
    deriving Show

class Pack a where
    pack :: a -> LiteralToken

instance Pack Int where
    pack = const LiteralToken

data Expression a where
    Var :: Identifier -> Expression a
    Lit :: LiteralToken -> Expression a
    Tuple :: Expression a -> Expression b -> Expression (a, b)

deriving instance Show (Expression a)

emap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b
emap f e = case e of
    Tuple a b -> Tuple (f a) (f b)
    otherwise -> e

postmap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b
postmap f = f . emap (postmap f)

bind :: Pack a => Identifier -> a -> Expression b -> Expression b
bind ident value = postmap (substitute ident value)

Now, we can implement substitute for an Identifier. To be used in emap or postmap, substitute needs to be able to operate on all parts of the Expression tree, regardless of whether it's the same type as the variable being substituted for. That's why the type variable for the Expressions is different than the type variable for the value (this is also what user2407038 explains).

substitute :: Pack a => Identifier -> a -> Expression b -> Expression b
substitute ident value e = case e of
    Var id | id == ident -> Lit (pack value)
    otherwise            -> e

Now we can try some examples:

example1 :: Expression (Int, Bool)
example1 = Tuple (Var "x") (Var "y")

print . bind "x" (7 :: Int) $ example1 
Tuple (Lit LiteralToken) (Var "y")

If we have the wrong type for the identifier, we don't get what we want, it's substituted anyway. That's because substitute had no way to check the type of the variable it was substituting for. In the following example, both Var "x"s are replaced with the literal that was made for an Int, even though the second variable was for a Bool.

example2 :: Expression (Int, Bool)
example2 = Tuple (Var "x") (Var "x")

print . bind "x" (7 :: Int) $ example2
Tuple (Lit LiteralToken) (Lit LiteralToken)

Type safety

If we add a type variable to LiteralToken so that it somehow depends on the type of what it represents

data LiteralToken a = LiteralToken
    deriving Show

class Pack a where
    pack :: a -> LiteralToken a

data Expression a where
    Var :: Identifier -> Expression a
    Lit :: LiteralToken a -> Expression a
    Tuple :: Expression a -> Expression b -> Expression (a, b)

our previous substitute will have a compiler error.

Could not deduce (b ~ a)
...
In the first argument of `pack', namely `value'
In the first argument of `Lit', namely `(pack value)'
In the expression: Lit (pack value)

substitute needs some way to check that the type it is operating on is correct. Data.Typeable solves this problem. It doesn't seem unreasonable to require that any term represented by a variable have a runtime identifiable type, so it's reasonable to add this constraint to the Expression tree itself. Alternatively, we could add a type annotation to the expression tree that can provide a proof that any term is of a specific type. We'll follow this second route. This will require a few imports.

import Data.Typeable
import Data.Maybe

Here's the expression tree extended to include type annotations.

data Expression a where
    Var :: Identifier -> Expression a
    Lit :: LiteralToken a -> Expression a
    Tuple :: Expression a -> Expression b -> Expression (a, b)
    Typed :: Typeable a => Expression a -> Expression a

emap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b
emap f e = case e of
    Tuple a b -> Tuple (f a) (f b)
    Typed a   -> Typed (f a)
    otherwise -> e

bind and substitute can now only work on Typeable variables. Substituting a variable lacking a type annotation does nothing, the variable remains untouched.

bind :: (Pack a, Typeable a) => Identifier -> a -> Expression b -> Expression b
bind ident value = postmap (substitute ident value)

substitute :: (Pack a, Typeable a) => Identifier -> a -> Expression b -> Expression b
substitute ident value e = case e of
    Typed (Var id) | id == ident -> fromMaybe e . fmap Typed . gcast . Lit . pack $ value
    otherwise                    -> e

All of the type checking and casting work in substitute is done by gcast. Typeable a from the function's signature provides the proof that the a in the LiteralToken a built by Lit . pack $ value has a Typeable instance. The Typed constructor in the case provides the proof that the output expression's type b also has a Typeable instance. Note that the code would still work if fmap Typed were removed; it is simply preserving the type annotation.

The following two functions make it easy to add type annotations

typed :: Typeable a => Expression a -> Expression a
typed t@(Typed _) = t
typed e           = Typed e

typedVar :: Typeable a => Identifier -> Expression a
typedVar = Typed . Var

Our two examples now do what we'd like them to do. Both examples only replace the integer variable, despite both variable names being the same in the second example. We use the smart typedVar constructor instead of Var to build all of our variables with a type annotation.

example1 :: Expression (Int, Bool)
example1 = Tuple (typedVar "x") (typedVar "y")

print . bind "x" (7 :: Int) $ example1
Tuple (Typed (Lit LiteralToken)) (Typed (Var "y"))

example2 :: Expression (Int, Bool)
example2 = Tuple (typedVar "x") (typedVar "x")

print . bind "x" (7 :: Int) $ example2
Tuple (Typed (Lit LiteralToken)) (Typed (Var "x"))

Type inference

Just for fun, we can implement type inference on expression trees. Unlike Haskell's, this really simple type inference only works from the leaves towards the root.

inferType :: Expression a -> Expression a
inferType e = case e of
    Typed t@(Typed _)             -> t
    t@(Tuple (Typed _) (Typed _)) -> typed t
    otherwise                     -> e

inferTypes = postmap inferType

print . inferTypes $ example1
Typed (Tuple (Typed (Var "x")) (Typed (Var "y")))

I've not used Data.Typeable before, can you recommend an introduction or tutorial for it? I'm unfamiliar with the library, but would it be possible for type mismatches on a var to be an error rather than silent failure? — guhou, Aug 21 '14 at 04:38
Data.Typeable's `cast` and `gcast` both return a `Maybe whatever`. If it can't be cast, it returns `Nothing`, which you can detect to do something else. Getting errors out of the traversal of the expression tree is a bit tricky; it requires something more powerful than `emap`; `emap` doesn't give us any way to return something other than an `Expression`. — Cirdec, Aug 21 '14 at 13:50
Here's how to walk the expression tree with any `Applicative` `Functor`, which can get the errors out. http://stackoverflow.com/a/25389564/414413 Here's how to treat the tree as a rose tree of lists http://stackoverflow.com/a/25393871/414413 If you can't figure out how either of those can solve the problem, how to get errors out of a transformation of an expression tree would make another good question. — Cirdec, Aug 21 '14 at 13:53

Weakening constraints on rank-2 types

2 Answers2

Type safety

Type inference