To start with, let's flesh out your example with enough code to work with. LiteralToken
is just made up to represent whatever you are replacing a variable with.
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}
type Identifier = String
data LiteralToken = LiteralToken
deriving Show
class Pack a where
pack :: a -> LiteralToken
instance Pack Int where
pack = const LiteralToken
data Expression a where
Var :: Identifier -> Expression a
Lit :: LiteralToken -> Expression a
Tuple :: Expression a -> Expression b -> Expression (a, b)
deriving instance Show (Expression a)
emap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b
emap f e = case e of
Tuple a b -> Tuple (f a) (f b)
otherwise -> e
postmap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b
postmap f = f . emap (postmap f)
bind :: Pack a => Identifier -> a -> Expression b -> Expression b
bind ident value = postmap (substitute ident value)
Now, we can implement substitute
for an Identifier
. To be used in emap
or postmap
, substitute
needs to be able to operate on all parts of the Expression
tree, regardless of whether it's the same type as the variable being substituted for. That's why the type variable for the Expression
s is different than the type variable for the value (this is also what user2407038 explains).
substitute :: Pack a => Identifier -> a -> Expression b -> Expression b
substitute ident value e = case e of
Var id | id == ident -> Lit (pack value)
otherwise -> e
Now we can try some examples:
example1 :: Expression (Int, Bool)
example1 = Tuple (Var "x") (Var "y")
print . bind "x" (7 :: Int) $ example1
Tuple (Lit LiteralToken) (Var "y")
If we have the wrong type for the identifier, we don't get what we want, it's substituted anyway. That's because substitute
had no way to check the type of the variable it was substituting for. In the following example, both Var "x"
s are replaced with the literal that was made for an Int
, even though the second variable was for a Bool
.
example2 :: Expression (Int, Bool)
example2 = Tuple (Var "x") (Var "x")
print . bind "x" (7 :: Int) $ example2
Tuple (Lit LiteralToken) (Lit LiteralToken)
Type safety
If we add a type variable to LiteralToken
so that it somehow depends on the type of what it represents
data LiteralToken a = LiteralToken
deriving Show
class Pack a where
pack :: a -> LiteralToken a
data Expression a where
Var :: Identifier -> Expression a
Lit :: LiteralToken a -> Expression a
Tuple :: Expression a -> Expression b -> Expression (a, b)
our previous substitute
will have a compiler error.
Could not deduce (b ~ a)
...
In the first argument of `pack', namely `value'
In the first argument of `Lit', namely `(pack value)'
In the expression: Lit (pack value)
substitute
needs some way to check that the type it is operating on is correct. Data.Typeable
solves this problem. It doesn't seem unreasonable to require that any term represented by a variable have a runtime identifiable type, so it's reasonable to add this constraint to the Expression
tree itself. Alternatively, we could add a type annotation to the expression tree that can provide a proof that any term is of a specific type. We'll follow this second route. This will require a few imports.
import Data.Typeable
import Data.Maybe
Here's the expression tree extended to include type annotations.
data Expression a where
Var :: Identifier -> Expression a
Lit :: LiteralToken a -> Expression a
Tuple :: Expression a -> Expression b -> Expression (a, b)
Typed :: Typeable a => Expression a -> Expression a
emap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b
emap f e = case e of
Tuple a b -> Tuple (f a) (f b)
Typed a -> Typed (f a)
otherwise -> e
bind
and substitute
can now only work on Typeable
variables. Substituting a variable lacking a type annotation does nothing, the variable remains untouched.
bind :: (Pack a, Typeable a) => Identifier -> a -> Expression b -> Expression b
bind ident value = postmap (substitute ident value)
substitute :: (Pack a, Typeable a) => Identifier -> a -> Expression b -> Expression b
substitute ident value e = case e of
Typed (Var id) | id == ident -> fromMaybe e . fmap Typed . gcast . Lit . pack $ value
otherwise -> e
All of the type checking and casting work in substitute
is done by gcast
. Typeable a
from the function's signature provides the proof that the a
in the LiteralToken a
built by Lit . pack $ value
has a Typeable
instance. The Typed
constructor in the case
provides the proof that the output expression's type b
also has a Typeable
instance. Note that the code would still work if fmap Typed
were removed; it is simply preserving the type annotation.
The following two functions make it easy to add type annotations
typed :: Typeable a => Expression a -> Expression a
typed t@(Typed _) = t
typed e = Typed e
typedVar :: Typeable a => Identifier -> Expression a
typedVar = Typed . Var
Our two examples now do what we'd like them to do. Both examples only replace the integer variable, despite both variable names being the same in the second example. We use the smart typedVar
constructor instead of Var
to build all of our variables with a type annotation.
example1 :: Expression (Int, Bool)
example1 = Tuple (typedVar "x") (typedVar "y")
print . bind "x" (7 :: Int) $ example1
Tuple (Typed (Lit LiteralToken)) (Typed (Var "y"))
example2 :: Expression (Int, Bool)
example2 = Tuple (typedVar "x") (typedVar "x")
print . bind "x" (7 :: Int) $ example2
Tuple (Typed (Lit LiteralToken)) (Typed (Var "x"))
Type inference
Just for fun, we can implement type inference on expression trees. Unlike Haskell's, this really simple type inference only works from the leaves towards the root.
inferType :: Expression a -> Expression a
inferType e = case e of
Typed t@(Typed _) -> t
t@(Tuple (Typed _) (Typed _)) -> typed t
otherwise -> e
inferTypes = postmap inferType
print . inferTypes $ example1
Typed (Tuple (Typed (Var "x")) (Typed (Var "y")))