Printing an AST with variable names

Question

I am trying to implement an EDSL in Haskell. I would like to pretty print the AST with the variable names that are bound (if I can't get the real names then some generated names would do).

This is how far I have got with a simple example:

import Control.Monad.State

data Free f a = Roll (f (Free f a))
              | Pure a

instance Functor f => Monad (Free f) where
  return         = Pure
  (Pure a) >>= f = f a
  (Roll f) >>= g = Roll $ fmap (>>= g) f

data Expr a = I a
            | Plus (Expr a) (Expr a)
            deriving (Show)

data StackProgram a next = Pop  (a -> next)
                         | Push a next

instance Functor (StackProgram a) where
  fmap f (Pop    k) = Pop (f.k)
  fmap f (Push i x) = Push i (f x)

liftF :: Functor f => f a -> Free f a
liftF l = Roll $ fmap return l

push :: a -> Free (StackProgram a) ()
push i = liftF $ Push i ()

pop :: Free (StackProgram a) a
pop = liftF $ Pop id

prog3 :: Free (StackProgram (Expr Int)) (Expr Int)
prog3 = do
  push (I 3)
  push (I 4)
  a <- pop
  b <- pop
  return (Plus a b)

showSP' :: (Show a, Show b) => Free (StackProgram a) b -> [a] -> State Int String
showSP' (Pure a)           _        = return $ "return " ++ show a
showSP' (Roll (Pop f))    (a:stack) = do 
  i <- get
  put (i+1)
  rest <- showSP' (f a) stack
  return $ "var" ++ show i ++ " <- pop " ++ show (a:stack) ++ "\n" ++ rest
showSP' (Roll (Push i n))  stack    = do
  rest <- showSP' n (i:stack) 
  return $ "push " ++ show i ++ " " ++ show stack ++ "\n" ++ rest

showSP :: (Show a, Show b) => Free (StackProgram a) b -> [a] -> String
showSP prg stk = fst $ runState (showSP' prg stk) 0

Running this gives:

*Main> putStrLn $ showSP prog3 []
push I 3 []
push I 4 [I 3]
var0 <- pop [I 4,I 3]
var1 <- pop [I 3]
return Plus (I 4) (I 3)

So what I want is to replace Plus (I 4) (I 3) with Plus var0 var1. I have thought about walking through the rest of the tree and replacing the bound variables with name-value tuples, but I am not 100% sure if/how that would work. I'd also prefer to keep the original variable names, but I can't think of an easy way of doing this. I would prefer to have a fairly light-weight syntax in haskell (kind of as above).

I would also appreciate pointers to material that teaches me how to best do these kinds of things. I have read a bit on free monads and GADTs, but I guess I am missing how to put it all together.

Have you considered using an existing library [such as this one](http://hackage.haskell.org/package/bound) instead? — C. A. McCann, Feb 16 '13 at 00:06
Check out [this answer](http://stackoverflow.com/a/14084654/1026598) I gave to a similar question. Is that close to what you had in mind? — Gabriella Gonzalez, Feb 16 '13 at 08:41
@GabrielGonzalez Oh god, that answer is awesome! enthusiastically upvoted! (also love your blog, btw, where the free monad stuff is obviously comming from). Yes, this answer is exactly what I was looking for. I haven't quite understood how you're dealing with variable equivalence, but I think I just need to look at it for more than 5 minutes. thanks again! — Paul, Feb 16 '13 at 09:11

Ganesh Sittampalam · Accepted Answer · 2013-02-16T07:48:39.620

With the structure you have, you can't do this in "pure" Haskell code, because once your code is compiled, you can't distinguish (Plus a b) from (Plus (I 4) (I 3)) and keep "referential transparency" - the interchangeability of variables and their values.

However there are unsafe hacks - i.e. not guaranteed to work - that can let you do this kind of thing. They generally go under the name "observable sharing" and are based on getting access to the internals of how values are represented, using StableName. Essentially that gives you a pointer equality operation that allows you to distinguish between the reference to a and a new copy of the value (I 4).

One package that helps wrap up this functionality is data-reify.

The actual variable names used in your source will be irretrievably lost during compilation. In Paradise we use a preprocessor to translate foo <~ bar into foo <- withName "foo" $ bar before compilation, but it's hacky and it slows down builds quite a bit.

Thank you for your answer! I had read about *observable sharing* on the haskell wiki, but hadn't quite understood it. I guess nothing does this like experiencing a problem first hand ;). I also remember skimming the paradise paper, I think I'll have to give it some more time to do it justice. — Paul, Feb 16 '13 at 09:18

score 4 · Answer 2 · edited May 23 '17 at 12:22

I figured this out based on @Gabriel Gonzales' linked answer. The basic idea is to introduce a new variable constructor in the Expr type and you assign these a unique id as you interpret the tree. That and cleaning up the code a bit gives:

import Control.Monad.Free
import Data.Map

newtype VInt = VInt Int

data Expr = IntL Int
          | IntV VInt
          | Plus Expr Expr

instance Show Expr where
  show (IntL i)        = show i
  show (IntV (VInt i)) = "var" ++ show i
  show (Plus e1 e2)    = show e1 ++ " + " ++ show e2

data StackProgF next = Pop  (VInt -> next)
                     | Push Expr next

instance Functor StackProgF where
  fmap f (Pop    k) = Pop (f.k)
  fmap f (Push e x) = Push e (f x)

type StackProg = Free StackProgF
type Stack = [Expr]

push :: Expr -> StackProg ()
push e = liftF $ Push e ()

pop :: StackProg Expr
pop = liftF $ Pop IntV

prog3 :: StackProg Expr
prog3 = do
  push (IntL 3)
  push (IntL 4)
  a <- pop
  b <- pop
  return (Plus a b)

showSP :: StackProg Expr -> String
showSP prg = go 0 prg []
  where
    go i (Pure a)          _     = show a
    go i (Free (Pop n))    (h:t) = "var" ++ show i ++ " <- pop " ++ show (h:t) ++ "\n" ++ 
                                   go (i+1) (n (VInt i)) t
    go i (Free (Pop _))    []    = "error: pop on empty stack\n"
    go i (Free (Push e n)) stk   = "push " ++ show e ++ ", " ++ show stk ++ "\n" ++ go i n (e:stk)

type Env = Map Int Expr

evalExpr :: Expr -> Env -> Int
evalExpr (IntL i)        _   = i
evalExpr (IntV (VInt k)) env = evalExpr (env ! k) env
evalExpr (Plus e1 e2)    env = evalExpr e1 env + evalExpr e2 env

evalSP :: StackProg Expr -> Int
evalSP prg = go 0 prg [] empty
  where
    go i (Free (Pop _))    []    env = error "pop on empty stack\n"    
    go i (Free (Pop n))    (h:t) env = go (i+1) (n (VInt i)) t       (insert i h env)
    go i (Free (Push e n)) stk   env = go i     n            (e:stk) env
    go i (Pure a)          _stk  env = evalExpr a env

Pretty printing and running:

*Main> putStrLn $ showSP prog3
push 3, []
push 4, [3]
var0 <- pop [4,3]
var1 <- pop [3]
var0 + var1
*Main> evalSP prog3
7

Printing an AST with variable names

2 Answers2

Linked