1

I'd like to use literal haskell functions in a dsl as follows:

program :: forall m . DSL m => m ()
program = do
  stm $ Stm (Var "a")
  -- this way:
  f <- fun $ \(a :: Expr a) (b :: Expr b) -> do
    -- function body is of the same monad `m` as the one in top level
    stm $ Stm (Var "b")
  stm $ Stm $ Apply f (Var "c")
  pure ()

.. and turn the above into code ([Stm]).

To make the functions work there is a helper class Fun (code below) that saturates the function with fresh variables until it reaches function body, then returns both the list of arguments and body itself so that they could be made into function syntax[1]. There is a recursive case to generate arguments[2] and a base case to evaluate body to [Stm].[3]

The worrisome instance is the base case[4], which currently requires {-# INCOHERENT #-}, because ghc can't choose a Fun instance because function's body in the program is ambiguous (while it should just be whatever m is at top level code).

So the question is:

  • is there a way to force the body to always use the same m as in top level and not require incoherent instances?

A stub AST, the MTL-style DSL "effect" and a sample instance of the class:

-- | An AST
data Expr a
  = Var String
  | Apply String (Expr a)
  | Function String [String] [Stm]

data Stm = Stm (Expr ())

-- | The "effect"
class Monad m => DSL m where
  freshName :: m String -- generate fresh variable name
  stm :: Stm -> m () -- emit statement
  toAST :: m a -> m [Stm] -- turn code `m a` into [Stm] (without emitting it)
  fun :: Fun f m => f -> m String -- emit function f, return its name

-- | Helper class to convert literal haskell functions to the dsl
class Fun f m where
  mkFun :: DSL m => f -> [String] -> m ([String], [Stm])

instance Fun f m => Fun (Expr a -> f) m where
  mkFun f acc = do
    name <- freshName -- [2]
    mkFun (f $ Var name) (name : acc)

instance {-# INCOHERENT #-} (m0 ~ m) => Fun (m0 a) m where -- [4]
  mkFun m args = do
    fname <- freshName
    body <- toAST m -- [3]
    return (args, body)

-- | A sample implementation
instance DSL (StateT Int (Writer [Stm])) where
  freshName = do
    n <- get
    put $ n + 1
    return $ "var" <> show n
  stm stm' = tell [stm']
  toAST m = do
    state0 <- get
    let ((_, state1), w) = run m state0
    put state1
    return w
  fun f = do
    (args, body) <- mkFun f []
    name <- freshName
    stm $ Stm $ Function name args body -- [1]
    return name

run :: StateT Int (Writer [Stm]) a -> Int -> ((a, Int), [Stm])
run m s = runWriter $ runStateT m s
  • "*is there a way to force the body to always use the same m as in top level and not require incoherent instances?*" - The thing is, the compiler doesn't know what the top level `m` is - whether it's a `Expr a -> f` or something else. It just knows that it is some `m`. So it can't possibly pick an instance without knowing what the type variable `m` is - unless you help it by using the `{-# INCOHERENT #-}` pragma. As far as I know, there's no alternative. Read [this](https://stackoverflow.com/a/8372400/17592995) answer if you want more detail on what the incoherent pragma does. – Vikstapolis Feb 11 '22 at 14:31
  • @Vikstapolis Right, but doesn't the same apply for `program :: DSL m => m ()` where the m is ambiguous (can't choose an instance) until it's used in some monomorphic context? I.e I guess my problem is: how can I get ghc to pick a Fun instance at the time the DSL m is picked -- as when at that time m is monomorphic (e.g an `StateT Int (Writer [Stml])`) it can safely choose the base case Fun as it definitely won't match the `Expr a -> f` head. – Cigarette Smoking Man Feb 11 '22 at 14:53
  • 1
    Would it be acceptable for all your base DSL's to take an extra (phantom) type parameter? Perhaps you can distinguish between `(->)` and `m` based on *kind*, as in `instance (m0 ~ m, p ~ '(), p' ~ '()) => Fun (m0 p a) (m p')`. – Daniel Wagner Feb 11 '22 at 15:05
  • @DanielWagner Absolutely (especially, if there is no other way :)). I guess the order of fixes would be to avoid per-use-site things first (e.g using the `m` from program's type signature and type-applying it, or using a value to fix function body as in `f <- fun $ \(a :: Expr a) (b :: Expr b) -> body $ do` where `body` does something) – Cigarette Smoking Man Feb 11 '22 at 15:15
  • I'm confused about how this is supposed to work after you solve your immediate problem. It looks like you expect to be able to write `Var "b"` to refer to the Haskell variable `b` serving as the second parameter in your literal function, but its `freshName` will be `var`, not `b`. – K. A. Buhr Feb 11 '22 at 15:47
  • @K.A.Buhr that is true, both the AST and the program are just stubs (wanted to show how one would have statements in and around the function definition). – Cigarette Smoking Man Feb 11 '22 at 16:01

1 Answers1

0

I guess one way is to mark the base case of the recursion yourself.

newtype Cook m a = Cook (m a)
deriving instance DSL m => DSL (Cook m)

instance Fun f m => Fun (Expr a -> f) m
instance m0 ~ m => Fun (Cook m0 a) (Cook m)

type SampleImplRaw = StateT Int (Writer [Stm])
type SampleImpl = Cook SampleImplRaw

instance DSL SampleImpl

I haven't typed this into the compiler, so I'm not sure, but it may be that you get into a funny situation where you want a DSL m instance, and have a DSL (Cook m) instance in scope, but the compiler can't figure out how to get from A to B. In such a case it should be possible to coerce the offending term to use the Cook m instance.

Another way might be to try making the base case have a different kind. Something like this:

instance Fun f m => Fun (Expr a -> f) m
instance (m ~ m', p ~ p', p ~ '()) => Fun (m p a) (m' p')

data SampleImplRaw (x :: ()) a = {- whatever you would have written before for new data declarations -}
type SampleImpl = SampleImplRaw '()
newtype Const1 (x :: ()) m a = Const1 (m a) deriving (EVERYTHING) -- still need a cook for mere monad transformer stacks

I guess probably if your plan is mostly to use transformer stacks, the first one will be less roundabout; if your plan is mostly to define new data types, the second will have a cleaner user experience.

Daniel Wagner
  • 145,880
  • 9
  • 220
  • 380
  • Tried `Cook` version here https://gist.github.com/eyeinsky/83cbca3fe21a86799db78630b6dadd46, I got rid of the incoherence, but it added another issue marked with `ISSUE`. (It could be, that the code is not exactly what you meant) – Cigarette Smoking Man Feb 12 '22 at 18:37
  • @CigaretteSmokingMan Well, you've already got `QuantifiedConstraints` turned on for some reason (you don't seem to be using it yet, though?), so you can write `program :: forall m. (DSL m, forall a. Fun (m a) m) => m ()`. – Daniel Wagner Feb 13 '22 at 05:11
  • Thank you!! Do you think there is any chance in getting rid of the `@m`s in the function bodies or is this the inherent requirement of this kind of polymorphism? – Cigarette Smoking Man Feb 13 '22 at 13:20
  • 1
    @CigaretteSmokingMan Hm. I guess you can reduce it to once per `do` block instead of once per polymorphic thing inside by writing `cooked :: Cook m a -> Cook m a; cooked = id` and then using `cooked $ do` instead of `do`. (Using `QualifiedDo` I guess you could shorten it to something like `C.do`.) But I don't see an obvious way to reduce the annotations further than that. At some point, you have to tell the compiler where to bottom out. – Daniel Wagner Feb 13 '22 at 17:05
  • I tried the different kind approach here https://gist.github.com/eyeinsky/a6244e7ae88e4f20751af23326b2074e with the `ISSUE` (if I didn't botch anything up with implementing it). I'm beginning to think that the issue is not the overlap in instances, but that the entire construction with DSL/Fun classes only "allows" what I want to achieve, but doesn't require it. I.e if the inner monad would be *required* to be of the outer monad's type then the compiler would infer if the inner expression fits that type (it does because adding just the type application makes it compile). – Cigarette Smoking Man Feb 15 '22 at 17:21