I am trying to learn a bit of Template Haskell and Quasi Quotation, and I am looking for a function that takes a String and parses it to Q Exp, so the type is:

String -> Q Exp

I tried searching Hoogle, but the results I saw had to do with lifting String literals to Q Exp. The closest I found was Language.Haskell.TH.dyn, which does almost what I want, but only for a single variable.

Are there other options? E.g. a special syntax? I'm just in the process of familiarizing myself with [||] and $(), so maybe there is something for this purpose too?

An example of how I imagine it would work:

runQ (parse "(1+)") == InfixE (Just (LitE (IntegerL 1))) (VarE GHC.Num.+) Nothing

Also, I am aware of this

runQ [| (1+) |] == InfixE (Just (LitE (IntegerL 1))) (VarE GHC.Num.+) Nothing

but this won't work with variable strings because -- understandably -- the string inside is taken as a literal.

runQ [| "(1+)" |] == LitE (StringL "(1+)")

Edit (2015-07-25): I've started using haskell-src-meta, and it seems to work well so far. However, it takes quite a while to cabal install (about 10 minutes on my machine). That is a shame: my package is actually rather small, and I would like the install to be quick. Does anyone know of a solution with fewer dependencies?

Wizek
  • 3
    I believe [haskell-src-meta](https://hackage.haskell.org/package/haskell-src-meta) provides this – luqui Jul 14 '15 at 16:42
  • 2
    @luqui I am a little confused by that package. The description says something is "not 100% complete yet". Shouldn't this functionality already be present within GHC? It must be, because it takes `[|(1+)|]` and is perfectly capable of turning that into `(InfixE _)`. So why is there a need for a third-party package that may or may not parse correctly? Or am I misinterpreting, and that is the canonical code GHC uses too? Or perhaps GHC just doesn't expose this function at all? I would be grateful for some clarity around this. :) – Wizek Jul 14 '15 at 21:16
  • 1
    AFAIU GHC does not expose this code, but I am no TH expert – luqui Jul 14 '15 at 21:24
  • 1
    @luqui if that is the case, wouldn't it be much more elegant to expose it under -- say, for instance `Language.Haskell.TH.Parser.parse :: String -> Q Exp`? – Wizek Jul 14 '15 at 21:31
  • 1
    It would be *way* more elegant if GHC exposed it, but it can't have that type. It needs a type that includes all of the possible options that can affect parsing. When it gets there, you end up having circular dependency issues between the `ghc` and `template-haskell` packages. – Carl Jul 14 '15 at 22:43
  • Hmm. But isn't it strange that there are third party, *duplicate* solutions to a functionality that is built right in to GHC? When you say circular dependencies, do you mean that it would be impossible or prohibitively difficult to expose this one single function? Also, thanks for your point about the parsing-modifier options. Fair point. A record or list passed in would be sufficient for that, right? – Wizek Jul 15 '15 at 01:12
  • If this `parse` function existed, what would its type be? – AndrewC Jul 15 '15 at 07:21
  • 1
    I asked almost this exact question a while ago, and it seems that no, TH doesn't have this obviously useful functionality. Which just seems weird to me... – MathematicalOrchid Jul 15 '15 at 09:07
  • @luqui I've started using `haskell-src-meta`, and it seems to work well so far. Want to add it as an answer so I can accept it for now? – Wizek Jul 25 '15 at 10:46
  • You _can_ create such a function, since the Q monad has a means of running IO actions, so you could pass the information via some global or thread-local state. But it would be very unidiomatic. – Demi Aug 29 '16 at 18:15
  • As for me, I still don't get how the template would be spliced. I thought splices can be performed during compilation only. – arrowd Oct 17 '16 at 07:22

1 Answer

As everyone has already said, haskell-src-meta provides

parsePat :: String -> Either String Pat
parseExp :: String -> Either String Exp
parseType :: String -> Either String Type
parseDecs :: String -> Either String [Dec]

where Pat, Exp, Type, and Dec are the same as from Language.Haskell.TH.Syntax.
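
As a minimal sketch of how these could be wrapped into the String -> Q Exp function from the question: the names parseQ and parseIO below are hypothetical helpers, not part of haskell-src-meta, and the sketch assumes that package is installed so that Language.Haskell.Meta.Parse is importable.

```haskell
import Language.Haskell.Meta.Parse (parseExp) -- from the haskell-src-meta package
import Language.Haskell.TH (Exp, Q, runQ)

-- | Parse a string into a Template Haskell expression.
-- A parse error becomes a failure in the Q monad.
-- (parseQ is a hypothetical helper name, not a library function.)
parseQ :: String -> Q Exp
parseQ src = either fail return (parseExp src)

-- | The same parser run in IO, so it also accepts strings that are
-- only known at run time. (Also a hypothetical helper.)
parseIO :: String -> IO Exp
parseIO = runQ . parseQ
```

Because of the stage restriction, parseQ has to live in a different module from the one that splices it. In such a module, with TemplateHaskell enabled, something like $(parseQ "(1+)") 2 would then apply the parsed section to 2.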


Why doesn't GHC expose its own parser?

It does. Fire up GHCi with ghci -package ghc (ghc is a hidden package by default) and you can import the Parser. It has functions to parse String into preliminary ASTs (whose data declarations are in HsSyn) for patterns, expressions, types, and declarations.

OK, then why does there not exist a library that uses this parser and converts its output to be the AST from template-haskell (the one in Language.Haskell.TH.Syntax)?

Looking inside HsSyn, it's obvious that the AST isn't quite the same as the one in Language.Haskell.TH.Syntax. Open up HsExpr and Exp side by side and you'll see that the former is filled with types like PostTc id <some-other-type> and PostRn id <some-other-type>. As the AST is passed from the parser to the renamer to the type checker, these bits and pieces are all slowly filled in. For example, we don't even know the fixities of operators until we get to type-checking!

In order to make the functions we want, we would need to run much more than just the parser (at least the renamer and type checker too, maybe more). Imagine that: every time you want to parse even a small expression like "1 + 2", you would still have to type check a bunch of imports. Even then, converting back to the Language.Haskell.TH.Syntax AST wouldn't be a walk in the park: GHC has a variety of peculiarities, like its own special global way of storing names and identifiers.
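
To make the fixity point a little more concrete, here is a small sketch using only the template-haskell package. It writes 1 + 2 * 3 twice: once with the nesting already resolved, and once as an unresolved UInfixE chain, which GHC re-associates from the operators' fixities when the expression is spliced. The helper names int and op are made up for this illustration.

```haskell
import Language.Haskell.TH

-- Small helpers for readability (hypothetical names, not library functions).
int :: Integer -> Exp
int = LitE . IntegerL

op :: String -> Exp
op = VarE . mkName

-- 1 + 2 * 3 with the fixities already resolved: 1 + (2 * 3).
resolved :: Exp
resolved =
  InfixE (Just (int 1)) (op "+")
         (Just (InfixE (Just (int 2)) (op "*") (Just (int 3))))

-- The same tokens as an unresolved chain. The nesting chosen here is
-- arbitrary; GHC rearranges UInfixE nodes when the expression is spliced.
unresolved :: Exp
unresolved = UInfixE (UInfixE (int 1) (op "+") (int 2)) (op "*") (int 3)
```

A parser that produces the resolved form has to know the fixities of + and * up front; one that produces the unresolved form can leave that decision to GHC.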

Hmmm... but what does GHC do with quasi-quotes?

That's the cool part! Unlike Exp, HsExpr has HsSplice for representing splices. Look at the types for the first two constructors:

HsTypedSplice :: id -> LHsExpr id -> HsSplice id    -- things like $$(foo)
HsUntypedSplice :: id -> LHsExpr id -> HsSplice id  -- things like $(foo)

Notice that they aren't storing String, they are storing an AST already! Splices get parsed at the same time as the rest of the AST. And just like the rest of the AST, the splices will get passed along to the renamer, type checker, etc. where missing information will be filled in.

So is it fundamentally impossible to use GHC's parser?

Probably not. But extricating it from the rest of GHC may be quite difficult. If using GHC's parser means also running the renamer and the type checker, it may be simpler and more elegant to use a standalone parser like haskell-src-exts (which is what haskell-src-meta depends on) that does everything in one pass. The trade-off is that information such as operator fixities has to be given to this parser ahead of time.
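
Here is a rough sketch of what "giving fixities ahead of time" looks like with haskell-src-exts, assuming that package is installed. parseWith is a hypothetical helper; ParseMode, its fixities field, preludeFixities, and parseExpWithMode come from Language.Haskell.Exts.

```haskell
import Data.Functor (void)
import Language.Haskell.Exts

-- | Parse an expression with an explicit fixity table.
-- Passing Nothing turns fixity resolution off, so infix operators stay
-- in the shape the parser first produced.
-- (parseWith is a hypothetical helper name, not a library function.)
parseWith :: Maybe [Fixity] -> String -> Either String (Exp ())
parseWith fx src =
  case parseExpWithMode (defaultParseMode { fixities = fx }) src of
    ParseOk e         -> Right (void e)  -- drop source locations
    ParseFailed _ msg -> Left msg
```

Comparing parseWith (Just preludeFixities) "1 + 2 * 3" with parseWith Nothing "1 + 2 * 3" should show the difference: only the first call knows that * binds tighter than +.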

Alec
  • 1
    Why would fixity not be known till type checking? Isn't that conceptually a layer over the parser to produce a more complete AST? Why would fancy stuff happen before that? – dfeuer Dec 06 '16 at 02:54
  • @dfeuer Fixities are filled in during the renaming phase (which occurs right before type checking), so type-checking is the first phase where fixities are fully present. I'm probably not understanding your question... – Alec Dec 06 '16 at 03:18
  • I thought the renamer was tied up with the type checker; maybe that was wrong. – dfeuer Dec 06 '16 at 06:01
  • @dfeuer No you understood correctly - the renamer and the type checker do feed back into each other, but only when dealing with top-level splices. The point remains that fixities for a particular block aren't even known until the end of renaming (and consequently the AST related to infix operators [may need to be rearranged](https://downloads.haskell.org/~ghc/8.0.1/docs/html/libraries/ghc-8.0.1/src/RnTypes.html#mkOpAppRn)). – Alec Dec 06 '16 at 06:35
  • I'm stumped with the exact same problem at https://stackoverflow.com/questions/45674757/how-to-splice-in-literal-strings-of-haskell-code-via-template-haskell How does TH work? Is the AST converted to literal source code, which is then fed into the "regular GHC", or does GHC use the AST directly? If it's the former, then isn't haskell-src-meta a fairly inefficient solution where the same code is being parsed twice -- once by haskell-src-meta and once by actual GHC? – Saurabh Nanda Aug 14 '17 at 13:09
  • @SaurabhNanda The AST inside the splice is parsed by GHC at the same time as the AST outside the splice. Critically, this all happens at compile time. You can _never_ feed run-time strings into quasi quotes (or GHC), but you can feed run-time strings to `haskell-src-meta`. – Alec Aug 14 '17 at 21:32