I was wondering how you parse comments (say, à la Haskell) in PEG.js.

The goal:

{-
    This is a comment and should parse.
    Comments start with {- and end with -}.
    If you've noticed, I still included {- and -} in the comment.
    This means that comments should also nest
    {- even {- to -} arbitrary -} levels
    But they should be balanced
-}

For example, the following should not parse:

{- I am an unbalanced -} comment -}

But you should also have an escape mechanism:

{- I can escape comment \{- characters like this \-} -}

This sorta seems like parsing s-expressions, but with s-expressions, it's easy:

sExpression = "(" [^)]* ")"

Because the closing paren is just one character, I can "not" it with the caret. As an aside, I'm wondering how one can "not" something that is longer than a single character in PEG.js.

Thanks for your help.

Hassan Hayat

1 Answer

This doesn't handle your escape mechanism, but it should get you started. (You can see it live on pegedit: just click Build Parser and Parse at the top of the screen.)

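start = comment

// A comment is an opener, any mix of plain characters and nested comments, then a closer.
comment = COMSTART (not_com / comment)* COMSTOP

// One character that begins neither a closer nor an opener.
not_com = (!COMSTOP !COMSTART .)

COMSTART = '{-'

COMSTOP = '-}'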
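If it helps to see what the generated parser is roughly doing, here is a hand-written JavaScript sketch that mirrors the grammar above rule by rule. It also adds one possible way to handle the escapes from the question: a backslash followed by any single character is consumed as a unit, which in grammar terms would be an extra alternative like escaped = "\\" . tried before the other two. That escaped rule and the function names matchComment and parsesAsComment are my own assumptions for illustration; none of this is PEG.js API or its actual output.

```javascript
// Hand-written equivalent of the grammar, plus a hypothetical `escaped`
// alternative. Function names are illustrative, not part of PEG.js.
const COMSTART = "{-";
const COMSTOP = "-}";

// Try to match a comment starting at `pos`. Returns the position just past
// the comment, or -1 if no comment matches there.
function matchComment(input, pos) {
  if (!input.startsWith(COMSTART, pos)) return -1;
  pos += COMSTART.length;

  // (escaped / comment / not_com)*
  for (;;) {
    // escaped = "\\" .   (assumed extension: backslash plus any one character)
    if (input[pos] === "\\" && pos + 1 < input.length) {
      pos += 2;
      continue;
    }
    // comment   (recursion is what makes nesting work)
    const nested = matchComment(input, pos);
    if (nested !== -1) {
      pos = nested;
      continue;
    }
    // not_com = !COMSTOP !COMSTART .
    if (
      pos < input.length &&
      !input.startsWith(COMSTOP, pos) &&
      !input.startsWith(COMSTART, pos)
    ) {
      pos += 1;
      continue;
    }
    break; // no alternative matched, so the repetition ends here
  }

  if (!input.startsWith(COMSTOP, pos)) return -1;
  return pos + COMSTOP.length;
}

// Succeed only if the comment spans the whole input.
function parsesAsComment(input) {
  return matchComment(input, 0) === input.length;
}
```

With this sketch, the nested example from the question is accepted, the unbalanced one is rejected because input is left over after the first closer, and the escaped one is accepted because \{ and \- are each swallowed before the opener or closer can be recognized.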

To answer your general question:

As an aside, I'm wondering how one can "not" something that is longer than a single character in pegjs.

The simple way is (!rulename .), where rulename is another rule defined in your grammar. The !rulename part only asserts that whatever comes next doesn't match rulename; it consumes no input, so you still have to give the expression something to match, which is why I included the . (any character).
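To make the "check, then consume" idea concrete, here is a small JavaScript sketch of what a repeated (!rulename .)* amounts to when rulename is a literal string. The helper name skipUntil is made up for this example and is not something PEG.js provides.

```javascript
// Hypothetical helper: behaves like the PEG expression (!stop .)*
// for a literal `stop` string. Returns the position where scanning halted.
function skipUntil(input, pos, stop) {
  // !stop : look ahead without consuming anything
  while (pos < input.length && !input.startsWith(stop, pos)) {
    pos += 1; // .  : consume exactly one character
  }
  return pos;
}

// Stops right where the two-character closer begins.
const end = skipUntil("abc-}def", 0, "-}"); // → 3
```

Without the pos += 1 step (the . in the grammar) the loop would never advance, which is why the lookahead alone appears to "do nothing".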

bekroogle
    Cool! Thanks. That helped a lot. I had seen the ! operator, but I thought that it just didn't work. Now I understand that I actually had to include something for it to parse. – Hassan Hayat Feb 15 '15 at 03:30