
I'm trying to create a parser with Superpower. I have already taken a look at the samples I've found in the repo, but they are a bit difficult to understand, at least for a beginner like me :) So I came up with this little challenge.

I have invented a very basic grammar just to learn. I have thought of an elevator that follows a list of instructions to go up, down and wait.

Example:

(UP 100),
(DOWN 200),
(DOWN 100),
(DOWN @1),
(UP @3),
(WAIT),
(UP 300)

As you see, it consists of a list of comma-separated verbs to move, for example, an elevator.

  • The verbs are UP, DOWN or WAIT.
  • Every verb is enclosed by parentheses: ( )
  • UP and DOWN require either an absolute or a relative number, which indicates the floor to which the elevator should move. Relative floor numbers are written with a @ before the number.
  • WAIT doesn't accept any number, because it stops the elevator for a while.

I would really like to learn how to create a token-based parser for this grammar, as a starting point for understanding how to use Superpower.

SuperJMN
  • [this](https://github.com/datalust/superpower/tree/dev/sample/DateTimeTextParser)? there is not much documentation beyond a [readme](https://github.com/datalust/superpower/blob/dev/README.md), but they do have some samples, and tests which show how grammars are built. – Cee McSharpface Dec 10 '17 at 16:57
  • @PatrickArtner Thanks for making StackOverflow a greater place. – SuperJMN Dec 10 '17 at 17:05
  • @dlatikay Thanks, I know those samples, but still I cannot handle it. That's why I created this silly sample: just for the sake of learning. If you are capable of creating the parser I asked for with the existing docs, please, share. – SuperJMN Dec 10 '17 at 17:10
  • @PatrickArtner Since it's a new tag for a relatively unknown library, it's not targeted to 99.9% of the users in StackOverflow as today, so it's meant to be answered by a reduced group of users (experts in the matter) who, by answering this question in SO will help future users solve a common scenario. Also, I consider that anyone that is using the library has more than enough context to answer. BTW, the creators of the lib point to this tag (https://github.com/datalust/superpower#getting-help). – SuperJMN Dec 10 '17 at 17:47
  • @PatrickArtner You can be sure that I have already pinged the people in their Gitter room, to get the attention of the people using the library. StackOverflow is currently a hub for many communities, even for those who are so new that almost nobody know about them. Let them do. If you don't know about a topic, just skip the question, like most of us do. – SuperJMN Dec 10 '17 at 17:52

2 Answers


Step 1 of writing any Superpower parser is to figure out what the token kinds are. You have something like:

// ECL - Elevator Control Language ;-)
enum EclToken {
    LParen,
    RParen,
    UpKeyword,
    DownKeyword,
    WaitKeyword,
    AtSymbol,
    Number,
    Comma
}

Step 2, write a Tokenizer<EclToken>. Superpower v1 leaves this as a direct programming task: there aren't many helpers to lean on, so you just need to write the code as in the examples.

The tokenizer takes the input string, strips out the whitespace, and figures out what the sequence of tokens is.

For your example input, the first line will be:

// (UP 100),
LParen, UpKeyword, Number, RParen, Comma

For tokens like Number that contain content, the span associated with the Result<EclToken> will point to the portion of the input string corresponding to the token. In this line, the number will be a TextSpan covering 100.
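As a rough illustration of Step 2, a hand-written tokenizer for ECL could look like the sketch below. It follows the pattern of the tokenizer in the Superpower README; the class name EclTokenizer, the two lookup tables, and the error expectations are assumptions made for this example, and it has not been compiled against a specific Superpower version.

```csharp
using System.Collections.Generic;
using Superpower;
using Superpower.Model;
using Superpower.Parsers;

enum EclToken { LParen, RParen, UpKeyword, DownKeyword, WaitKeyword, AtSymbol, Number, Comma }

// Hypothetical tokenizer for the elevator language (a sketch, not the library's own sample).
class EclTokenizer : Tokenizer<EclToken>
{
    static readonly Dictionary<char, EclToken> Punctuation = new Dictionary<char, EclToken>
    {
        ['('] = EclToken.LParen,
        [')'] = EclToken.RParen,
        ['@'] = EclToken.AtSymbol,
        [','] = EclToken.Comma
    };

    static readonly Dictionary<string, EclToken> Keywords = new Dictionary<string, EclToken>
    {
        ["UP"] = EclToken.UpKeyword,
        ["DOWN"] = EclToken.DownKeyword,
        ["WAIT"] = EclToken.WaitKeyword
    };

    protected override IEnumerable<Result<EclToken>> Tokenize(TextSpan span)
    {
        var next = SkipWhiteSpace(span);
        if (!next.HasValue)
            yield break;

        do
        {
            if (char.IsDigit(next.Value))
            {
                // The token's span covers all the digits, e.g. "100".
                var number = Numerics.Natural(next.Location);
                yield return Result.Value(EclToken.Number, number.Location, number.Remainder);
                next = number.Remainder.ConsumeChar();
            }
            else if (char.IsLetter(next.Value))
            {
                // Read a whole word, then check whether it is one of the keywords.
                var word = Identifier.CStyle(next.Location);
                if (Keywords.TryGetValue(word.Value.ToStringValue(), out var keyword))
                    yield return Result.Value(keyword, word.Location, word.Remainder);
                else
                    yield return Result.Empty<EclToken>(next.Location, new[] { "UP", "DOWN", "WAIT" });
                next = word.Remainder.ConsumeChar();
            }
            else if (Punctuation.TryGetValue(next.Value, out var punctuation))
            {
                yield return Result.Value(punctuation, next.Location, next.Remainder);
                next = next.Remainder.ConsumeChar();
            }
            else
            {
                // An empty result reports a tokenization error at this position.
                yield return Result.Empty<EclToken>(next.Location, new[] { "keyword", "number", "punctuation" });
            }

            next = SkipWhiteSpace(next.Location);
        } while (next.HasValue);
    }
}
```

Calling new EclTokenizer().Tokenize("(UP 100),") should then produce the token kinds listed above for the first line of the sample input.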

Step 3 is to figure out what you want to parse the input into. For programming languages with nested expressions, this is usually an AST. In the case of the ECL sample, it's pretty simple so you might cut it down to:

struct ElevatorCommand {        
    public int Distance; // + or -
    public bool IsRelative;
}

Step 4, the parser. This is usually embedded in a static class. The job of the parser is to build up more complex results (an ElevatorCommand[], here), from simpler results (number, movement).

This is where Superpower does the heavy lifting, particularly with respect to expectations and errors.

static class EclParser 
{
    static TokenListParser<EclToken, int> Number =
        Token.EqualTo(EclToken.Number).Apply(Numerics.IntegerInt32);
}

The first thing we do is define the parser for numbers; this one applies a built-in TextParser<int> to the content of an EclToken.Number span.

You can see more parsing machinery in this example.

A few more clues to help you find the way (not syntax checked, let alone compiled/tested):

static class EclParser
{
    // Continues the EclParser class; the Number parser is defined as shown above.

    static TokenListParser<EclToken, ElevatorCommand> Up =
        from _ in Token.EqualTo(EclToken.UpKeyword)
        from distance in Number
        select new ElevatorCommand {
            Distance = distance,
            IsRelative = false
        };

    static TokenListParser<EclToken, ElevatorCommand> Command =
        from lp in Token.EqualTo(EclToken.LParen)
        from command in Up // .Or(Down).Or(Wait)
        from rp in Token.EqualTo(EclToken.RParen)
        select command;

    static TokenListParser<EclToken, ElevatorCommand[]> Commands =
        Command.ManyDelimitedBy(Token.EqualTo(EclToken.Comma));
}

Commands is the completed parser that you can apply to the input.

It's best to build up the parser incrementally, testing each smaller parser on chunks of input they're expected to parse.
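To show how the four steps might fit together, here is a self-contained sketch that also covers DOWN, WAIT and the @ prefix. Several things in it are assumptions rather than part of the original outline: the IsWait field (one hypothetical way to represent WAIT), the Move helper, the RelativeMarker parser, and the use of TokenizerBuilder, which ships with later Superpower releases rather than v1. Treat it as a starting point, not a tested implementation.

```csharp
using System;
using Superpower;
using Superpower.Parsers;
using Superpower.Tokenizers;

enum EclToken { LParen, RParen, UpKeyword, DownKeyword, WaitKeyword, AtSymbol, Number, Comma }

struct ElevatorCommand
{
    public int Distance;    // + for UP, - for DOWN, 0 for WAIT
    public bool IsRelative; // true when the number was written as @n
    public bool IsWait;     // assumption: extra field so WAIT can be represented
}

static class EclParser
{
    // Assumption: TokenizerBuilder is available in the Superpower version in use.
    public static readonly Tokenizer<EclToken> EclTokenizer = new TokenizerBuilder<EclToken>()
        .Ignore(Span.WhiteSpace)
        .Match(Character.EqualTo('('), EclToken.LParen)
        .Match(Character.EqualTo(')'), EclToken.RParen)
        .Match(Character.EqualTo('@'), EclToken.AtSymbol)
        .Match(Character.EqualTo(','), EclToken.Comma)
        .Match(Span.EqualTo("UP"), EclToken.UpKeyword)
        .Match(Span.EqualTo("DOWN"), EclToken.DownKeyword)
        .Match(Span.EqualTo("WAIT"), EclToken.WaitKeyword)
        .Match(Numerics.Natural, EclToken.Number)
        .Build();

    static readonly TokenListParser<EclToken, int> Number =
        Token.EqualTo(EclToken.Number).Apply(Numerics.IntegerInt32);

    // Optional '@': yields true when present, false (the default) when absent.
    static readonly TokenListParser<EclToken, bool> RelativeMarker =
        Token.EqualTo(EclToken.AtSymbol).Value(true).OptionalOrDefault();

    // Hypothetical helper shared by UP and DOWN; sign is +1 or -1.
    static TokenListParser<EclToken, ElevatorCommand> Move(EclToken keyword, int sign) =>
        from kw in Token.EqualTo(keyword)
        from relative in RelativeMarker
        from distance in Number
        select new ElevatorCommand { Distance = sign * distance, IsRelative = relative };

    static readonly TokenListParser<EclToken, ElevatorCommand> Wait =
        Token.EqualTo(EclToken.WaitKeyword).Value(new ElevatorCommand { IsWait = true });

    static readonly TokenListParser<EclToken, ElevatorCommand> Command =
        from lp in Token.EqualTo(EclToken.LParen)
        from command in Move(EclToken.UpKeyword, +1).Or(Move(EclToken.DownKeyword, -1)).Or(Wait)
        from rp in Token.EqualTo(EclToken.RParen)
        select command;

    // AtEnd() rejects trailing tokens after the last command.
    public static readonly TokenListParser<EclToken, ElevatorCommand[]> Commands =
        Command.ManyDelimitedBy(Token.EqualTo(EclToken.Comma)).AtEnd();
}

static class Program
{
    static void Main()
    {
        var tokens = EclParser.EclTokenizer.Tokenize("(UP 100), (DOWN @1), (WAIT)");
        foreach (var c in EclParser.Commands.Parse(tokens))
            Console.WriteLine(c.IsWait ? "WAIT" : $"{c.Distance} relative={c.IsRelative}");
    }
}
```

Parse throws a ParseException on invalid input; TryParse returns a result object instead, which is handy when testing the smaller parsers one at a time.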

Nicholas Blumhardt
  • OK, I think I understand. Right now, I'm trying to create the parser :) For now, I have a doubt. You modelled the ElevatorCommand only with the distance and whether it's a relative command or not. But how would a WAIT command be represented? – SuperJMN Dec 11 '17 at 23:30
  • Just a heads-up from nitpickers' corner, the `p` in `Superpower` is lowercase :-) – Nicholas Blumhardt Dec 12 '17 at 21:00
  • Cool! Sorry, I believe it's my PascalCaseMania :-P – SuperJMN Dec 12 '17 at 22:10

OK, I have finally managed to get it. It wasn't so difficult with @Nicholas Blumhardt's guidance :)

I have created a project on GitHub to illustrate the scenario. Since the classes are too big for a post, I'm linking to the files:

SuperJMN