Using Pest.rs, how can I allow non-significant whitespace after a keyword?

Question

I have the following Pest (https://pest.rs) grammar:

name = {ASCII_ALPHA+}
fragment = { "fragment" ~ name }

When I try to parse the input `fragment name` with this grammar, I get this error:

 --> 1:9
  |
1 | fragment name
  |         ^---
  |
  = expected name

Teymour · Accepted Answer · 2022-08-22T15:47:35.497

This problem is caused by Pest's handling of whitespace: by default, no whitespace is skipped between the parts of a sequence. Note first that the grammar above works if a trailing space is added to the "fragment" string literal:

name = {ASCII_ALPHA+}
fragment = { "fragment " ~ name }

This isn't ideal, however, because the literal matches exactly one space. In my use case, the user should be able to place as many spaces as they wish between the fragment keyword and the name. It turns out that Pest already has a solution for this.

From the documentation:

using the special rule WHITESPACE. If defined, it will be implicitly run, as many times as possible, at every tilde ~ and between every repetition (for example, * and +).

Upon initial reading I missed the important part of that sentence: "If defined".

WHITESPACE is optional and has to be defined explicitly in the grammar, exactly once (as with all Pest rules, defining it more than once is invalid).

WHITESPACE = _{ " " }
name = {ASCII_ALPHA+}
fragment = { "fragment " ~ name }

Note that the leading underscore in _{ " " } is important: it marks WHITESPACE as a silent rule, so the matched whitespace does not show up as a Pair when you process the parse result from a Rust program.

Since the documentation says "at every tilde ~" it seems like the extra space in `"fragment "` is no longer needed once you've defined the `WHITESPACE` rule. (I haven't used Pest, perhaps someone who has can confirm) — trent, Mar 22 '21 at 13:28
The space in this case ensures that there is at least one space directly following the `fragment` keyword. — coriolinus, Jun 27 '22 at 12:18
