1

I'm writing an interpreter for a grammar parser generated with TatSu. I'm looking for a convenient way to generate use cases for my grammar, so I can write unit tests for my interpreter. Currently, I'm generating my test cases by hand.

I wonder if the TatSu package does provide any (maybe undocumented) means to auto-generate random grammar derivations so I could use them as test cases for my interpreter. In addition, it would be desirable to specify the grammar rule, for which I need the random productions.

apio
  • 154
  • 11
  • 1
    Maybe you're looking for random input generated from the grammar (derivations)? Random productions sounds very strange. – Apalala Oct 27 '21 at 18:03
  • Yes, sorry, like in the title, I'm looking for "expansions" (=derivations), not to be confused with grammar productions (rules). – apio Oct 27 '21 at 18:40
  • A way to solve my problem would be to implement a TatSu semantics class for a TatSu parser compiled with its own TatSu grammar that would generate collections of all option, choice, and sequence rules, as well as terminals in a given grammar. Such a class could then be used for a random generation of derivations of a given grammar. I'm asking myself if there are internals in the existing TatSu package (5.6.1) that provide these collections already. – apio Oct 28 '21 at 07:28
  • So first, amend your question so it says "derivations". After that, there are languages that derive languages that are infinite. But I have some ideas about how to generate some derivations in a minimal way. – Apalala Oct 29 '21 at 18:44
  • I amended the question as you wished. I think the TatSu internals I was looking for is the structure of the "rulemap" object in a compiled parser. I also have already ideas on how to utilize it for my purpose and I'm working already on a proof of concept. If it works, I will post it as a possible answer. Yes, infinite derivations are possible, and the algorithm would need some strategies to avoid them. – apio Oct 30 '21 at 07:38

2 Answers2

1

If you look at the __str__() method in grammars.py you'll see an example of walking through a grammar to transform it into something readable.

You could also use a Visitor.

Because the set of derivations for a grammar is potentially infinite, you need a strategy to generate some interesting samples before quitting (Ctrl-C):

  • go breadth-first, or the visitor will recurse until the runtime stack is exhausted
  • because of PEG, work first with the last option in a choice (|), which should be the one producing the shortest derivation

Because TatSu skips over whitespace, you'll probably need to add a step to pretty-print the output.

This is an interesting project, and it would be good if at the end you added it as a pull request to TatSu.

My apologies for providing only guidelines instead of an example.

Apalala
  • 9,017
  • 3
  • 30
  • 48
  • Thx, I've started working on a solution based on the rulemap attribute of a compiled tatsu parser. I appreciate your suggestion about using the last option in a choice. If I succeed. I will create a pull request. There is a 'grammars.py' in the tatsu package, but no 'grammar.py'. Its __str__() method is probably not the one you mean. – apio Oct 31 '21 at 15:06
  • It is indeed `tatsu/grammars.py`. – Apalala Oct 31 '21 at 18:50
0

I've created an experimental public repository TatSu Random Derivation Generator that can generate random derivations for many rules of your grammars compiled with TatSu. If the grammars are very complex, the program runs into a RecursionError.

Nevertheless, it is useful for testing your grammars, especially if you want to test derivations for specific production rules.

The example.py as well as the many tests show you how to use the tool.

apio
  • 154
  • 11