I want to work on the Java grammar for tree-sitter, which seems to have been abandoned for a few months. It would be nice to use property-based testing, so I wondered whether there are tools that can take a given grammar in some form and use it to generate random ASTs and code.

The property would look like this:

data OtherValidJavaAst = undefined
data TreeSitterAst = undefined

transform : TreeSitterAst -> OtherValidJavaAst

genAst : Gen OtherValidJavaAst

genCode : OtherValidJavaAst -> String

parseTreeSitter : String -> TreeSitterAst

parsesEqually : OtherValidJavaAst -> Boolean
parsesEqually ast = transform (parseTreeSitter (genCode ast)) == ast
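
To make the shape of this property concrete, here is a minimal Python sketch on a toy language of nested additions rather than Java. Everything in it is a hypothetical stand-in: `parse_under_test` plays the role of parseTreeSitter, the tuple-based AST plays the role of OtherValidJavaAst, and the random generator is plain `random` rather than a property-testing library.

```python
import random

# Reference AST (stand-in for OtherValidJavaAst):
#   ("num", n)  or  ("add", left, right)

def gen_ast(rng, depth):
    """Generate a random reference AST whose nesting is bounded by depth."""
    if depth == 0 or rng.random() < 0.3:
        return ("num", rng.randint(0, 99))
    return ("add", gen_ast(rng, depth - 1), gen_ast(rng, depth - 1))

def gen_code(ast):
    """Print a reference AST as fully parenthesized source text."""
    if ast[0] == "num":
        return str(ast[1])
    return "(" + gen_code(ast[1]) + " + " + gen_code(ast[2]) + ")"

def parse_under_test(src):
    """Stand-in for parseTreeSitter: returns dict-shaped nodes."""
    tokens = src.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def expr():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            left = expr()
            pos += 1  # skip "+"
            right = expr()
            pos += 1  # skip ")"
            return {"kind": "add", "children": [left, right]}
        return {"kind": "num", "text": tok}

    return expr()

def transform(node):
    """Convert the parser's node shape into the reference AST shape."""
    if node["kind"] == "num":
        return ("num", int(node["text"]))
    left, right = node["children"]
    return ("add", transform(left), transform(right))

def parses_equally(ast):
    """The property: generate code, parse it, convert, compare."""
    return transform(parse_under_test(gen_code(ast))) == ast

def check_property(trials=200, seed=0):
    """Return the first counterexample found, or None if all trials pass."""
    rng = random.Random(seed)
    for _ in range(trials):
        ast = gen_ast(rng, depth=4)
        if not parses_equally(ast):
            return ast
    return None
```

Calling `check_property()` returns `None` when no counterexample is found. For Java, the hard part is `gen_ast` and `gen_code`, which is what I am hoping an existing tool can provide.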
user1685095

1 Answer

What you are after seems to be sentence generation for a given language. A piece of Java code, for example, is a sentence of the Java language. However, because of recursion and loops, the number of valid sentences for a given grammar is unbounded, even for very basic grammars. That makes this tricky to do, and I'm not aware of a tool that does it, except for one I wrote myself (as part of my vscode ANTLR4 extension), which is still in development.

What you can do, however, is restrict the generation process to a subset of the full language by limiting recursion and iteration. An important question then is: what is a good representation of the language?
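
As a rough illustration of limiting recursion, here is a minimal Python sketch of a bounded sentence generator. It is not how my extension works. The grammar is a hypothetical toy expression grammar, not Java, and the names `GRAMMAR`, `generate` and `sentence` are made up for this example. Once the budget runs out, it always picks the alternative with the fewest nonterminals. That is enough to terminate for this grammar, but it is not a guarantee for arbitrary grammars.

```python
import random

# Hypothetical toy grammar: nonterminal -> list of alternatives.
# "NUM" is a token class; any other symbol not in GRAMMAR is a literal.
GRAMMAR = {
    "expr": [["term"], ["term", "+", "expr"]],
    "term": [["NUM"], ["(", "expr", ")"]],
}

def count_nonterminals(alternative):
    """Number of symbols in an alternative that need further expansion."""
    return sum(1 for symbol in alternative if symbol in GRAMMAR)

def generate(symbol, rng, budget):
    """Return a list of tokens derived from symbol, with bounded recursion."""
    if symbol == "NUM":
        return [str(rng.randint(0, 9))]
    if symbol not in GRAMMAR:
        return [symbol]  # literal terminal such as "+" or "("
    alternatives = GRAMMAR[symbol]
    if budget <= 0:
        # Out of budget: force the alternative with the fewest nonterminals.
        alternatives = [min(alternatives, key=count_nonterminals)]
    chosen = rng.choice(alternatives)
    tokens = []
    for part in chosen:
        tokens.extend(generate(part, rng, budget - 1))
    return tokens

def sentence(seed, budget=6):
    """Generate one sentence; the same seed and budget give the same text."""
    return " ".join(generate("expr", random.Random(seed), budget))
```

Calling `sentence(seed)` with different seeds gives different sentences, and a smaller `budget` gives shallower ones. With `budget=0` the result is always a single digit.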

Another point: you cannot use sentences generated from a grammar to test that same grammar. Such a test would always succeed, because the sentences were generated from the grammar and hence must be valid.

Mike Lischke
  • I'm not trying to test the grammar by generating sentences from it. I'm trying to test one parser by comparing it to another. And of course I expect that the sentence generation should be programmable. Obviously I don't need infinitely many sentences. – user1685095 Jan 13 '19 at 09:56