Exit condition for AST with DFS

Question

I'm trying to write a general purpose AST builder with a DFS approach.

The principle of the grammars is rather simple: An expression can have one or more options, each option consists of a list of nodes where each node can be either a terminal or an expression.

I have a state that works-ish but and am struggling with finding an exit condition for "endless loops".

If I define the grammar like this:

expr := addExpr;

addExpr := NUMBER
         | NUMBER OPERATOR expr
         ;

It passes through. The structure is semantically correct, but will produce wrong results when traversed. "4 - 2 + 2" gets evaluated as "4 - (2 + 2)" instead of the desired "(4 - 2) + 2".

I would need to define the grammar like this:

expr := addExpr;

addExpr := NUMBER
         | expr OPERATOR NUMBER
         ;

The problem here is, that it leads to an endless iteration:

expr -> addExpr -> expr -> addExpr -> ...

This is where I'm stuck. I cannot come up with an exit condition as to when to stop trying. I was thinking of having a history, if I compared the same expression against the remaining tokens, I can stop. But this stops too early and won't produce any result.

I guess the stack I need would kind of look like this:

expr (against 4 - 2 + 2)
  addExpr (against 4 - 2 + 2)
    0: NUMBER -> matches 4, won't finish.
    1: expr OPERATOR NUMBER
      expr (against 4 - 2 + 2) -> I cannot break yet.
        addExpr (against 4 - 2 + 2) -> I cannot break yet.
          0: NUMBER -> matches 4, won't finish.
          1: expr OPERATOR NUMBER
            expr (against 4 - 2 + 2) -> I cannot break yet.
              addExpr (against 4 - 2 + 2) -> I cannot break yet.
                0: NUMBER -> matches 4, will resolve in the end.
                1: expr OPERATOR NUMBER
                  expr (against 4 - 2 + 2) -> BREAK!

It seems kind of arbitrary when to break. Is there some good rule or condition to achieve this?

I don't really want to just set a "random" max-depth if I can avoid it. Maybe I should consider a BFS approach (whole different set of problems), but I'd like to stick with DFS if it's possible.

The key here is to iterate rather than recurse. See my answer on how to write recursive descent parsers: https://stackoverflow.com/a/2336769/120163 — Ira Baxter, Mar 30 '19 at 16:17
Alternatively, build the tree bottom-up (that is, do a postorder traverse rather than a preorder traverse). — rici, Mar 30 '19 at 19:38
Thank you @IraBaxter for your answer. I am aware of the link you posted. My problem with it is, that I'm not interested in writing a parser for a specific grammar. I wanted to build a generic grammar parser - something like ANTLR. But maybe that's the answer? I cannot just write a DSF search but need to find a way to generate the AST builder functions? — Cabadath, Mar 31 '19 at 00:37
Interesting @rici. I'm not yet sure how to fit postorder into my algorithm, I'll certainly give it a try. — Cabadath, Mar 31 '19 at 00:39
Then you should have a close look at MetaII, which generate recursive descent parsers from grammars. And you can implement this a few mind-blowing evenings. See https://stackoverflow.com/a/5739338/120163. Extentions of MetaII called "Tree Meta" (google that) will let you build grammars with integrated tree building, and then let you walk those trees and emit code. — Ira Baxter, Mar 31 '19 at 18:21

Exit condition for AST with DFS

0 Answers0