10

The following grammar has left recursion

E= E+T|T
T= T*F|F
F= a|b|c

How to remove it? Is there any general procedure for it?

Prasoon Saurav
  • 91,295
  • 49
  • 239
  • 345

4 Answers4

15

Yes, there is a general procedure, see e.g. wikipedia.

E = TE'
E'= (e) | +TE'
T = FT'
T'= (e) | *FT'
F = a | b | c

// (e) is "epsilon" which is defined as the empty string

It should be mentioned that this alters the associativity of + and * from left to right. That is, where before a + b + c was parsed as (a + b) + c, it is now parsed as a + (b + c).

This is not a problem for addition and multiplication, but it would be a problem for subtraction and division, for example.

Wikipedia article has more detail on the left-recursion removal procedure and its intricacies.

Matt Mc
  • 8,882
  • 6
  • 53
  • 89
polygenelubricants
  • 376,812
  • 128
  • 561
  • 623
  • 5
    So how to resolve that problem? (of reversing the associativity) I see many copies (direct of indirect) of this algorithm of removing left recursion, and they always write the warning about breaking associativity, but I hadn't seen yet any solution to that problem in any of such articles. Is there any solution? Or only the problem? And the other question is: Why does this work? – SasQ Feb 19 '12 at 18:34
  • 2
    I ask because I haven't seen any reliable PROOF of that algorithm. There were "proofs" which showed that the language obtained from the transformed grammar is still the same, but for me there's an ERROR in such proofs: an assumption that if the syntax generates the same output, it's the same. Because it's NOT! It's a whole different language, because it has a different parse tree, so it's "understood" differently by the machine. The same difference as between interpreting "Abstract Syntax Tree" as "Abstract (Syntax Tree)" and "(Abstract Syntax) Tree" - two different things. – SasQ Feb 27 '12 at 01:57
  • 1
    The same goes for left-recursive and right-recursive syntaxes. One is left-associative, the other is right-associative. And I don't know about any way to have left-recursive syntax with right-associativity. Do you know? – SasQ Feb 27 '12 at 01:59
  • In math (which I consider as a superset of computer science), we are careful to define "same," "similar," "equal," "equivalent," "isomorphic," etc. before we use the terms. Languages are *sets* of strings, which are finite ordered sequences of symbols from an alphabet. Set-equality is well-defined. We say two sets `A` and `B` are equal if `A` is a subset of `B` and `B` is a subset of `A`. And we can show that this is the case, apparently, with these algorithms. Thus two grammars which generate different parse trees can be rigorously shown to describe equal languages, in terms of the sets. – Jesus is Lord Jul 25 '12 at 14:39
  • If you wish to say that: `A -> aA | @` is not the same as `A -> Aa | @` then by all means do so. You have the liberty to define your terms as you see fit. Give me a rigorous, precise definition of "sameness," though, when it comes to grammars, though. I suspect your notion may be something along the lines of: "Two grammars are structurally isomorphic if there is a bijection of parse trees for each element of the language. Two parse trees are isomorphic if their naturally-defined directed graphs are isomorpic." This is a beast to do anything with, but start doing proofs with something – Jesus is Lord Jul 25 '12 at 14:44
  • 1
    like this and I'll be interested :) – Jesus is Lord Jul 25 '12 at 14:45
  • 1
    This answer could be improved by explaining `e`. Without this, it's pretty opaque to those who aren't already familiar with parser theory. (Sure, you provided a link to a Wikipedia page, but said page begins with "This article may be too technical for most readers to understand", which is certainly the case for me). – weberc2 Sep 08 '17 at 18:20
3

The general procedure is found at Wikipedia, for example. I'll demonstrate for E = E+T|T:

E is the left-recursive non-terminal. +T is the non-null sequence (alpha). T is the sequence which doesn't start with E (beta).

We create a new nonterminal for the "rest", ie. the non-null sequence. This handles the recursion.

E' = epsilon|+TE'

And we modify the original rule to use the new nonterminal to handle the recursion.

E = TE'

Community
  • 1
  • 1
Daniel Rose
  • 17,233
  • 9
  • 65
  • 88
2

As others have pointed out, there is a general procedure for replacing left recursion with right recursion. The other answers show well how to use that general procedure to remove the left recursion in the given grammar.

Here is a solution that restructures the given grammar in a way that is specific for that grammar.

E= T+E|T
T= F*T|F
F= a|b|c
Matthew T. Staebler
  • 4,756
  • 19
  • 21
  • So you're actually saying that your proposed grammar in left-recursion free ? i am wondering because i came across it in a course in Formal Languages and it was stated left-recursive ? – Ibrahim Najjar Aug 02 '11 at 14:40
1

the production

E = E+T
  | T
T = T*F
  | F
F = a|b|c

goes to

E= T ('+' T)*
T= F ('*' F)*
F= a|b|c

To maintain the same processing where the fist E disjunct is processed with A(E,T). the new version uses:

ret = T1;
while(set.more()) ret = A(ret, set.pop_front().T);
BCS
  • 75,627
  • 68
  • 187
  • 294