0

I'm having problems understanding an online explanation of how to remove the left recursion in this grammar. I know how to remove direct recursion, but I'm not clear how to handle the indirect. Could anyone explain it?

A --> B x y | x 
B --> C D
C --> A | c
D --> d
Prune
  • 76,765
  • 14
  • 60
  • 81
Kevin
  • 667
  • 2
  • 7
  • 18
  • 2
    How can we help you understand a particular explanation, when you haven't included it in your post? – Prune Sep 06 '17 at 23:08
  • http://web.cs.wpi.edu/~kal/images/PLT/PLTex4.3.gif – Kevin Sep 06 '17 at 23:11
  • 2
    Please read and follow the posting guidelines in the help documentation; [how to ask](http://stackoverflow.com/help/how-to-ask) applies here. – Prune Sep 06 '17 at 23:26

1 Answers1

1

The way I learned to do this is to replace one of the offending non-terminal symbols with each of its expansions. In this case, we first replace B with its expansions:

A --> B x y | x
B --> C D

becomes

A --> C x y | D x y | x

Now, we do the same for non-terminal symbol C:

A --> C x y | D x y | x
C --> A | c

becomes

A --> A x y | c x y | D x y | x

The only other remaining grammar rule is

D --> d

so you can also make that replacement, leaving your entire grammar as

A --> A x y | c x y | d x y | x

There is no indirect left recursion now, since there is nothing indirect at all.

Also see here.


To eliminate left recursion altogether (not merely indirect left recursion), introduce the A' symbol from your own materials (credit to OP for this clarification and completion):

A -> x A' 
A' -> xyA' | cxyA' | dxyA' | epsilon

Response to naomik's comments

Yes, grammars have interesting properties, and you can characterize certain semantic capabilities in terms of constraints on grammar rules. There are transformation algorithms to handle certain types of parsing problems.

In this case, we want to remove left-recursion: one desirable property of a grammar is that the use of any rule must consume at least one input token (terminal symbol). Left-recursion opens a door to infinite recursion in the parser.

I learned these things in my "Foundations of Computing" and "Compiler Construction" classes many years ago. Instead of writing a parser to adapt to a particular grammar, we'd transform the grammar to fit the parser style we wanted.

Prune
  • 76,765
  • 14
  • 60
  • 81
  • so final answer would be A -> x A' and A' -> xyA' | cxyA'|dxyA'|epsilon? – Kevin Sep 06 '17 at 23:24
  • I guess I don't understand how you to got to those steps to begin with. – Kevin Sep 06 '17 at 23:32
  • Those steps come from my graduate class "Foundations of Computing", in our section on eliminating certain types of uncomfortable grammar rules. – Prune Sep 07 '17 at 00:00
  • I see now ... you need to eliminate left recursion altogether, not merely *indirect* left recursion. Yes, you nailed that final step. I'll update my answer. – Prune Sep 07 '17 at 00:01
  • ok thanks. Why is your answer so much different than the other one I was looking at? Is your solution just more simplified? – Kevin Sep 07 '17 at 00:29
  • can someone tell me what this notation is called ? – Mulan Sep 07 '17 at 04:54
  • @jean: Yes; the recursion-removal process I applied first greatly simplifies the grammar, as a side effect of eliding the other three non-terminals. – Prune Sep 07 '17 at 21:04
  • @naomik: do you mean the grammar notation? I'm not sure whether the format (left side, arrow, right side) has a specific name, but certain grammar types *do* have names. The most common in my day were **BNF** (Bachus-Nauer Form), **CNF** (Chomsky Normal form), and **GNF** (Greibach Normal Form). – Prune Sep 07 '17 at 21:07
  • @prune im familiar with grammar *types* - ive just never seen recursion patterns discussed/reasoned about with a notation. its intriguing and id like to see more on this specific topic – Mulan Sep 07 '17 at 22:27
  • Sorry for adding this comment long afterward, but I am reviewing left recursion removal because I am implementing this as well as other transformations for an extension for Antlr. Is this answer correct? The unfold of B (substituting the RHS of B into the other rules that reference B) in "A --> B x y | x; B -> C D;" should be "A -> C D x y | x;", otherwise the original grammar should have "B -> C | D;". – kaby76 Apr 23 '20 at 13:14
  • Correct. You would then unfold this to `A -> C d x y | x`, and finally to `A -> A d x y | c d x y | x`. You now have the encapsulated problem of *direct* left-recursion in only one rule. – Prune Apr 23 '20 at 18:23