0

I am writing a compiler for my toy language for learning and curiosity purposes only and I already have accomplished some things, like math operations and stuff, I am having some trouble trying to parse a variable "within" a variable though. Take these examples in pseudo-code:

numbers = [1..20]

My parser already understand this, and stuff like:

ten = numbers[9]

However, when I have a variable that is composed of more than one "element", like a 2D array or an object inside an object, I cannot parse, the parser blocks itself inside an infinite loop. Example:

things = [['one'], ['two'], ['three']]
person = {
  name = 'john'
  car = {
    model = 'ninja'
    price = 23000.0
  }
}

things[1][0]  // This is causing the infinite loop
person.car.price  // And this

This is what my parser looks like (in pseudo-code):

parseVarReference() {
  variable = expect(TOKEN_IDENTIFIER);
  if (current() == '[') {
    // try to parse array or object or fallback to regular variables
  }
}

Now, in my head, I would parse a nested variable like this:

parseVarReference() {
  variable = parseVarReference();
  if (current() == '[') {
    // try to parse array or object or fallback to regular variables
  }
}

Obviously this is causing an infinite loop and I cannot think in a way of solving this without breaking my parser.

Jay Kominek
  • 8,674
  • 1
  • 34
  • 51
  • 1
    I don't see any hint of a grammar here, just some kind of parsing code. Your life will be tremendously easier if you write a formal grammar for your language first. Usually with such a grammar, you can code a recursive descent parser quite easily (see my comment on Kominek's answer), or use a parser generator. – Ira Baxter May 02 '15 at 02:18
  • You're not parsing variables here, you're parsing the primary of an expression. – user207421 May 02 '15 at 02:52

1 Answers1

2

You're writing a recursive descent parser. They can't directly handle left recursion.

You can produce equivalent grammars that aren't left recursive, but it might be more work than you wanted, in the simple recursive descent style that you're writing.

You might consider generating your parser with a tool like yacc. Depending on the tool, they can handle these things easily, or at least (by separating grammar from actions) make it easier to perform transformations.

Jay Kominek
  • 8,674
  • 1
  • 34
  • 51
  • 3
    They can handle left recursion, after you convert it to iteration. See this answer on how to write a recursive descent parser: http://stackoverflow.com/questions/2245962/is-there-an-alternative-for-flex-bison-that-is-usable-on-8-bit-embedded-systems/2336769#2336769 – Ira Baxter May 02 '15 at 02:16
  • There is also a way to handle left recursion in recursive descent parsing with memoisation: http://www.vpri.org/pdf/tr2007002_packrat.pdf – SK-logic May 05 '15 at 10:37