
I've been stuck on this for a while now. I want to parse something as simple as:

    LIKES: word1 word2 .. wordN HATES: word1 word2 .. wordN

I am using Lemon + Flex. At the moment my grammar looks something like this:

    %left LIKES MOODS FROM HATES INFO.

    %syntax_error {
      std::cout << "Syntax error!" << std::endl;
    }

    final ::= likes_stmt.
    final ::= hates_stmt.

    likes_stmt ::= LIKES list(A). { Data *data=Data::getInstance();data->likes.push_back(A);}
    hates_stmt ::= HATES list(A). { Data *data=Data::getInstance();data->hates.push_back(A);}

    list ::= likes_stmt VALUE(A).   { Data *data=Data::getInstance();data->likes.push_back(A);}
    list ::= hates_stmt VALUE(A).   { Data *data=Data::getInstance();data->hates.push_back(A); }

    list(A) ::= VALUE(B).           {A=B;}

But this only works for the first two words. Clearly I am doing something wrong, probably in the recursive definition. Any pointers are appreciated :)

Artem Zankovich
crozzfire

2 Answers


@crozzfire, Ira provided the correct answer to your original question; consider voting for it.

Let me address your additional requirement: separating the parsed values into two lists. Don't create different rules for parsing these lists, since the grammar of a list is the same in both cases. What you need is a flag indicating whether LIKES or HATES appeared in front of the list. The fourth parameter of Lemon's Parse function, declared with the %extra_argument directive, suits this need well. See "The Parser Interface" section of the Lemon documentation.

Below is Ira's grammar, updated to set and check such a flag variable. Note that the empty rules set_likes_state and set_hites_state are placed just before the LIKES and HATES tokens, so that their actions execute before the list that follows is parsed.

    %extra_argument {unsigned* state}

    final ::= likes_stmt.
    final ::= hates_stmt.

    likes_stmt ::= set_likes_state LIKES list.
    hates_stmt ::= set_hites_state HATES list.

    list ::= list VALUE(A).   { if (*state == 0) {/*add A to list1*/} else {/*add A to list2*/}; }
    list ::= VALUE(A).        { if (*state == 0) {/*add A to list1*/} else {/*add A to list2*/}; }

    set_likes_state ::= .     { *state = 0; }
    set_hites_state ::= .     { *state = 1; }
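
To make the effect of the flag concrete, here is a minimal, self-contained C++ sketch that mimics what the actions above do, without involving Lemon at all. The names Tok, Token, Lists and dispatch are hypothetical and exist only for this illustration; a real build would use the token codes Lemon generates and pass values through Parse.

```cpp
#include <string>
#include <vector>

// Hypothetical token kinds; a real build would use the codes Lemon generates.
enum class Tok { Likes, Hates, Value };

struct Token {
    Tok kind;
    std::string text;  // only meaningful for Tok::Value
};

struct Lists {
    std::vector<std::string> likes;
    std::vector<std::string> hates;
};

// Mimics the grammar actions: a keyword sets the state flag, and each VALUE
// is appended to whichever list the flag currently selects.
Lists dispatch(const std::vector<Token>& tokens) {
    Lists out;
    unsigned state = 0;  // 0 = LIKES, 1 = HATES, same convention as the grammar
    for (const Token& t : tokens) {
        switch (t.kind) {
            case Tok::Likes: state = 0; break;
            case Tok::Hates: state = 1; break;
            case Tok::Value:
                (state == 0 ? out.likes : out.hates).push_back(t.text);
                break;
        }
    }
    return out;
}
```

Unlike the grammar, this loop performs no syntax checking and accepts any sequence of keywords and values. It only shows the dispatch decision that the list actions make through the state pointer.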
Artem Zankovich
  • Usually the way this is done is to parse first, and then postprocess the tree to collect information into various categories. That way you don't mangle the grammar with artificial productions ("set_likes", etc.) whose only job is to signal to "while-parsing" actions. In more complex languages, these signals mostly just create grief because they clutter the grammar and tangle parsing with other work. However, if this is *all* OP needs to do, then this answer is fine. (Thanks for the upvote!) – Ira Baxter Aug 10 '12 at 09:17
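
The "parse first, postprocess later" idea from the comment above might look roughly like this. It is only a sketch: Statement and categorize are hypothetical names, and it assumes the parser's actions do nothing but record one node per statement.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical parse result: one node per "KEYWORD: word1 .. wordN" statement.
struct Statement {
    std::string keyword;              // e.g. "LIKES", "HATES", "INFO"
    std::vector<std::string> values;  // words in source order
};

// Second pass: merge statements by keyword. Nothing here depends on the
// grammar, so a new keyword needs no change to this function.
std::map<std::string, std::vector<std::string>>
categorize(const std::vector<Statement>& tree) {
    std::map<std::string, std::vector<std::string>> byKeyword;
    for (const Statement& s : tree) {
        std::vector<std::string>& bucket = byKeyword[s.keyword];
        bucket.insert(bucket.end(), s.values.begin(), s.values.end());
    }
    return byKeyword;
}
```

With this split, the grammar needs no signalling productions, and later keywords such as INFO or MOODS only add entries to the map.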

It looks to me as though your likes_stmt is defined in terms of list, and list is defined in terms of likes_stmt. I'm surprised it works for any words at all. It could be that I don't understand Lemon syntax (I sure don't get the list(A) bit), but grammars written in BNF tend to be pretty similar.

I'd expect your grammar to look more like:

    final = likes_stmt ;

    likes_stmt = LIKES list ;
    likes_stmt = HATES list ;

    list = value ;
    list = list value ;

Of course, this would only recognize one LIKES phrase or one HATES phrase, but not both at the same time or in sequence, as the sample input in your question implies.

Ira Baxter
  • Thanks for your answer, but I'm afraid I had already tried that. Basically, what I am trying to do is push the strings to their respective places (LIKES or HATES). In the future, I will have more reserved tokens such as INFO, MOODS, etc. It's very similar to Google's advanced search syntax. – crozzfire Jul 21 '11 at 03:15
  • Parser generators are pretty easy to use, and pretty robust in parsing. I suggest you get rid of all the extra semantic action stuff in your grammar, i.e., reduce it to pretty much just what I have written, and try it again. If that works, start adding your semantic actions back. – Ira Baxter Jul 21 '11 at 03:45