Possible Duplicate:
problem getting c-style comments in flex/lex

I am writing a lexical analyzer using flex. How can I make it skip comments that look like this:

/* COMMENTS */
flashdisk

1 Answer

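It is a bit complicated. Here is a solution I found. It uses a start condition, IN_COMMENT, so that separate rules apply while the scanner is inside a comment:

%x IN_COMMENT
%%
<INITIAL>{
"/*"      BEGIN(IN_COMMENT);
}
<IN_COMMENT>{
"*/"      { BEGIN(INITIAL); return COMMENT; }
[^*\n]+   // eat comment in chunks
"*"       // eat the lone star
\n        yylineno++;
}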
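
To make the control flow of those rules easier to follow, here is a rough hand-written C equivalent of the two states. It is only a sketch: strip_comments is a hypothetical helper, not anything flex generates, and it assumes the caller wants the input copied out with comments removed. Unlike the rules above, it counts every newline, not only those inside comments.

```c
#include <stddef.h>

/* Hypothetical hand-written version of the two start conditions.
   This is not flex output, just the same state machine in plain C. */
enum state { INITIAL, IN_COMMENT };

/* Copies src to dst with comments removed. dst must have room for
   strlen(src) + 1 bytes. Returns the number of newlines seen in the
   whole input, including those inside comments. */
int strip_comments(const char *src, char *dst)
{
    enum state st = INITIAL;
    int newlines = 0;
    size_t i = 0, o = 0;

    while (src[i] != '\0') {
        if (src[i] == '\n')
            newlines++;                  // same job as yylineno++
        if (st == INITIAL) {
            if (src[i] == '/' && src[i + 1] == '*') {
                st = IN_COMMENT;         // comment opener: BEGIN(IN_COMMENT)
                i += 2;
            } else {
                dst[o++] = src[i++];     // ordinary input is kept
            }
        } else {
            if (src[i] == '*' && src[i + 1] == '/') {
                st = INITIAL;            // comment closer: BEGIN(INITIAL)
                i += 2;
            } else {
                i++;                     // comment text, lone star, or newline
            }
        }
    }
    dst[o] = '\0';
    return newlines;
}
```

Because the closing delimiter is only special while the state is IN_COMMENT, a string such as "oops */" that follows a finished comment is left alone.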

The "obvious" solution, something like this:

"/*".*"*/" { return COMMENT; }

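will match too much, because flex always takes the longest possible match: given /*foo*/fum*/ it consumes the whole string, not just /*foo*/. It also fails on comments that span several lines, since . does not match a newline.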
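
The over-matching can be reproduced without flex. The sketch below imitates longest-match behaviour for that one pattern in plain C; greedy_comment_len is a hypothetical name, and the function only models this single rule, not flex's matching in general.

```c
#include <stddef.h>
#include <string.h>

/* Length of the longest match of the "obvious" pattern (opener, any
   run of non-newline characters, closer) at the start of s, or 0 if
   the pattern does not match there. */
size_t greedy_comment_len(const char *s)
{
    if (strncmp(s, "/*", 2) != 0)
        return 0;

    // '.' never matches a newline, so the match cannot leave this line.
    size_t line_end = strcspn(s, "\n");

    // Longest match wins: look for the LAST closer on the line.
    // The shortest possible match is 4 characters (opener + closer).
    for (size_t end = line_end; end >= 4; end--) {
        if (s[end - 2] == '*' && s[end - 1] == '/')
            return end;
    }
    return 0;
}
```

For the input /*foo*/fum*/ this returns 12, the length of the whole string, although the first comment is only 7 characters long. For a comment whose opener and closer sit on different lines it returns 0.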

Thomas Padron-McCarthy
  • But what if the comment is something like /*\n*/? The "." matches every character except the newline "\n". And what does the { return COMMENT; } do? Where does it return to? Can't I just write "/*".*"*/" ; – flashdisk Nov 10 '12 at 10:13
  • Flex-generated scanners are greedy and will match as much as possible, so "/*".*"*/" would match all of /*foo*/fum*/ and not just /*foo*/. The { return COMMENT; } part just returns the token code COMMENT from the flex-generated scanner function, yylex(), to its caller. In most cases you'll want to ignore comments, and then you should remove the return statement. – Thomas Padron-McCarthy Nov 10 '12 at 13:01
  • If the flex scanner is greedy, as you said, then the "obvious" solution above is wrong, for example on this input: /* a comment */ do_my_thing( "oops */" ); – flashdisk Nov 10 '12 at 14:06
  • Yes, as I wrote: The "obvious" solution will match too much. Or do I misunderstand you? – Thomas Padron-McCarthy Nov 10 '12 at 16:22
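
The difference between returning COMMENT and dropping the return statement, discussed in the comments above, can be sketched with a toy hand-written scanner. Everything here is hypothetical: next_token, the report_comments flag, and the token codes END, COMMENT and OTHER are made up for illustration and only stand in for what yylex() would do.

```c
#include <string.h>

/* Hypothetical token codes; a real parser would define its own. */
enum token { END = 0, COMMENT = 1, OTHER = 2 };

/* Returns the next token code and advances *p past the token.
   With report_comments == 0, comments are consumed without returning,
   which mirrors removing the return statement from the flex rule. */
int next_token(const char **p, int report_comments)
{
    const char *s = *p;

    for (;;) {
        if (*s == '\0') {
            *p = s;
            return END;
        }
        if (s[0] == '/' && s[1] == '*') {
            // Stop at the FIRST closer; an unterminated comment runs to the end.
            const char *close = strstr(s + 2, "*/");
            s = close ? close + 2 : s + strlen(s);
            if (report_comments) {
                *p = s;
                return COMMENT;   // the caller sees the comment as a token
            }
            continue;             // no return: just keep scanning
        }
        // Everything up to the next comment opener is one OTHER token.
        const char *open = strstr(s, "/*");
        *p = open ? open : s + strlen(s);
        return OTHER;
    }
}
```

On the input a/*c*/b, calling next_token repeatedly with report_comments set to 1 yields OTHER, COMMENT, OTHER, END; with it set to 0 the caller only ever sees OTHER, OTHER, END.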