I'm writing a parser in which some tokens are concatenated from multiple smaller rules using yymore().
If it reaches EOF before the end of this composite token, I need it to return a special error-token to the parser. This is the same problem as in this question.
The answer there suggests converting the parser to a "push parser" to solve this.
The Bison manual makes it pretty clear how to write the push-parser part, but I cannot find similar instructions on how the lexer should look.
Let's take the following lexer:
%option noyywrap
%{
#include <stdio.h>   /* printf */
#include <stdlib.h>  /* free */
#include <string.h>  /* strdup */
// Stub of the parser header file:
#define GOOD_STRING 1000
#define BAD_STRING 1001
char *yylval;
%}
%x STRING
%%
\" { BEGIN(STRING); yymore(); }
<STRING>{
    \"      { BEGIN(INITIAL); yylval = strdup(yytext); return GOOD_STRING; }
    .|\n    { yymore(); }
    <<EOF>> { BEGIN(INITIAL); yylval = strdup(yytext); return BAD_STRING; }
}
.|\n { return yytext[0]; }
%%
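void parser_stub(void)
{
    int token;
    /* Pull model: keep asking the scanner for tokens until it returns 0. */
    while ((token = yylex()) > 0) {
        if (token < 1000) {
            /* Single-character token. */
            printf("%i '%c'\n", token, (char)token);
        } else {
            /* String token; yylval owns a strdup'ed copy of the text. */
            printf("%i \"%s\"\n", token, yylval);
            free(yylval);
        }
    }
}

int main(void)
{
    parser_stub();
}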
It doesn't work as a pull parser because yylex() is called again after the scanner has already encountered EOF, which ends in an error: fatal flex scanner internal error--end of buffer missed.
(It works if yymore() is not used, but that is still technically undefined behavior.)
The <<EOF>> rule needs to emit two tokens: BAD_STRING and 0.
How do you convert a lexer into one suitable for a push-parser?
I'm guessing it involves replacing the returns with something that pushes a token to the parser without ending yylex(), but I haven't found any mention of such a function or macro.
Is this just a case of having to implement it manually, without any support built into Flex?
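To make the control flow I have in mind concrete, here is a minimal sketch in plain C rather than Flex. Everything in it is an assumption for illustration: push_token() is a hypothetical stand-in for the push parser's entry point (it only records and prints tokens), and scan_and_push() is a hand-written loop playing the role of yylex(). The point is only that the scanner owns the loop, so hitting EOF inside a string can push BAD_STRING and then the end-of-input token 0 without ever re-entering the scanner.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { END_OF_INPUT = 0, GOOD_STRING = 1000, BAD_STRING = 1001 };

/* Hypothetical stand-in for the parser's push entry point:
 * it records each token and prints it like parser_stub() does. */
#define MAX_TOKENS 64
static int pushed[MAX_TOKENS];
static size_t pushed_count = 0;

static void push_token(int token, const char *text, size_t len)
{
    assert(pushed_count < MAX_TOKENS);
    pushed[pushed_count++] = token;
    if (token >= 1000)
        printf("%d \"%.*s\"\n", token, (int)len, text);
    else if (token > 0)
        printf("%d '%c'\n", token, (char)token);
    else
        printf("%d <end of input>\n", token);
}

/* Hand-written scanner that drives the parser instead of being called by it. */
static void scan_and_push(const char *input)
{
    size_t i = 0, n = strlen(input);

    while (i < n) {
        if (input[i] != '"') {
            /* Any other character is its own token. */
            push_token((unsigned char)input[i], NULL, 0);
            i++;
            continue;
        }
        /* Opening quote: accumulate until the closing quote or end of input. */
        size_t start = i++;
        while (i < n && input[i] != '"')
            i++;
        if (i < n) {
            i++; /* consume the closing quote */
            push_token(GOOD_STRING, input + start, i - start);
        } else {
            /* End of input inside the string: report the partial text. */
            push_token(BAD_STRING, input + start, i - start);
        }
    }
    /* The scanner is still in control here, so the second token is easy. */
    push_token(END_OF_INPUT, NULL, 0);
}
```

Calling scan_and_push("\"abc") pushes BAD_STRING followed by 0, which is the pair of tokens the <<EOF>> rule above cannot produce with a single return.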