I want to parse a string that I pass to the parser from the main function of my yacc program. I know this can be done with yy_scan_string,
but I don't know how to use it. I searched the web and the man pages, but it is still not clear to me. Please help.

Closely related to: http://stackoverflow.com/q/1920604/15168 and http://stackoverflow.com/q/1909166/15168 (though not quite a duplicate of either). – Jonathan Leffler Sep 23 '12 at 17:25
6 Answers
In case anyone needs the sample for a re-entrant lexer:
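int main(void)
{
    yyscan_t scanner;
    YY_BUFFER_STATE buf;

    /* Allocate and initialise the re-entrant scanner state. */
    yylex_init(&scanner);
    /* Copy the string into a new buffer and make it the current input. */
    buf = yy_scan_string("replace me with the string you'd like to scan", scanner);
    yylex(scanner);
    yy_delete_buffer(buf, scanner);
    yylex_destroy(scanner);
    return 0;
}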

In case anyone else is getting "symbol is not defined" or other such errors when trying this: remember to include `%option reentrant` in the lexer file. – chacham15 Apr 05 '14 at 00:19
This works for me. I have this code in the subroutines section (i.e. the third section) of my Bison file:
struct eq_tree_node *parse_equation(char *str_input)
{
    struct eq_tree_node *result;

    yy_scan_string(str_input);
    yyparse();
    /* to avoid leakage */
    yylex_destroy();
    /* disregard this. it is the function that I defined to get
       the result of the parsing. */
    result = symtab_get_parse_result();
    return result;
}

How do you declare yy_scan_string in the first section of Bison? Also, do I need to add anything in flex? – Ruturaj Mar 20 '15 at 15:27
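On the declaration question in the comment above: here is a minimal sketch of what the Bison prologue could contain, assuming the default non-reentrant scanner and that the parser and scanner are compiled as separate files. The prototypes should match what flex generates, as far as I know, but check them against your own lex.yy.c. Nothing extra should be needed in the flex file, since yy_scan_string is generated by default.

```c
/* Assumed to sit in the Bison prologue, between %{ and %}. */

/* Opaque handle type; the struct itself is defined inside the
   flex-generated scanner. */
typedef struct yy_buffer_state *YY_BUFFER_STATE;

/* Functions provided by the flex-generated scanner
   (non-reentrant interface). */
extern int yylex(void);
extern YY_BUFFER_STATE yy_scan_string(const char *str);
extern void yy_delete_buffer(YY_BUFFER_STATE buffer);
extern int yylex_destroy(void);
```

Alternatively, flex can write these declarations to a header for you through its --header-file option, and the Bison file can #include that header instead.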
This worked for me: use yy_scan_string().
int main(int argc, char **argv)
{
    char input[40] = "This is my input string";

    /* Copy the string into a new buffer and switch to it */
    yy_scan_string(input);
    /* Analyze the string */
    yylex();
    /* Delete the new buffer */
    yy_delete_buffer(YY_CURRENT_BUFFER);
    return 0;
}
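As the comment in the code says, yy_scan_string() copies the input. If the copy matters, flex also provides yy_scan_buffer(), which scans a caller-owned buffer in place but requires its last two bytes to be NUL. Below is a small sketch of preparing such a buffer; make_scan_buffer is a hypothetical helper name of my own, not part of flex.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of flex): copy `text` into a freshly
 * allocated buffer that ends in two NUL bytes, the layout that
 * yy_scan_buffer() requires. Stores the total size, including both
 * NULs, in *size_out. Returns NULL if allocation fails.
 */
char *make_scan_buffer(const char *text, size_t *size_out)
{
    assert(text != NULL && size_out != NULL);

    size_t len = strlen(text);
    char *buf = malloc(len + 2);
    if (buf == NULL)
        return NULL;

    memcpy(buf, text, len);
    buf[len] = '\0';      /* first end-of-buffer byte */
    buf[len + 1] = '\0';  /* second end-of-buffer byte */
    *size_out = len + 2;
    return buf;
}

/*
 * Intended use inside a flex-generated scanner (not compiled here):
 *
 *     size_t size;
 *     char *buf = make_scan_buffer("1 + 2", &size);
 *     YY_BUFFER_STATE state = yy_scan_buffer(buf, size);
 *     yylex();
 *     yy_delete_buffer(state);
 *     free(buf);    (the buffer stays owned by the caller)
 */
```

Note that the scanner may modify the buffer while scanning, so this only suits memory you are free to overwrite.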

I always recommend this page to people who want to learn lex/yacc (or flex/bison)

The provided example does not use scan_string. Useful for general purposes, but not for the question. – bra_racing Dec 31 '16 at 14:28
This document does not contain any reference to yy_scan_string. This answer is more harmful than helpful for someone who is looking up info on this function. – jlanik Oct 07 '19 at 16:27
There are a few good answers here already, but for my purposes I needed to repeatedly swap between the string buffers to be analysed. The problem is that flex needs to clean up after each processing run and reset its internal parse state, counters, and so on. At the time of writing, none of the existing answers demonstrate this.
Essentially this amounts to keeping a YY_BUFFER_STATE yy_buffer_state; around somewhere, and calling yy_delete_buffer( yy_buffer_state ) when it's time to switch between strings. When flex is given a new string to scan (with yy_scan_string()), a new YY_BUFFER_STATE is generated, which you need to track.
I've tried to show a reasonably complete example, but the key part is setLexerBuffer(), near the bottom.
For example:
%{
#include <stdlib.h>   /* strtod, strtol */
#include <string.h>   /* strncpy */
#include "flex_tokens_and_yylval.h"
extern LexYYLVal yylval; // my custom yylval
extern YY_BUFFER_STATE yy_buffer_state;
%}
digit [0-9]
letter [a-zA-Z]
underscore "_"
sign [+-]
period "."
real {sign}?({digit}*{period}{digit}+)
int {sign}?{digit}+
identifier ({letter}|{underscore})+({letter}|{digit}|{underscore})*
/* [...] rest of the scanner rules */
%%
<<EOF>> { return LEX_EOF; }
{real} {
    yylval.data.val_real = strtod( yytext, NULL );
    return LEX_REAL;
}
{int} {
    yylval.data.val_integer = strtol( yytext, NULL, 10 );
    return LEX_INTEGER;
}
{identifier} {
    strncpy( yylval.data.val_string, yytext, MAX_IDENTIFIER_LENGTH );
    yylval.data.val_string[MAX_IDENTIFIER_LENGTH-1]='\0';
    return LEX_IDENTIFIER;
}
[ \t\n\r] { /* skip whitespace */ }
    /* [...] rest of the scanner outputs */
%%
// NOT THREAD SAFE, DON'T USE FROM MULTIPLE THREADS
LexYYLVal yylval;
int yy_first_ever_run = 1;
char LexEmptyBuffer[3] = { '\n', '\0', '\0' };
YY_BUFFER_STATE yy_buffer_state;
/*
 * Point flex at a new string to process, resetting
 * any old results from a previous parse.
 */
void setLexerBuffer( const char *expression_string )
{
    /* out with the old (If any? How does flex know?) */
    if ( !yy_first_ever_run )
    {
        // This doesn't cause any issues (according to valgrind)
        // but I also don't see any reason to call it before the
        // first lex-run.
        yy_delete_buffer( yy_buffer_state );
    }
    else
    {
        yy_first_ever_run = 0;
    }
    /* just make sure we're pointing at something */
    if ( expression_string == NULL )
    {
        expression_string = LexEmptyBuffer;
    }
    /* reset the scan */
    yy_buffer_state = yy_scan_string( expression_string ); /* auto-resets lexer state */
}
So this lets you run a control-loop like:
int main( void )
{
    LexResultToken token;

    setLexerBuffer( "12.3 * 0.96" );
    do
    {
        token = yylex();
        printToken( token );
    }
    while( token != LEX_EOF );

    setLexerBuffer( "( A + B ) < ( C * D )" );
    do
    {
        token = yylex();
        printToken( token );
    }
    while( token != LEX_EOF );

    yylex_destroy();
    return 0;
}
This example was run through valgrind to verify memory-correctness.
