10

I want to parse a string that I pass to the parser from the main function of my yacc program. I know this can be done with yy_scan_string, but I don't know how to use it. I searched the web and the man pages, but it is still not clear to me. Please help.

Ross Rogers
ajai
  • Closely related to: http://stackoverflow.com/q/1920604/15168 and http://stackoverflow.com/q/1909166/15168 (though not quite a duplicate of either). – Jonathan Leffler Sep 23 '12 at 17:25

6 Answers

20

In case anyone needs the sample for a re-entrant lexer:

int main(void)
{
    yyscan_t scanner;
    YY_BUFFER_STATE buf;
    yylex_init(&scanner);
    buf = yy_scan_string("replace me with the string you'd like to scan", scanner);
    yylex(scanner);
    yy_delete_buffer(buf, scanner);
    yylex_destroy(scanner);
    return 0;
}
Eric
  • 2
    In case anyone else is getting symbol is not defined or other such errors when trying this: remember to include `%option reentrant` in the lexer file. – chacham15 Apr 05 '14 at 00:19
9

This works for me. I have this code in the subroutines section (i.e. the third section) of my Bison file:

struct eq_tree_node *parse_equation(char *str_input)
{
    struct eq_tree_node *result;

    yy_scan_string(str_input);
    yyparse();
    /* free the scanner's buffers to avoid a memory leak */
    yylex_destroy();

    /* symtab_get_parse_result() is my own function that fetches the
    result of the parse; substitute whatever your parser provides. */
    result = symtab_get_parse_result();

    return result;
}
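One thing to watch for: yy_scan_string() and yylex_destroy() are defined in the scanner that flex generates, not in the parser, so the Bison file needs declarations for them before this function compiles cleanly with a modern compiler. A minimal sketch of what could go in the Bison prologue (between %{ and %}), assuming the default non-reentrant scanner and the default yy prefix:

```c
/* Assumed declarations for a default (non-reentrant, "yy"-prefixed)
 * flex scanner. The functions themselves live in the generated
 * scanner source, so nothing is defined here. */
typedef struct yy_buffer_state *YY_BUFFER_STATE;

extern YY_BUFFER_STATE yy_scan_string(const char *str);
extern void yy_delete_buffer(YY_BUFFER_STATE buffer);
extern int yylex(void);
extern int yylex_destroy(void);
```

If you would rather not hand-write these, flex can emit a header containing the same declarations through its --header-file option, and the Bison file can #include that instead.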
markonovak
4

This worked for me ... use yy_scan_string()

int main(int argc, char **argv)
{
    char input[40] = "This is my input string";

    /* Copy the string into a new buffer and switch to it */
    YY_BUFFER_STATE buf = yy_scan_string(input);

    /* Analyze the string */
    yylex();

    /* Delete the new buffer */
    yy_delete_buffer(buf);

    return 0;
}
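Names such as YY_BUFFER_STATE and yy_scan_string() are declared in the scanner source that flex generates, so the simplest home for a main() like this one is the user-code section of the .l file itself. Here is a minimal, complete sketch of such a file. The token rules and the input string are made up purely for illustration, and it assumes a default non-reentrant scanner:

```lex
%option noyywrap nounput noinput

%{
#include <stdio.h>
%}

%%

[0-9]+      { printf("NUMBER(%s)\n", yytext); }
[a-zA-Z_]+  { printf("WORD(%s)\n", yytext); }
[ \t\n]+    { /* skip whitespace */ }
.           { printf("OTHER(%s)\n", yytext); }

%%

int main(void)
{
    /* yy_scan_string() copies the string into a new buffer and makes
       that buffer the current input, so a literal is fine here. */
    YY_BUFFER_STATE buf = yy_scan_string("answer 42 !");

    /* None of the rules above return, so a single call to yylex()
       runs until the end of the string is reached. */
    yylex();

    /* Release the buffer, then the scanner's remaining state. */
    yy_delete_buffer(buf);
    yylex_destroy();
    return 0;
}
```

Assuming the file is saved as scan.l (a hypothetical name), something like `flex scan.l && cc lex.yy.c -o scan` should build it.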
Rama
3

I always recommend this page to people who want to learn lex/yacc (or flex/bison)

Wernsey
  • 1
    Not anymore when I checked it just now. – Wernsey Jun 04 '12 at 10:56
  • The provided example does not use yy_scan_string. Useful in general, but not for this question. – bra_racing Dec 31 '16 at 14:28
  • 1
    This document does not contain any reference to yy_scan_string. This answer is more harmful than helpful for someone who is looking up info on this function. – jlanik Oct 07 '19 at 16:27
0

There are a few good answers here already, but for my purposes I needed to repeatedly switch between string buffers to be analysed. The problem is that flex needs to clean up after each run and reset its internal scanning state (buffer position, counters, and so on). At the time of writing, none of the existing answers demonstrates this.

Essentially this amounts to keeping a YY_BUFFER_STATE yy_buffer_state; around somewhere, and calling yy_delete_buffer( yy_buffer_state ) when it's time to switch between strings. When flex is assigned a new string to scan ( with yy_scan_string() ), a new YY_BUFFER_STATE is generated, which you need to track.

I've tried to show a reasonably complete example, but the key part is setLexerBuffer() near the bottom.

For example:

%{
#include "flex_tokens_and_yylval.h"

extern LexYYLVal yylval;                    // my custom yylval
extern YY_BUFFER_STATE yy_buffer_state;
%}

digit             [0-9]
letter            [a-zA-Z]
underscore        "_"
sign              [+-]
period            "."

real              {sign}?({digit}*{period}{digit}+)
int               {sign}?{digit}+
identifier        ({letter}|{underscore})+({letter}|{digit}|{underscore})*
/* [...]  rest of the scanner definitions */

%%

<<EOF>>             { return LEX_EOF; }
{real}              {
                        yylval.data.val_real = strtod( yytext, NULL ); 
                        return LEX_REAL;
                    }

{int}               {
                        yylval.data.val_integer = strtol( yytext, NULL, 10 );
                        return LEX_INTEGER;
                    }
{identifier}        {
                        strncpy( yylval.data.val_string, yytext, MAX_IDENTIFIER_LENGTH );
                        yylval.data.val_string[MAX_IDENTIFIER_LENGTH-1]='\0';
                        return LEX_IDENTIFIER;
                    }
[ \t\n\r]           { /* skip whitespace */ }
    /* [...]  rest of the scanner rules */

%%

// NOT THREAD SAFE, DON'T USE FROM MULTIPLE THREADS
LexYYLVal yylval;
int yy_first_ever_run = 1;
char LexEmptyBuffer[3] = { '\n', '\0', '\0' };
YY_BUFFER_STATE yy_buffer_state;

/*
 * Point flex at a new string to process, resetting
 * any old results from a previous parse.
 */
void setLexerBuffer( const char *expression_string )
{
    /* out with the old (If any? How does flex know?) */
    if ( !yy_first_ever_run )
    {
        // This doesn't cause any issues (according to valgrind)
        // but I also don't see any reason to call it before the
        // first lex-run.
        yy_delete_buffer( yy_buffer_state );
    }
    else
    {
        yy_first_ever_run = 0;
    }

    /* just make sure we're pointing at something */
    if ( expression_string == NULL )
    {
        expression_string = LexEmptyBuffer;
    }

    /* reset the scan */    
    yy_buffer_state = yy_scan_string( expression_string );  /* auto-resets lexer state */
}

So this lets you run a control-loop like:

int main( void )
{
    LexResultToken token;

    setLexerBuffer( "12.3 * 0.96" );
    do
    {
        token = yylex();
        printToken( token );
    }
    while( token != LEX_EOF );

    setLexerBuffer( "( A + B ) < ( C * D )" );
    do
    {
        token = yylex();
        printToken( token );
    }
    while( token != LEX_EOF );

    yylex_destroy();

    return 0;
}

This example was run through valgrind to verify memory-correctness.

Kingsley
-1

I found an example here myself. Maybe it will be useful to you:

http://osdir.com/ml/lex.flex.windows/2003-04/msg00008.html

Abud