I wrote the following scanner in flex. I need to display this header:

{printf("\n%-20s%-30s%-10s\n", "Lexeme", "Unite lexicale", "Indice");}

as the first thing after the user's input, before the scanner prints anything. I tried to find a solution, but nothing seems to work.

%{
int i=1;
%}
lettre [a-zA-Z]+
nombre_entier (\+|\-)?[0-9]+
nombre_reel (\+|\-)?[0-9]+\.[0-9]+((e|E)(\-|\+)?[0-9]+)?
id {lettre}({lettre}|[0-9])*
%%
\$              { exit(0);}
[ \t]+          {/*ignorer*/}
\n              {i=1;}
ENTIER|REEL     {printf("%-20s%-30s%-10d\n",yytext, "Mot_cle", i++);
                 printf("-----------------------------------------------------\n");}
{id}            {printf("%-20s%-30s%-10d\n",yytext, "ID", i++);
                 printf("------------------------------------------------------\n");}
{nombre_entier} {printf("%-20s%-30s%-10d\n",yytext, "nombre entier", i++);
                 printf("------------------------------------------------------\n");}
{nombre_reel}   {printf("%-20s%-30s%-10d\n",yytext, "nombre reel", i++);
                 printf("------------------------------------------------------\n");}
\(              {printf("%-20s%-30s%-10d\n",yytext, "parenthese ouvrante", i++);
                 printf("------------------------------------------------------\n");}
")"             {printf("%-20s%-30s%-10d\n",yytext, "parenthese fermante", i++);
                 printf("------------------------------------------------------\n");}
"+"|"-"|"*"|"/" {printf("%-20s%-30s%-10d\n",yytext, "operateur arithmetique", i++);
                 printf("------------------------------------------------------\n");}
"="             {printf("%-20s%-30s%-10d\n",yytext, "operateur d'affectation", i++);
                 printf("------------------------------------------------------\n");}
","             {printf("%-20s%-30s%-10d\n",yytext, "Virgule", i++);
                 printf("------------------------------------------------------\n");}
";"             {printf("%-20s%-30s%-10d\n",yytext, "Point virgule", i++);
                 printf("------------------------------------------------------\n");}
.               {printf("%-20s%-30s%-10d\n",yytext, "caractere inconnu", i++);
                 printf("------------------------------------------------------\n");}
%%
int main(){
    printf("Entrez le texte a analyser : \n");
    yylex();
    return 0;
}
int yywrap(){
    return 1;
}

Please help.

user259584
  • Simply insert your print statement immediately after the call to `yylex();`. – DYZ Jan 30 '17 at 22:53
  • Thanks for the reply, but what I want is to display the message before `yylex()` itself displays anything – user259584 Jan 30 '17 at 22:55
  • I think you need to redefine `YY_INPUT`, then. Look at the standard definition of the macro here: https://ftp.gnu.org/old-gnu/Manuals/flex-2.5.4/html_node/flex_10.html. Add a static variable that is false before the first use of `YY_INPUT` and true thereafter, and print your message just before setting the variable to true for the first time. – DYZ Jan 30 '17 at 22:59
  • Sorry @DYZ, but I don't see how I can redefine `YY_INPUT` without messing up the entire code – user259584 Jan 30 '17 at 23:17
  • Why not? (See example http://stackoverflow.com/questions/1920604/how-to-make-yy-input-point-to-a-string-rather-than-stdin-in-lex-yacc-solaris) – DYZ Jan 30 '17 at 23:18

2 Answers

A cleaner solution is to use the flex scanner in the intended manner, which is to successively return lexical tokens to its caller, one token per call.

That means that you need some way for the scanner to identify what kind of token it encountered. Usually, you will use an enumeration (or, more traditionally, a collection of #defines), which are placed in a header file which can be included by both the scanner and its callers. If you use a parser generator such as yacc or bison, this header will be generated for you automatically.

You also need some way for the scanner to return the "semantic value" of the token. In this simple case, that is not necessary, since you do nothing with the token's value other than print it out immediately. That makes it possible to use the yytext global variable (if your scanner uses global variables), but using yytext outside of a flex action is a bug waiting to happen: yytext and the buffer it points to are part of the scanner's internal state, and their contents can and will change without warning. You can get away with it here because nothing can change until the next call to yylex.
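If a caller does need to keep a lexeme past the next call to yylex, the usual remedy is to copy it first. Here is a minimal sketch in plain C; `copier_lexeme` is a hypothetical helper name, not something flex provides:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of flex): returns a heap-allocated,
 * NUL-terminated copy of the first `longueur` bytes of `texte`.
 * The caller owns the result and must free() it.
 * Returns NULL if the allocation fails. */
char* copier_lexeme(const char* texte, size_t longueur) {
  char* copie = malloc(longueur + 1);
  if (copie == NULL)
    return NULL;
  memcpy(copie, texte, longueur);
  copie[longueur] = '\0';
  return copie;
}
```

Inside a flex action this would presumably be called as `copier_lexeme(yytext, (size_t)yyleng)`, so that the copy stays valid after the scanner reuses its buffer.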

In practice, it might look something like this:

File: tokens.h

enum Token {
  T_FIN = 0,
  T_MOTCLE,
  T_ID,
  T_ENTIER,
  T_REEL,
  T_OUVRANTE,
  T_FERMANTE,
  T_OPERATEUR,
  T_AFFECT,
  T_VIRGULE,
  T_POINT_VIRGULE,
  T_INCONNU
};

const char* decrire(int jeton);

Now, we also need some way to associate these enum values with a human-readable description. The simple way is to just make a table of strings in the same order as the values. In production code, you might want to choose something more maintainable. Remember that token code 0 is conventionally used to indicate the end of input, so you need to leave room for it.

File: tokens.c

#include <stdio.h>
#include "tokens.h"

/* Must stay in the same order as enum Token in tokens.h */
static const char* descriptions[] = {
  "Fin d'entree",
  "Mot_cle",
  "ID",
  "Nombre entier",
  "Nombre reel",
  "Parenthese ouvrante",
  "Parenthese fermante",
  "Operateur arithmetique",
  "Operateur d'affectation",
  "Virgule",
  "Point virgule",
  "Caractere inconnu"
};

const char* decrire(int jeton) {
  if (jeton >= 0 && jeton <= T_INCONNU)
    return descriptions[jeton];
  else
    return "???";  /* This indicates a bug somewhere */
}
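
One possible "more maintainable" variant is the X-macro idiom, where a single list generates both the enum and the description table so the two cannot drift apart. This is only a sketch of that idea, not part of the files above: the names `LISTE_JETONS`, `X_ENUM`, `X_TEXTE` and `T_NOMBRE_JETONS` are made up for illustration, and it relies on C99 designated initializers.

```c
/* Sketch: one list drives both the enum and the description table.
 * LISTE_JETONS, X_ENUM, X_TEXTE and T_NOMBRE_JETONS are illustrative names. */
#define LISTE_JETONS(X)                          \
  X(T_FIN,           "Fin d'entree")             \
  X(T_MOTCLE,        "Mot_cle")                  \
  X(T_ID,            "ID")                       \
  X(T_ENTIER,        "Nombre entier")            \
  X(T_REEL,          "Nombre reel")              \
  X(T_OUVRANTE,      "Parenthese ouvrante")      \
  X(T_FERMANTE,      "Parenthese fermante")      \
  X(T_OPERATEUR,     "Operateur arithmetique")   \
  X(T_AFFECT,        "Operateur d'affectation")  \
  X(T_VIRGULE,       "Virgule")                  \
  X(T_POINT_VIRGULE, "Point virgule")            \
  X(T_INCONNU,       "Caractere inconnu")

/* First expansion: the enumerators. T_FIN comes first, so it is 0. */
#define X_ENUM(nom, texte) nom,
enum Token { LISTE_JETONS(X_ENUM) T_NOMBRE_JETONS };
#undef X_ENUM

/* Second expansion: the table, indexed by enumerator. */
#define X_TEXTE(nom, texte) [nom] = texte,
static const char* const descriptions[] = { LISTE_JETONS(X_TEXTE) };
#undef X_TEXTE

const char* decrire(int jeton) {
  if (jeton >= 0 && jeton < T_NOMBRE_JETONS)
    return descriptions[jeton];
  return "???";  /* This indicates a bug somewhere */
}
```

Adding a token then means adding one line to the list; the bounds check in `decrire` follows automatically because `T_NOMBRE_JETONS` is always the last enumerator.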

Now we can write a very simple application for this lexer, which prints out the token stream:

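#include <stdio.h>
#include "tokens.h"

/* Provided by the flex-generated scanner (tokens.l) */
extern char* yytext;
extern int token_count;
int yylex(void);

int main(void) {
  puts("Entrez le texte a analyser : ");
  int jeton = yylex();  /* returns once the first token has been read */
  printf("\n%-20s%-30s%-10s\n", "Lexeme", "Unite lexicale", "Indice");
  puts("-----------------------------------------------------");
  for (; jeton; jeton = yylex()) {
    printf("%-20s%-30s%-10d\n", yytext, decrire(jeton), token_count++);
    puts("------------------------------------------------------");
  }
  return 0;
}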

Finally, the somewhat cleaned-up lexer:

File: tokens.l

%{
  #include "tokens.h"
  int token_count = 1;  /* index of the next token on the current line */
%}

%option noinput nounput noyywrap nodefault

%%
"$"                     { return T_FIN; }
[ \t]+                  { /*ignorer*/ }
\n                      { token_count = 1; }
ENTIER|REEL             { return T_MOTCLE; }
[[:alpha:]][[:alnum:]]* { return T_ID; }
[+-]?[[:digit:]]+       { return T_ENTIER; }
[+-]?[[:digit:]]+\.[[:digit:]]+([eE][+-]?[[:digit:]]+)? {
                          return T_REEL; }
"("                     { return T_OUVRANTE; }
")"                     { return T_FERMANTE; }
[-+*/]                  { return T_OPERATEUR; }
"="                     { return T_AFFECT; }
","                     { return T_VIRGULE; }
";"                     { return T_POINT_VIRGULE; }
.                       { return T_INCONNU; }

The handling of token counting (and the absence of line counting) is less than ideal; the ideal solution would be to use the standard yylloc global (or argument to a reentrant scanner) to hold location information, adding token count to that information.
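
To make that last idea a little more concrete, here is a rough sketch in plain C of a location record that carries a token index alongside the line number. The type `Position` and the three functions are hypothetical names chosen for illustration; with bison, a user-defined location type could carry the same fields, but nothing here is generated by flex or bison.

```c
/* Hypothetical location record: line number plus the index of the most
 * recently scanned token on that line (1-based; 0 means "none yet"). */
typedef struct Position {
  int ligne;
  int indice;
} Position;

/* Start at line 1 with no token seen yet. */
void position_init(Position* p) {
  p->ligne = 1;
  p->indice = 0;
}

/* To be called once for every token the scanner returns. */
void position_jeton(Position* p) {
  p->indice++;
}

/* To be called from the newline rule: next line, token index restarts. */
void position_nouvelle_ligne(Position* p) {
  p->ligne++;
  p->indice = 0;
}
```

The scanner would call `position_jeton` just before each `return` and `position_nouvelle_ligne` in the `\n` rule, so the caller reads the index from the record instead of incrementing a shared counter itself.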

rici

Thank you @DYZ, here is what I ended up doing:

%{
int i=1,j=0;
#define YY_INPUT(buf,result,max_size) \
    { \
    int c = getchar(); \
    result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \
    if(j++ == 0) \
        { \
        printf("\n%-20s%-30s%-10s\n", "Lexeme", "Unite lexicale", "Indice"); \
        printf("-----------------------------------------------------\n"); \
        } \
    }
%}
user259584