
I created a small compiler and need help to fix it.

Code of my compiler:

t.l:

%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "y.tab.h"
%}
%x DOUBLE_QUOTES
%%

<INITIAL>[sS][hH][oO][wW]                   {return show;}
<INITIAL>[a-zA-Z]                           {yylval.id=yytext[0];return identifier;}
<INITIAL>[0-9]+                             {yylval.num=atoi(yytext);return number;}
<INITIAL>[\-\+\=\;\*\/]                     {return yytext[0];}
<INITIAL>["] {
    printf("(STRING_OPEN) ");
    BEGIN(DOUBLE_QUOTES);
}
<DOUBLE_QUOTES>["] {
    printf("(STRING_CLOSE) ");
    BEGIN(INITIAL);
    printf("(STRING:%S) ",yytext[1]);
}

%%
int yywrap (void) {return 1;}

t.y:

%{
void yyerror(char *s);
int yylex(void);
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>   /* islower, isupper */
int symbols[52];
int symbolVal(char symbol);
void updateSymbolVal(char symbol,int val);  
%}
%union {int num;char id;}
%start line
%token show
%token <num> number
%token <id> identifier
%type <num> line exp term
%type <id> assignment

%%
line    : assignment ';'        {;}
        | show exp ';'          {printf("showing : %d\n",$2);}
        | line assignment ';'   {;}
        | line show exp ';' {printf("showing : %d\n",$3);}
        ;
assignment: identifier '=' exp  {updateSymbolVal($1,$3);}
        ;
exp     : term                  {$$ = $1;}
        | exp '+' term          {$$ = $1 + $3;}
        | exp '-' term          {$$ = $1 - $3;}
        | exp '*' term          {$$ = $1 * $3;}
        | exp '/' term          {$$ = $1 / $3;}
        ;

term    : number                {$$ = $1;}
        | identifier            {$$ = symbolVal($1);}
%%
int computerSymbolIndex(char token)
{
    int idx=-1;
    if(islower(token))
    {
        idx=token-'a'+26;
    }
    else if(isupper(token))
    {
        idx = token - 'A';
    }
    return idx;
}
int symbolVal(char symbol)
{
    int bucket = computerSymbolIndex(symbol);
    return symbols[bucket];
}
void updateSymbolVal(char symbol,int val)
{
    int bucket = computerSymbolIndex(symbol);
    symbols[bucket] = val;
}
int main (void) {
    printf("Created By BoxWeb Inc\n");
    int i;
    for(i=0;i<52;i++)
    {
        symbols[i]=0;
    }
    return yyparse();
}

void yyerror (char *s) {printf("-%s !\n", s);}

Commands to test the compiler:

show 5+5;
show 5*2;
show 5+5-2*2/1;

I need to extend it so that it can also print strings:

show "hello" . " " . "mr";//hello mr
show 5+5 . " ?";//10 ?
and more...

In the lexer I use:

<INITIAL>["] {
    printf("(STRING_OPEN) ");
    BEGIN(DOUBLE_QUOTES);
}
<DOUBLE_QUOTES>["] {
    printf("(STRING_CLOSE) ");
    BEGIN(INITIAL);
    printf("(STRING:%S) ",yytext[1]);
}

but I don't know how to use this in the parser.

Please help me to complete this compiler.

Brian Tompsett - 汤莱恩
user5899862
  • I don't understand your question. What do you want to achieve? At the moment it looks like you can only read single characters and an integer, so all that works now should be `$c=32`. Do you want `$variable=123`? `$c="string"`? What are you going to do with the results? You probably won't have a bison parser for just one line; normally you build a tree out of the results, but it seems you just have a list of integers. – Rolf Lussi Mar 18 '16 at 15:17
  • Thanks for the update, just one more question: how do you want to save the results in the end, and what are you going to do with them? You can't handle strings and integers the same way. I'll try to give you an example of what I mean in the answer. – Rolf Lussi Mar 18 '16 at 15:35

1 Answer


Let's simplify it for a moment to just one possible operation.

We have the following grammar:

assignment: '$' identifier '=' exp ';'    {updateSymbolVal($2,$4); }
            ;
exp: number                               {$$ = createExp($1);}
   | string                               {$$ = createExp($1);}
   | exp '+' exp                          {$$ = addExp($1,$3);}
   ;
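
The grammar above assumes the lexer hands the parser a `string` token that carries the text between the quotes. In flex that would typically mean collecting characters while in the `DOUBLE_QUOTES` start condition and returning the token on the closing quote. The following C++ sketch is a hand-written stand-in for that two-state machine, not flex code; `Token`, `TokenKind` and `scan` are hypothetical names used only for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token type: what the scanner would hand to the parser.
enum class TokenKind { Number, String, Symbol };

struct Token {
    TokenKind kind;
    std::string text;  // digits, string contents (without quotes), or one symbol
};

// Minimal two-state scanner mirroring INITIAL / DOUBLE_QUOTES.
// Assumption: an unterminated string is silently dropped at end of input.
std::vector<Token> scan(const std::string& input) {
    enum class State { Initial, DoubleQuotes };
    State state = State::Initial;
    std::vector<Token> tokens;
    std::string buffer;  // collects characters while inside a string

    for (std::size_t i = 0; i < input.size(); ++i) {
        const char c = input[i];
        if (state == State::DoubleQuotes) {
            if (c == '"') {
                // Closing quote: emit the collected text as one token.
                tokens.push_back({TokenKind::String, buffer});
                buffer.clear();
                state = State::Initial;
            } else {
                buffer += c;
            }
        } else if (c == '"') {
            state = State::DoubleQuotes;  // opening quote
        } else if (std::isdigit(static_cast<unsigned char>(c))) {
            std::string digits(1, c);
            while (i + 1 < input.size() &&
                   std::isdigit(static_cast<unsigned char>(input[i + 1]))) {
                digits += input[++i];
            }
            tokens.push_back({TokenKind::Number, digits});
        } else if (!std::isspace(static_cast<unsigned char>(c))) {
            tokens.push_back({TokenKind::Symbol, std::string(1, c)});
        }
    }
    return tokens;
}
```

The point of the sketch is that the string's contents are accumulated in a buffer and delivered as the token's value, which is what the parser action for `string` then receives as `$1`.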

Since an expression can be many different things, we can't just save it in an integer; we need a more complex structure, something like this:

enum expType {NUMBER, STRING};
struct Exp{
    expType type;
    double number;
    std::string str;
};

Then we make the functions to create your expressions:

Exp* createExp(int v){
    Exp *e = new Exp();
    e->type = NUMBER;
    e->number = v;
    return e;
}

Exp* createExp(std::string s){
    Exp *e = new Exp();
    e->type = STRING;
    e->str = s;
    return e;
}

And then to do all your calculations and assignment you will always have to check the type.

Exp* addExp(Exp *a, Exp *b){
    Exp *c = new Exp();   // allocate the result node
    if(a->type == NUMBER && b->type == NUMBER){
         // both operands are numbers: store their sum
         c->type = NUMBER;
         c->number = a->number + b->number;
    }
    else{
         std::cout << "some nice error message\n";
    }
    return c;
}
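
The question also asks for a `.` operator that joins values, as in `show 5+5 . " ?";`. The same type-checking idea could be extended to that case. This is only a sketch under assumptions: `Value`, `makeNumber`, `makeString`, `toText`, `add` and `concat` are hypothetical names, values are held by value rather than through `Exp*`, and a type mismatch throws instead of printing a message.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

enum class ExpType { Number, String };

// Hypothetical by-value counterpart of the Exp struct.
struct Value {
    ExpType type = ExpType::Number;
    double number = 0.0;
    std::string str;
};

Value makeNumber(double n) {
    Value v;
    v.type = ExpType::Number;
    v.number = n;
    return v;
}

Value makeString(const std::string& s) {
    Value v;
    v.type = ExpType::String;
    v.str = s;
    return v;
}

// Text form of a value; numbers use the default stream formatting.
std::string toText(const Value& v) {
    if (v.type == ExpType::String) return v.str;
    std::ostringstream os;
    os << v.number;
    return os.str();
}

// '+' is only defined for two numbers.
Value add(const Value& a, const Value& b) {
    if (a.type != ExpType::Number || b.type != ExpType::Number) {
        throw std::runtime_error("'+' needs two numbers");
    }
    return makeNumber(a.number + b.number);
}

// '.' converts both sides to text and joins them; the result is a string.
Value concat(const Value& a, const Value& b) {
    return makeString(toText(a) + toText(b));
}
```

With these helpers, `concat(add(makeNumber(5), makeNumber(5)), makeString(" ?"))` yields a string value holding `10 ?`, which matches the second example in the question.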

The same goes for the assignment function:

void updateSymbolVal(const std::string &identifier, Exp *e){
    if(e->type == NUMBER){
        myNumbers[identifier] = e->number;
    }
    if(e->type == STRING){
        myStrings[identifier] = e->str;
    }
}

Of course you could also make a map/vector/array of the struct Exp if you need to do some more manipulations with it. Or just hand it over to the next level.
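
A single table of whole expressions, as an alternative to the two separate `myNumbers` / `myStrings` containers, might look like the sketch below. `SymbolTable`, `update` and `lookup` are hypothetical names, and throwing on an undefined variable is an assumption; returning a default value would be an equally valid choice.

```cpp
#include <map>
#include <stdexcept>
#include <string>

enum class ExpType { Number, String };

struct Exp {
    ExpType type = ExpType::Number;
    double number = 0.0;
    std::string str;
};

// Hypothetical symbol table that stores the whole Exp per variable name,
// so a variable can change type when it is reassigned.
class SymbolTable {
public:
    // Insert the variable or overwrite its previous value.
    void update(const std::string& name, const Exp& value) {
        table_[name] = value;
    }

    // Return the stored value; throws if the variable was never assigned.
    const Exp& lookup(const std::string& name) const {
        auto it = table_.find(name);
        if (it == table_.end()) {
            throw std::out_of_range("undefined variable: " + name);
        }
        return it->second;
    }

private:
    std::map<std::string, Exp> table_;
};
```

Because the type tag travels with the value, the code that reads a variable can decide what to do with it without consulting two containers.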

Edit for the question of multi-language support

As written in the comment, I refer to the question "Flex(lexer) support for unicode". Simplified to your needs, you can do it like this:

ASC     [a-zA-Z_0-9]
U       [\x80-\xbf]
U2      [\xc2-\xdf]
U3      [\xe0-\xef]
U4      [\xf0-\xf4]

UANY    {ASC}|{U2}{U}|{U3}{U}{U}|{U4}{U}{U}{U}

UANY+   {yylval.id = yytext[0]; return string;}
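
To see what the byte ranges in these patterns mean, here is a small C++ sketch that classifies a lead byte the same way `U2`, `U3` and `U4` do and checks the continuation bytes against `U`. The function names are hypothetical. Like the patterns, this is a simplified check: it does not reject every invalid sequence (for example overlong encodings), and unlike `ASC` it accepts any byte below 0x80 as a one-byte character.

```cpp
#include <cstddef>
#include <string>

// Sequence length implied by a lead byte, or 0 if it cannot start a character.
std::size_t utf8SequenceLength(unsigned char lead) {
    if (lead < 0x80) return 1;                   // single-byte (ASCII) range
    if (lead >= 0xC2 && lead <= 0xDF) return 2;  // U2
    if (lead >= 0xE0 && lead <= 0xEF) return 3;  // U3
    if (lead >= 0xF0 && lead <= 0xF4) return 4;  // U4
    return 0;                                    // stray continuation or invalid lead
}

// Continuation bytes are the U class: 0x80 to 0xBF.
bool isContinuation(unsigned char b) {
    return b >= 0x80 && b <= 0xBF;
}

// Number of characters in s, or -1 if s does not match the simplified patterns.
long countUtf8Chars(const std::string& s) {
    long count = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        const std::size_t len = utf8SequenceLength(static_cast<unsigned char>(s[i]));
        if (len == 0 || i + len > s.size()) return -1;
        for (std::size_t k = 1; k < len; ++k) {
            if (!isContinuation(static_cast<unsigned char>(s[i + k]))) return -1;
        }
        i += len;
        ++count;
    }
    return count;
}
```

One consequence worth keeping in mind: a character outside ASCII occupies two to four bytes, so it no longer fits in a single `char`, and only the full matched text holds the whole character.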
Rolf Lussi