I am making a simple compiler, and I use flex and a hashtable (unordered_set) to check if an input word is an identifier or a keyword.

%{
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <unordered_set>
using std::unordered_set;
void yyerror(char*);
int yyparse(void);

typedef unordered_set<const char*> cstrset;
const cstrset keywords = {"and", "bool", "class"};
%}
%%
[ \t\n\r\f]             ;
[a-z][a-zA-Z0-9_]*      {   if (keywords.count(yytext) > 0)
                                printf("%s", yytext);
                            else
                                printf("object-identifier"); };

%%

void yyerror(char* str) {printf("ERROR: Could not parse!\n");}
int yywrap() {return 1;}

int main(int argc, char** argv)
{
    if (argc != 2) {printf("no input file\n"); return 1;}
    FILE* file = fopen(argv[1], "r");
    if (file == NULL) {printf("couldn't open file\n"); return 1;}
    yyin = file;
    yylex();
    fclose(file);
    return 0;
}

I tried an input file that contains only the word "class", and the output is object-identifier, not class.

I also tried a simple program that does not use flex, and there the unordered_set works fine.

int main()
{
    cstrset keywords = {"and", "class"};
    const char* str = "class";
    if (keywords.count(str) > 0)
        printf("works");
    return 0;
}

What could be the problem?

splash
devil0150

1 Answer

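Use unordered_set<string> instead of your unordered_set<const char*>. A set of const char* hashes and compares the pointer values, not the characters they point to. yytext points into flex's own buffer, so its address never matches the address of any string literal stored in keywords, and the lookup always fails.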
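
A minimal sketch of that change, assuming the keyword list from the question. The helper name is_keyword is hypothetical and only there for illustration; in the flex action it would be called with yytext.

```cpp
#include <string>
#include <unordered_set>

// Keys are std::string, so hashing and equality use the characters,
// not the address of the buffer that holds them.
static const std::unordered_set<std::string> keywords = {"and", "bool", "class"};

// Hypothetical helper: in the flex rule this would be is_keyword(yytext).
bool is_keyword(const char* text) {
    // text is converted to a temporary std::string for the lookup.
    return keywords.count(text) > 0;
}
```

The rest of the scanner can keep passing const char* around; only the set's key type changes.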

W.F.
  • Yes, that would probably work, but using const char* will help me later on, so I will switch to string only if this is impossible. And I don't think it's the pointer problem. I tried in a separate test program without using flex, and it worked. I'll edit the question to show this. – devil0150 Feb 27 '16 at 18:10
  • Why would you prefer using const char *? You can always use .c_str() on your string objects to extract this type of value... – W.F. Feb 27 '16 at 18:12
  • Your example program probably works because the compiler merges identical string literals: it sees two const char arrays with the same contents and uses the same pointer for both... – W.F. Feb 27 '16 at 18:18
  • What I actually meant is that I might have to change a lot of code if I start using string now. – devil0150 Feb 27 '16 at 18:19
  • OK I see... You could try to create your own hash function for const char * to make your unordered_set work as you expect. See: http://stackoverflow.com/questions/20649864/c-unordered-map-with-char-as-key – W.F. Feb 27 '16 at 18:20
  • You were right about the pointers. Just tried printing out the address. Thank you for the link – devil0150 Feb 27 '16 at 18:26
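
The custom hash function suggested in the comments could look roughly like the sketch below, which keeps const char* as the key type. The names CStrHash, CStrEqual and is_keyword are hypothetical, and the FNV-1a style hash is just one simple choice, not the approach from the linked question. Note that an equality functor is needed as well as a hash; otherwise the set still compares pointers.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_set>

// Hash the characters of a C string (FNV-1a style), not the pointer value.
struct CStrHash {
    std::size_t operator()(const char* s) const {
        std::uint64_t h = 0xcbf29ce484222325ULL;   // FNV offset basis
        for (; *s != '\0'; ++s) {
            h ^= static_cast<unsigned char>(*s);
            h *= 0x100000001b3ULL;                 // FNV prime
        }
        return static_cast<std::size_t>(h);
    }
};

// Two keys are equal when their contents are equal.
struct CStrEqual {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) == 0;
    }
};

typedef std::unordered_set<const char*, CStrHash, CStrEqual> cstrset;
static const cstrset keywords = {"and", "bool", "class"};

// Hypothetical helper: in the flex rule this would be is_keyword(yytext).
bool is_keyword(const char* text) {
    return keywords.count(text) > 0;
}
```

With this typedef the flex action from the question, keywords.count(yytext) > 0, can stay as written. The set stores only the pointers, so the strings it refers to must outlive it; string literals do.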