I'll refer to this question for some of the background:
Regular expression for a string literal in flex/lex
The problem I am having is handling input that contains escaped characters in my lexer. I think it may have something to do with the encoding of the string, but I'm not sure.
Here is how I am handling string literals in my lexer:
\"(\\.|[^\\"])*\"
{
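/* Copy the matched text without the surrounding quotes. */
char* text1 = strndup(yytext + 1, strlen(yytext) - 2);
/* The same string written as a C literal, for comparison. */
const char* text2 = "text\n";
/* Print each string and its address; %p expects a void pointer. */
printf("value = <%s> <%p>\n", text1, (void*)text1);
printf("value = <%s> <%p>\n", text2, (void*)text2);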
}
This outputs the following:
value = <text\n"> <15a1bb0>
value = <text
> <7ac871>
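The lexer appears to keep the escape sequence \n as two separate characters, a backslash followed by an n, instead of turning it into a single newline character.
What's going on here, and how do I process the matched text so that it is identical to what C produces for the same string literal?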
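For context on the difference between the two strings: yytext holds the raw characters exactly as they appear in the input, while the escape in the C literal assigned to text2 is decoded by the C compiler at compile time. Getting the same result from lexer input therefore means decoding the escapes by hand. The following is only a minimal sketch of that idea, not a definitive implementation. The helper name `unescape` is hypothetical (it is not part of flex), and it handles just a handful of escapes; octal, hex, and other sequences are left out.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of flex): copies the first n bytes of src
 * into a newly allocated string, decoding a few common escape sequences.
 * For any other escape, the character after the backslash is kept as-is.
 * Returns NULL if allocation fails. The caller frees the result. */
char *unescape(const char *src, size_t n)
{
    /* The decoded text is never longer than the input. */
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] == '\\' && i + 1 < n) {
            i++; /* skip the backslash and look at the next character */
            switch (src[i]) {
            case 'n':  out[j++] = '\n'; break;
            case 't':  out[j++] = '\t'; break;
            case 'r':  out[j++] = '\r'; break;
            case '\\': out[j++] = '\\'; break;
            case '"':  out[j++] = '"';  break;
            default:   out[j++] = src[i]; break;
            }
        } else {
            out[j++] = src[i];
        }
    }
    out[j] = '\0';
    return out;
}
```

Inside the action, this could replace the strndup call, for example unescape(yytext + 1, yyleng - 2), where yyleng is the length of the match that flex provides alongside yytext.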