I need some help here. I'm trying to write a few Regex expressions to catch the word int, any mathematical operators, any digits, and = signs in my code, while ignoring everything else. Tokens that should be ignored are flagged true and the others false, as shown in the code below.
This will be used to tokenize the above-mentioned keywords in order to implement a lexer that can detect integer overflows. I need this done exclusively with Regex.
I've already successfully captured the word int, mathematical operators, and digits, but my Regex can't seem to recognize arbitrary words, such as variable names (number1, number2, etc.) and other elements of the language, such as if statements, parentheses, and curly braces.
lexer.AddDefinition(new TokenDefinition(
"(operator)",
new Regex(@"\*|\/|\+|\-"),
false));
lexer.AddDefinition(new TokenDefinition(
"(literal)",
new Regex(@"\d+"),
false));
lexer.AddDefinition(new TokenDefinition(
"(Random Word)",
new Regex(@"(?=.*[A-Z])(?=.*[a-z])"),
false));
lexer.AddDefinition(new TokenDefinition(
"(integer)",
new Regex(@"\bint\b"),
false));
lexer.AddDefinition(new TokenDefinition(
"(white-space)",
new Regex(@"\s+"),
true));
// This is not working. Random words such as variable names are not being captured by this.
lexer.AddDefinition(new TokenDefinition(
"(random-word)",
new Regex(@"\b(?=.*[A-Z])(?=.*[a-z])\b"),
true));
// What about the brackets? How can I implement a Regex to capture brackets?
This seems like it should be simple, but I can't get it working. Any suggestions are welcome.
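For reference, here is a minimal sketch of the direction I suspect this needs to go. The pattern (?=.*[A-Z])(?=.*[a-z]) consists only of lookaheads, which consume no characters, so any match it produces is empty; it also requires both an upper-case and a lower-case letter to appear somewhere later in the text. A pattern that actually consumes an identifier, plus a character class for brackets, might look like the ones below. The tuple array and the loop are a hypothetical stand-in for my lexer (I am assuming it tries the definitions in order at the current position, which is why every pattern starts with \G and why int is listed before the generic identifier), and the names "(identifier)", "(assignment)" and "(bracket)" are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the lexer definitions: (name, pattern, ignored).
// \G anchors each match at the position passed to Regex.Match.
// Order matters: "int" must be tried before the generic identifier pattern.
var definitions = new (string Name, Regex Pattern, bool Ignored)[]
{
    ("(integer)",     new Regex(@"\Gint\b"),                  false),
    ("(identifier)",  new Regex(@"\G[A-Za-z_][A-Za-z0-9_]*"), true),
    ("(literal)",     new Regex(@"\G\d+"),                    false),
    ("(operator)",    new Regex(@"\G[*/+\-]"),                false),
    ("(assignment)",  new Regex(@"\G="),                      false),
    ("(bracket)",     new Regex(@"\G[()\[\]{}]"),             true),
    ("(white-space)", new Regex(@"\G\s+"),                    true),
};

string source = "int number1 = (20 + 3);";
int pos = 0;
var kept = new List<string>();

while (pos < source.Length)
{
    bool matched = false;
    foreach (var def in definitions)
    {
        Match m = def.Pattern.Match(source, pos);
        if (!m.Success || m.Length == 0) continue;

        // Ignored tokens are consumed but not reported.
        if (!def.Ignored) kept.Add($"{def.Name} {m.Value}");
        pos += m.Length;
        matched = true;
        break;
    }
    // Skip characters that no definition covers (here, the ';').
    if (!matched) pos++;
}

Console.WriteLine(string.Join(", ", kept));
```

The identifier pattern accepts a letter or underscore followed by letters, digits, or underscores, so it covers names like number1 as well as keywords such as if; whether keywords should get their own definitions is a separate choice.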