I'm writing a parser for C++ (.cpp) files in C#, and I need to detect for/if/while statements. At first I thought that trimming each line and checking whether it starts with for/if/while would be enough. However, I've been told that there are exceptions and that this solution wouldn't work properly. Which exceptions should I think of? Which characters can occur before a statement? Is there an easier way to do it?

- Possible duplicate of [How to write simple parser for if and while statements?](http://stackoverflow.com/questions/7294238/how-to-write-simple-parser-for-if-and-while-statements) – Leo Chapiro Aug 07 '14 at 10:20
- Surely all of the answers to all of these kinds of questions would be in example cpp files? – Aug 07 '14 at 10:20
- E.g., `(var == 5) ? "yes" : "no"`... Also, are you seeking a C++ or a C# solution? – NirMH Aug 07 '14 at 10:21
- @JoachimPileborg there may be no space before `for`. `if(1==1){}for(...){}` is perfectly valid – prajmus Aug 07 '14 at 10:23
- @NirMH seeking a C# solution – Burak Özmen Aug 07 '14 at 10:29
- @JoachimPileborg it can always be inside a string literal, e.g. `"{if}"` – prajmus Aug 07 '14 at 10:34
- When you say ".cpp parser", do you actually mean a full-blown parser, able to parse all valid C++ code (according to a version of some standard), or are you parsing C-like code which can be more constrained? – Lasse V. Karlsen Aug 07 '14 at 10:47
- @LasseV.Karlsen Valid C++ code. – Burak Özmen Aug 07 '14 at 11:03
- Then you need to write a full-blown parser. You can't read this line by line; C++ code is much, much, much more complex than that. The topic of a full-blown parser is too broad for an SO question. You will need to get a book on the topic. – Lasse V. Karlsen Aug 07 '14 at 11:12
4 Answers
Trimming the line won't work if it looks like this:

    /* hello */ while(true) ;

or this:

    /*
    while(true) ;
    */

At the very least, you'll need to preprocess the file first (if it's C++), for example to strip comments before looking for keywords.

You may have several statements on one line. Code like

    f(1); while(x > 0)
    {
    }

though not elegant, is perfectly valid. Generally, C++ is too complicated a language for solutions such as checking whether a line starts with something.

You may encounter a function like this, or maybe a lambda function, where the statement sits in the middle of the line:

    int max(int a, int b){if (a>b) return a; return b;}

You can use a regular expression for this.
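A minimal sketch of the regular-expression idea, assuming comments and string literals have already been removed from the input (the regex itself cannot tell code from a comment). The class name `KeywordRegexDemo` and the sample line are made up for illustration. The `\b` word boundaries let the pattern match a keyword directly after `{` or `}` while ignoring identifiers that merely contain one, such as `format` or `diff`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeywordRegexDemo
{
    // \b marks a word boundary, so "if" is not matched inside "diff"
    // and "for" is not matched inside "format".
    private static readonly Regex Keyword = new Regex(@"\b(if|for|while)\b");

    public static void Main()
    {
        // Hypothetical input: a function with the statement mid-line.
        string line = "int max(int a, int b){if (a>b) return a; return b;}";

        foreach (Match m in Keyword.Matches(line))
        {
            // m.Index is the offset of the keyword within the line.
            Console.WriteLine($"{m.Value} at index {m.Index}");
        }
    }
}
```

On its own this still reports keywords that appear inside comments, string literals, and preprocessor directives such as `#if`, so it is only a starting point.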

The way compilers work is that they run the source code through a lexer, which converts the source into tokens or a token graph.
You'll need to write one in C#. The best place to start is probably by looking at the GCC
compiler's lexer: https://gcc.gnu.org/onlinedocs/cppinternals/Lexer.html#Lexer
Or, if you just want to explore, you could get a very rough tokenization by using String.Split and passing in all expression terminators:

    using System;
    using System.Linq;

    // Very rough tokenization: split on statement terminators and braces.
    var expressionTerminators = new[] { ';', '{', '}' };
    var sourceTokens = sourceCode.Split(expressionTerminators);
    // Trim each chunk; compare case-sensitively, since C++ keywords are lowercase only.
    var forIfWhileStatements = sourceTokens
        .Select(x => x.TrimStart())
        .Where(x => x.StartsWith("if", StringComparison.Ordinal) ||
                    x.StartsWith("for", StringComparison.Ordinal) ||
                    x.StartsWith("while", StringComparison.Ordinal));

But again, this is a non-ideal approach: it also matches identifiers such as `format`, and it still looks inside comments and string literals.
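For comparison, here is a rough sketch of what a hand-written scanner could look like. It is not a real C++ lexer: it only skips `//` and `/* */` comments and ordinary string and character literals, then reports `if`/`for`/`while` when they appear as whole identifiers. The names `KeywordScanner` and `Scan` are hypothetical, the code assumes C# 7 or later (for tuples), and it deliberately ignores raw string literals, digit separators such as `1'000`, preprocessor directives such as `#if`, macros, and line continuations.

```csharp
using System;
using System.Collections.Generic;

public static class KeywordScanner
{
    private static readonly HashSet<string> Keywords =
        new HashSet<string> { "if", "for", "while" };

    // Returns (keyword, offset) pairs found outside comments and literals.
    public static List<(string Keyword, int Offset)> Scan(string src)
    {
        var hits = new List<(string Keyword, int Offset)>();
        int i = 0;
        while (i < src.Length)
        {
            char c = src[i];
            if (c == '/' && i + 1 < src.Length && src[i + 1] == '/')
            {
                // Line comment: skip to the end of the line.
                while (i < src.Length && src[i] != '\n') i++;
            }
            else if (c == '/' && i + 1 < src.Length && src[i + 1] == '*')
            {
                // Block comment: skip to the closing */ (or to end of input).
                int end = src.IndexOf("*/", i + 2, StringComparison.Ordinal);
                i = end < 0 ? src.Length : end + 2;
            }
            else if (c == '"' || c == '\'')
            {
                // String or character literal: skip to the matching quote,
                // stepping over backslash escapes such as \" or \'.
                i++;
                while (i < src.Length && src[i] != c)
                {
                    if (src[i] == '\\') i++;
                    i++;
                }
                i++; // step past the closing quote
            }
            else if (char.IsLetter(c) || c == '_')
            {
                // Identifier or keyword: read the whole word, then classify it.
                int start = i;
                while (i < src.Length &&
                       (char.IsLetterOrDigit(src[i]) || src[i] == '_')) i++;
                string word = src.Substring(start, i - start);
                if (Keywords.Contains(word)) hits.Add((word, start));
            }
            else
            {
                i++; // whitespace, punctuation, digits, operators
            }
        }
        return hits;
    }

    public static void Main()
    {
        // Hypothetical input combining the cases raised in this thread.
        string src = "/* while */ f(1); while(x > 0) { s = \"{if}\"; }for(;;){}";
        foreach (var hit in Scan(src))
            Console.WriteLine($"{hit.Keyword} at offset {hit.Offset}");
    }
}
```

With the sample input in `Main`, only the real `while` and `for` are reported; the `while` inside the comment and the `if` inside the string literal are skipped. Working on the whole file as one character stream, rather than line by line, is what makes multi-line comments and several statements per line unproblematic.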
