I'm writing a parser for .cpp files in C#, and I need to detect for/if/while statements. At first I thought that trimming each line and checking whether it starts with for/if/while would be enough. However, I've been told that there are exceptions for which that approach won't work properly. Which exceptions should I account for? Which characters can occur before a statement? Is there an easier way to do it?

prajmus
Burak Özmen
  • possible duplicate of [How to write simple parser for if and while statements?](http://stackoverflow.com/questions/7294238/how-to-write-simple-parser-for-if-and-while-statements) – Leo Chapiro Aug 07 '14 at 10:20
  • Surely all of the answers to all of these kinds of questions would be in example cpp files? –  Aug 07 '14 at 10:20
  • e.g., ( var == 5) ? "yes" : "no"... and also are you seeking C++ or C# solution – NirMH Aug 07 '14 at 10:21
  • @JoachimPileborg there may be no space before `for`. `if(1==1){}for(...){}` is perfectly valid – prajmus Aug 07 '14 at 10:23
  • @NirMH seeking a C# solution – Burak Özmen Aug 07 '14 at 10:29
  • @JoachimPileborg it can always be a string body `"{if}"` – prajmus Aug 07 '14 at 10:34
  • When you say ".cpp parser", do you actually mean a full blown parser, able to parse all valid C++ code (according to a version of some standard) or are you parsing C-like code which can be more constrained? – Lasse V. Karlsen Aug 07 '14 at 10:47
  • @LasseV.Karlsen A valid C++ code. – Burak Özmen Aug 07 '14 at 11:03
  • Then you need to write a full blown parser. You can't read this line by line, C++ code is much much much more complex than that. The topic of a full blown parser is too broad for a SO question. You will need to get a book on the topic. – Lasse V. Karlsen Aug 07 '14 at 11:12

4 Answers

Trimming the line won't work if it looks like this:

/* hello */ while(true) ;

or this:

/*
   while(true) ;
 */

At the very least, you'll need to pre-process the file (if it's C++) so that comments are stripped before you look for keywords.
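
As a rough illustration of the comment problem, here is a minimal C# sketch that blanks out comments before any keyword check is made. The names `CommentStripper` and `Strip` are invented for this example, and the code assumes the input has no raw string literals, digit separators (`1'000`), or backslash line continuations, so treat it as a starting point rather than a substitute for a real preprocessor.

```csharp
using System;
using System.Text;

static class CommentStripper // hypothetical helper, not part of any library
{
    // Replaces // and /* */ comments with spaces. Newlines inside block
    // comments are kept so that line numbers stay the same.
    // Assumption: string and char literals are copied through unchanged.
    public static string Strip(string source)
    {
        var result = new StringBuilder(source.Length);
        int i = 0;
        while (i < source.Length)
        {
            char c = source[i];
            char next = i + 1 < source.Length ? source[i + 1] : '\0';

            if (c == '/' && next == '/')
            {
                // Line comment: blank out everything up to the newline.
                while (i < source.Length && source[i] != '\n') { result.Append(' '); i++; }
            }
            else if (c == '/' && next == '*')
            {
                // Block comment: blank out everything through the closing */.
                int end = source.IndexOf("*/", i + 2, StringComparison.Ordinal);
                int stop = end < 0 ? source.Length : end + 2;
                for (; i < stop; i++) result.Append(source[i] == '\n' ? '\n' : ' ');
            }
            else if (c == '"' || c == '\'')
            {
                // Literal: copy through the matching quote, honoring backslash escapes.
                char quote = c;
                result.Append(source[i++]);
                while (i < source.Length && source[i] != quote)
                {
                    if (source[i] == '\\' && i + 1 < source.Length) result.Append(source[i++]);
                    result.Append(source[i++]);
                }
                if (i < source.Length) result.Append(source[i++]); // closing quote
            }
            else
            {
                result.Append(c);
                i++;
            }
        }
        return result.ToString();
    }
}
```

With this, `CommentStripper.Strip("/* hello */ while(true) ;").TrimStart()` starts with `while`, and the second example above comes back as whitespace only. Several statements on one line are still not handled, so this fixes only the comment cases.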

Sean

You may have several statements on one line. Code like

f(1); while(x > 0)
{
}

though not elegant, is perfectly valid. In general, C++ is too complicated a language for approaches such as checking whether a line starts with a particular word.

Wojtek Surowka

You may encounter a function like this, or perhaps a lambda:

int max(int a, int b){if (a>b) return a; return b;}

You can use a regular expression for this.
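
To make the regular-expression idea concrete, here is a small C# sketch that finds the keywords anywhere in the text, not just at the start of a line. `KeywordRegex` and `Find` are names invented for this example. Note the limits: a pattern like this still matches inside comments and string literals, and it also picks up `#if (...)` directives and the `while` of a `do ... while` loop, so it is only a rough filter.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class KeywordRegex // hypothetical helper
{
    // \b stops identifiers such as "before" or "iffy" from matching;
    // the keyword must be followed by optional whitespace and "(".
    private static readonly Regex ControlKeyword =
        new Regex(@"\b(if|for|while)\b\s*\(", RegexOptions.Compiled);

    // Returns each keyword found together with its character offset.
    public static List<(string Keyword, int Index)> Find(string source)
    {
        var hits = new List<(string Keyword, int Index)>();
        foreach (Match m in ControlKeyword.Matches(source))
            hits.Add((m.Groups[1].Value, m.Index));
        return hits;
    }
}
```

Calling `KeywordRegex.Find` on the one-line function above reports a single `if`, even though the keyword sits in the middle of the line.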

smali

Compilers work by first running the source code through a lexer, which converts the source text into tokens or a token graph.

You'll need to write one in C#. The best place to start is probably the documentation for the lexer in GCC's C preprocessor: https://gcc.gnu.org/onlinedocs/cppinternals/Lexer.html#Lexer

Or, if you just want to explore, you could get a very rough tokenization by using String.Split and passing in all expression terminators:

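using System;
using System.Linq;

// sourceCode is assumed to hold the text of the .cpp file.
var expressionTerminators = new[] { ';', '{', '}' };

var sourceTokens = sourceCode.Split(expressionTerminators);

// Trim first, since fragments usually begin with whitespace.
// C++ keywords are case-sensitive, so compare ordinally instead of lowercasing.
var forIfWhileStatements = sourceTokens
    .Select(x => x.TrimStart())
    .Where(x => x.StartsWith("if", StringComparison.Ordinal) ||
                x.StartsWith("for", StringComparison.Ordinal) ||
                x.StartsWith("while", StringComparison.Ordinal));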

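But again, this is a non-ideal approach: it knows nothing about comments or string literals, and it also matches identifiers that merely begin with a keyword.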
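
For comparison, a hand-written lexer along the lines described at the start of this answer might look like the sketch below. Everything here (`TokenKind`, `Token`, `MiniLexer`, `FindControlKeywords`) is a made-up name for illustration, and the lexer is deliberately tiny: it assumes no raw string literals, digit separators, or preprocessor handling, and it treats every operator character as its own punctuation token.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

enum TokenKind { Word, Literal, Punctuation }

sealed class Token // hypothetical token type
{
    public Token(TokenKind kind, string text, int line) { Kind = kind; Text = text; Line = line; }
    public TokenKind Kind { get; }
    public string Text { get; }
    public int Line { get; }
}

static class MiniLexer
{
    static readonly HashSet<string> ControlKeywords = new HashSet<string> { "if", "for", "while" };

    public static IEnumerable<Token> Lex(string src)
    {
        int i = 0, line = 1;
        while (i < src.Length)
        {
            char c = src[i];
            if (c == '\n') { line++; i++; }
            else if (char.IsWhiteSpace(c)) { i++; }
            else if (c == '/' && i + 1 < src.Length && src[i + 1] == '/')
            {
                // Line comment: skip to the end of the line.
                while (i < src.Length && src[i] != '\n') i++;
            }
            else if (c == '/' && i + 1 < src.Length && src[i + 1] == '*')
            {
                // Block comment: skip to the closing */, still counting lines.
                i += 2;
                while (i < src.Length && !(src[i] == '*' && i + 1 < src.Length && src[i + 1] == '/'))
                {
                    if (src[i] == '\n') line++;
                    i++;
                }
                i = Math.Min(i + 2, src.Length);
            }
            else if (c == '"' || c == '\'')
            {
                // String or char literal: one token, backslash escapes skipped.
                int start = i, startLine = line;
                i++;
                while (i < src.Length && src[i] != c)
                {
                    if (src[i] == '\\') i++;
                    if (i < src.Length && src[i] == '\n') line++;
                    i++;
                }
                i = Math.Min(i + 1, src.Length);
                yield return new Token(TokenKind.Literal, src.Substring(start, i - start), startLine);
            }
            else if (char.IsLetterOrDigit(c) || c == '_')
            {
                // Identifier, keyword, or number: a run of word characters.
                int start = i;
                while (i < src.Length && (char.IsLetterOrDigit(src[i]) || src[i] == '_')) i++;
                yield return new Token(TokenKind.Word, src.Substring(start, i - start), line);
            }
            else
            {
                yield return new Token(TokenKind.Punctuation, c.ToString(), line);
                i++;
            }
        }
    }

    // Keeps only the word tokens that are exactly if, for, or while.
    public static List<Token> FindControlKeywords(string src) =>
        Lex(src).Where(t => t.Kind == TokenKind.Word && ControlKeywords.Contains(t.Text)).ToList();
}
```

Because the keywords are compared as whole tokens, `if(1==1){}for(...){}` yields both `if` and `for`, while a string such as `"{if}"` yields none. A real C++ front end needs far more than this (preprocessing, templates, and so on), but it shows why token-based detection is more robust than line-based checks.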

Philip Pittle