C is parsed strictly in order, i.e. everything has to be declared before it is used; in particular, types must be declared before variables of those types. This makes sense because the grammar would be ambiguous if you didn't know what was the name of a type and what wasn't, e.g. a * b depends on whether a names a type.
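To make the ambiguity concrete, here is a minimal sketch in plain C. The function names are made up for illustration; the point is only that the same three tokens, a * b, are a declaration in one scope and a multiplication in the other, depending on what a names at that point.

```c
typedef int a;              /* at file scope, a names a type */

/* Here "a * b" declares b as a pointer to a (i.e. pointer to int). */
int as_declaration(void)
{
    int x = 5;
    a * b = &x;
    return *b;
}

/* Here a is redeclared as a local variable, which hides the typedef,
   so the very same tokens "a * b" form a multiplication. */
int as_expression(void)
{
    int a = 6, b = 7;
    return a * b;
}
```

A parser cannot pick between the two readings without already knowing, at that point in the input, whether a is a type name.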
On the other hand, some C family languages have the desirable property of relaxing this restriction (thus eliminating manual juggling of header files). I'm writing a parser for a C-superset language which is intended to likewise have that restriction relaxed, so now I need to figure out how to do it.
One method that occurs to me would be to do two passes. The first pass goes through everything, taking advantage of the fact that everything at the top level must be a declaration, not a statement, and picks up all the types. At this stage function bodies are left unexamined, just picked up as token streams delimited by matching braces. The second pass parses function bodies. Local declarations within a function would have to be in order, but that's not really a problem.
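The first pass described above might be sketched roughly as follows. This is only an illustration under simplifying assumptions, not a real front end: it works on raw text rather than tokens, it recognises only the spelling struct Name as a type declaration, and it does not handle braces inside comments, string literals or character constants. The names type_table, collect_types and is_type_name are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TYPES 32
#define MAX_NAME  64

/* Hypothetical symbol table filled in by the first pass. */
struct type_table {
    char names[MAX_TYPES][MAX_NAME];
    size_t count;
};

static int is_ident_char(int c) { return isalnum(c) || c == '_'; }

/* Lookup the second pass would use to decide whether a name is a type. */
int is_type_name(const struct type_table *t, const char *name)
{
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return 1;
    return 0;
}

/* Pass one: record each "struct Name" seen at brace depth 0.
   Anything inside braces is skipped by brace matching, unparsed. */
void collect_types(const char *src, struct type_table *t)
{
    int depth = 0;
    const char *p = src;

    t->count = 0;
    while (*p) {
        if (*p == '{') {
            depth++;
            p++;
        } else if (*p == '}') {
            if (depth > 0)
                depth--;
            p++;
        } else if (is_ident_char((unsigned char)*p)) {
            const char *start = p;
            while (is_ident_char((unsigned char)*p))
                p++;
            if (depth == 0 && p - start == 6 &&
                strncmp(start, "struct", 6) == 0) {
                char name[MAX_NAME];
                size_t n = 0;
                while (isspace((unsigned char)*p))
                    p++;
                while (is_ident_char((unsigned char)*p) && n < MAX_NAME - 1)
                    name[n++] = *p++;
                name[n] = '\0';
                /* Store each name once, while there is room. */
                if (n > 0 && t->count < MAX_TYPES && !is_type_name(t, name))
                    strcpy(t->names[t->count++], name);
            }
        } else {
            p++;
        }
    }
}
```

With this in place, a function body can mention a type that is only declared further down the file: by the time the second pass parses the body, is_type_name already knows about it. Types declared inside a function body are deliberately not collected, which matches the idea that local declarations still have to appear in order.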
Are there any stumbling blocks in that method I haven't thought of?
How do compilers for C++, Java, C# etc. typically handle it for those parts of those languages that don't require declarations in order?