
OK, I know this looks like a duplicate of "Why do functions need to be declared before they are used?", but the existing answers don't seem to fully address all the details.

I know that C++ was originally designed in the 1980s so that it could be translated in a single pass, because computers were slow. OK. But the most recent standard was published in 2011, so I don't see why C++ compilers can't now do things that require multiple passes. It would still hurt performance, yes, but only when a second pass actually became necessary. So the following would still require only a single pass:

void foo();
int main() { foo(); }
void foo() {}

whereas for the following, the compiler could make two passes (and be slower), because it doesn't know whether foo is a function or a type until it sees the declaration below:

int main() { foo(); }
void foo() {}

and if you tried to use a function without declaring it first, and the declaration is not in the current translation unit at all, then it would be an error. But if it's in the same translation unit then the compiler could just make additional passes.

My colleague argues that such a feature would save a lot of developer time, and would avoid problems with a declaration and its definition not matching. And I'm sure this has been proposed many times over and rejected every time. What is the actual reasoning behind rejecting it, i.e., the committee's rationale?

Brian Bi
  • 3
    I doubt if it was ever considered by the committee because C++ has enforced type-safe linkage since very early on (long before the standardization process started), and the process of matching arguments and conversions of arguments to available function prototypes is hard enough already without having to wait until the end of the TU to resolve the first function call in the code (because it would _force_ two-pass processing because the compiler would have to gather all the function type information of every function before it could generate any code for any function call). – Jonathan Leffler Jul 24 '14 at 05:37
  • @JonathanLeffler Are you saying it would force the compiler to *always* make two passes, even if you did declare functions before using them? I don't quite get it, so is it possible you could show an example to demonstrate why? – Brian Bi Jul 24 '14 at 05:39
  • 19
    Adding it now could be a change that breaks valid code. An overload that is a better match could be defined later, changing program behavior when this feature gets introduced. – Praetorian Jul 24 '14 at 05:40
  • 1
    @Brian: `int main() { foo(1); foo(1.0); } void foo(int i) { … } void foo(double d) { … }`. Or omit one of the variants of `foo()`; then the code calls the same function twice. Etc. The compiler can't generate any of the code in `main()` until it has read the rest of the TU. – Jonathan Leffler Jul 24 '14 at 05:41
  • 4
    @JonathanLeffler `int main() { int x = foo(1); } typedef int foo;` – M.M Jul 24 '14 at 05:42
  • @Praetorian You could just make it so that overload resolution is based only on declarations seen so far, unless there's no declaration at all, in which case the compiler has to come back for a second pass, no? – Brian Bi Jul 24 '14 at 05:42
  • @JonathanLeffler Yes, but what I'm saying is that adding this feature now wouldn't force the compiler to make two passes on *existing valid code*. For something like your example, it would have to make two passes. What's wrong with that? – Brian Bi Jul 24 '14 at 05:44
  • It's also interesting to note that class functions can call other class functions declared below them. – Mark Garcia Jul 24 '14 at 05:44
  • 1
    So we'd add more special cases to an already complicated language? :) What you suggest might resolve that issue, but I doubt it'd be deemed worth the trouble. Also, there might be other, more subtle, problems that we're not thinking of with introducing such a feature. – Praetorian Jul 24 '14 at 05:45
  • 9
    Ultimately, there's no point in disturbing 25 years of status quo — and it is probably longer than that by now. Frankly, it is a question that really doesn't need to be asked. What benefit would there be from it? And are you sure it would have no unintended consequences? I'd virtually guarantee that it has unintended consequences, and undesirable unintended consequences. – Jonathan Leffler Jul 24 '14 at 05:46
  • I am sure there would be unintended consequences. But what I'm not sure of is: (a) what precisely they might be, (b) how badly they would break existing code. Presumably badly enough to far outweigh any benefit. But imagine you are trying to convince someone who hasn't used C++ very much. – Brian Bi Jul 24 '14 at 05:48
  • There *can't* be unintended side effects of a "perfect" implementation as the proposal is to give currently erroneous code a new behaviour, but that's not to say it's easy for compiler writers to provide such an implementation - there'd need to be all sorts of rules about whether the first matching declaration/definition was used, or the best matching, and delays in knowing the match mean that e.g. `auto` and `decltype()` statements dependent on the matched function also need to be deferred, it'll have a spaghetti tree-like effect as all that propagates, not a clean extra pass. – Tony Delroy Jul 24 '14 at 06:14
  • 1
    Computer science spent a decade or more in the 1960s working out how to write compilers in one pass. You won't find any support whatsoever to reversing that. Legend has it that IBM had 36 people on a compiler project that ended up with - guess what - 36 passes? We really don't want to go back there. – user207421 Jul 24 '14 at 06:17
  • 1
    @EJP, about: »We really don't want to go back there.« Aren't we back there already with Java and C♯? – MFH Jul 24 '14 at 07:57
  • 4
    @EJP: wat. Most modern languages *rely* on the compiler doing multiple passes. Do you know the *reason* Computer Science tried so hard to avoid it back in the 60's? It wasn't due to purity or elegance or nice theoretical advantages. It was because they didn't have enough memory to load all the source code at once. So they had to load it progressively, and unload old chunks once they were done with. In such a scenario, doing everything in a single pass makes a lot of sense. They simply couldn't *afford* to look back at code that had already been compiled. In 2014? Not so much. – jalf Jul 24 '14 at 09:52
  • @JonathanLeffler Is it actually legal to declare an overload of a function after a use of the overload set? That's exactly the sort of thing I'd expect C++ to explicitly disallow. – Stuart Olsen Jul 24 '14 at 14:13
  • [Every feature starts with a score of -100.](http://blogs.msdn.com/b/ericgu/archive/2004/01/12/57985.aspx) This might add *small* value to future code, but certainly nothing that would make the implementation a higher priority than other work. – abelenky Jul 24 '14 at 15:02

2 Answers


Template parsing

Consider the following line of code:

a < b , c > d;

How would you parse this? There are actually two ways, depending on what a, b, c and d are. First, it could be a variable declaration

a<b,c>   d;
^^^^^^   ^
 Type   Var

in the case that a is a known class template and b and c are known types. Second, it could be

   a<b   ,   c>d ;
   ^^^       ^^^ 
boolean expressions

in the case that a, b, c and d are all variables of some sort.
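To make the two readings concrete, here is a minimal sketch in which the same token sequence compiles both ways. The namespaces, the `tag` member, the `run` functions, and the chosen values are made up for illustration; only the line `a < b , c > d` comes from the answer.

```cpp
// Reading 1: `a` names a class template and `b`, `c` name types,
// so the statement declares a variable `d`.
namespace as_declaration {
    template <typename T, typename U>
    struct a { int tag = 1; };   // hypothetical template, just for the demo
    using b = int;
    using c = double;

    int run() {
        a < b , c > d;           // declares d of type a<int, double>
        return d.tag;
    }
}

// Reading 2: all four names are variables, so the same tokens are
// two comparisons joined by the comma operator.
namespace as_expression {
    bool run() {
        int a = 1, b = 2, c = 4, d = 3;
        // Parenthesised only so the result can be captured; the comma
        // operator discards a < b and yields c > d.
        bool result = (a < b , c > d);
        return result;
    }
}
```

The parser has to decide between the two before it reaches the semicolon, and it can only do so because the declarations of a, b and c have already been seen.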

The vexing parse

Or here is another one:

a b(c); // is 'b' a function or a variable?

This could be a function declaration (a function with return type a and argument type c) or a variable definition (whose type is a and whose constructor argument is c).
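As a rough illustration, the sketch below makes `a b(c);` compile under each reading. The struct definitions, the `value` member, and the `run` helpers are assumptions added for the demo, not part of the original line.

```cpp
// Reading 1: `c` names a type, so `a b(c);` declares a function `b`
// that takes a `c` and returns an `a`.
namespace as_function {
    struct a { int value; };
    struct c {};

    a b(c);                          // function declaration
    a b(c) { return a{7}; }          // matching definition

    int run() { return b(c{}).value; }
}

// Reading 2: `c` names an object, so `a b(c);` defines a variable `b`
// of type `a`, constructed from `c`.
namespace as_variable {
    struct a {
        int value;
        explicit a(int v) : value(v) {}
    };

    int run() {
        int c = 5;
        a b(c);                      // variable definition
        return b.value;
    }
}
```

Which reading applies depends entirely on what c was declared as earlier in the file.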

Conclusion

There are, unfortunately, many cases like these. I'm not sure whether it would be impossible to write a compiler that copes with them without prior declarations, but it would at least be very hard. Compilation times are already a serious issue in C++, and this would only make them worse. Also, it is good practice to use only what you have already declared or defined, even in other languages.

Constraints of the committee

Even if it were reasonably feasible to implement this feature, it would break backward compatibility. Overload resolution takes only prior declarations into account, so the interpretation of a function call can change depending on where in the file the call is written. That's the way C++ is put together. The C++ standards committee is big on backward compatibility: in particular, they do not want to break any existing code (and rightly so). Your proposal would surely break existing code, which is a no-go for the language designers.
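A small sketch of the position dependence described above. The function names `pick`, `early`, and `late` are hypothetical; the behaviour shown is that of current C++, where only declarations that precede the call take part in overload resolution.

```cpp
int pick(long) { return 1; }       // the only overload visible to early()

int early() { return pick(0); }    // int argument converts to long: pick(long)

int pick(int) { return 2; }        // exact match for int, declared later

int late() { return pick(0); }     // both overloads visible: pick(int) wins
```

Today `early()` returns 1 and `late()` returns 2. If overload resolution were changed to consider every declaration in the translation unit, `early()` would presumably start returning 2, which is the kind of silent behaviour change the compatibility argument is about.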

Ralph Tandetzky

The short answer is that it would be unparseable.

Consider two-phase name lookup for templates, and in particular the need for typename. In templates, type-dependent names may not have been declared yet. To be able to parse them, we absolutely need typename. Without that, parsing would grind to a halt and we can't reliably proceed, so we couldn't even provide the type needed to fix the parsing problem. It's a chicken and egg problem: If we need to have parsed line 10 to parse line 5, line 10 will never be parsed because we break at line 5. typename helps us get past line 5 so we can learn the actual type on line 10.
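For readers who have not run into this, here is a minimal sketch of where `typename` is needed. The function `first_or_default` is a made-up helper; `value_type` and `const_iterator` are the usual standard container member types.

```cpp
#include <vector>

// Container is unknown while the template is parsed, so the compiler cannot
// look inside it. `typename` states that the dependent names below are types.
template <typename Container>
typename Container::value_type first_or_default(const Container& c) {
    typename Container::const_iterator it = c.begin();
    return it == c.end() ? typename Container::value_type() : *it;
}
```

Without the keyword, `Container::const_iterator it` would be treated as a non-type name followed by an identifier, and parsing of the template body would fail before any instantiation could supply the real container type.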

Here, we'd have a similar problem. Consider this code under your assumptions:

struct Foo { };
int bar () { return Foo(); }
int Foo () { return 42; }

To parse this code, we need to know whether Foo denotes a type or a function: inside bar, Foo() is either a temporary object of type Foo or a call to the function Foo.
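For comparison, here is roughly how current C++ treats that snippet. This is a sketch with two additions that are not in the original: a conversion operator, so that a `Foo` temporary can be returned as `int`, and a second function `baz` placed after the function declaration.

```cpp
struct Foo {
    operator int() const { return 1; }   // assumed conversion, added for the demo
};

int bar() { return Foo(); }   // only the struct is visible: temporary Foo, converted to int

int Foo() { return 42; }      // a function may share the class's name; it hides the class name

int baz() { return Foo(); }   // from here on, Foo() is a call to the function
```

Under today's rules `bar()` returns 1 and `baz()` returns 42, because each use is resolved against the declarations that precede it. A rule that looked ahead would have to pick one meaning for the use in `bar`.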

MSalters
  • I can see how a chicken-and-egg problem could be, well, problematic, but I can't see how your example illustrates this for my question. First, there could be a rule that a name is interpreted according to preceding declarations, so in this case, `Foo` in `bar` would unambiguously be a type. Second, even if the compiler has no idea whether `Foo` is a function or a type, it can still find the terminating semicolon, and manage to parse the declaration of `Foo` following that of `bar`. Then it could come back and make a second pass. – Brian Bi Jul 24 '14 at 07:42
  • 3
    @Brian: Note that you just described an n-pass compiler. You may have arbitrary amounts of backtrack when Foo depends on Bar depends on Baz depends on .... – MSalters Jul 24 '14 at 07:48
  • 1
    If that's the case then why is it allowed to call a function before its declaration inside a class (where the function called is another member of the class)? – Brian Bi Jul 24 '14 at 17:53
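On the last question in the comments: inside a class, member function bodies are treated as if they were compiled after the closing brace of the class, so they can use members declared further down. A minimal sketch, with a made-up `Counter` class:

```cpp
struct Counter {
    int twice() const { return 2 * once(); }  // once() is declared on the next line
    int once() const { return value; }        // value is declared below as well
    int value = 21;
};
```

The deferral is bounded by the class definition: the compiler collects the member declarations first and only then processes the function bodies. That seems to be a much narrower case than deferring lookup across a whole translation unit.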