43

On Wikipedia I found this:

A a( A() );

[This] could be disambiguated either as

  1. a variable definition of class [A], taking an anonymous instance of class [A] or
  2. a function declaration for a function which returns an object of type [A] and takes a single (unnamed) argument which is a function returning type [A] (and taking no input).

Most programmers expect the first, but the C++ standard requires it to be interpreted as the second.
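The two readings can be checked at compile time. The following is a minimal sketch, not part of the Wikipedia text: the helper name `vexing_parse_demo` and the user-provided constructor on `A` are assumptions made for illustration, and the brace form requires C++11.

```cpp
#include <type_traits>

struct A {
    A() {}
};

// Returns true; the static_asserts do the real checking at compile time.
inline bool vexing_parse_demo() {
    A a(A());     // function declaration, equivalent to: A a(A (*)());
    A b((A()));   // extra parentheses make the argument an expression: b is an object
    A c{A{}};     // braces cannot form a parameter list: c is an object

    static_assert(std::is_function<decltype(a)>::value, "a names a function");
    static_assert(std::is_same<decltype(a), A(A (*)())>::value,
                  "the parameter is adjusted to a pointer to function");
    static_assert(std::is_same<decltype(b), A>::value, "b is an A");
    static_assert(std::is_same<decltype(c), A>::value, "c is an A");
    (void)b;
    (void)c;
    return true;
}
```

Compilers may warn about the first declaration (for example with a "vexing parse" diagnostic), which is a quick way to spot the problem in real code.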

But why? If the majority of the C++ community expects the former behavior, why not make it the standard? Besides, the above syntax is consistent if you don't take into account the parsing ambiguity.

Can someone please enlighten me? Why does the standard make this a requirement?

template boy

6 Answers

26

Let's say MVP didn't exist.

How would you declare a function?

A foo();

would be a variable definition, not a function declaration. Would you introduce a new keyword? Would you have a more awkward syntax for a function declaration? Or would you rather have

A foo;

define a variable and

A foo();

declare a function?

Your slightly more complicated example is just for consistency with this basic one. It's easier to say "everything that can be interpreted as a declaration, will be interpreted as a declaration" rather than "everything that can be interpreted as a declaration, will be interpreted as a declaration, unless it's a single variable definition, in which case it's a variable definition".
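A minimal sketch of the basic case, assuming a trivial `A` with one member and a hypothetical helper name `simple_case_demo`; the brace form is C++11 and is shown only as an unambiguous alternative:

```cpp
#include <type_traits>

struct A {
    int m = 0;
};

inline int simple_case_demo() {
    A foo;      // variable definition
    A bar();    // function declaration: bar takes no arguments and returns A
    A baz{};    // C++11 braces: always a variable definition

    static_assert(std::is_same<decltype(foo), A>::value, "foo is an object");
    static_assert(std::is_function<decltype(bar)>::value, "bar is a function");
    static_assert(std::is_same<decltype(baz), A>::value, "baz is an object");
    return foo.m + baz.m;
}
```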

This probably isn't the original motivation behind the rule, but it is a reason the rule is a good thing.

Luchian Grigore
17

For C++, it's pretty simple: because the rule was made that way in C.

In C, the ambiguity only arises with a typedef and some fairly obscure code. Almost nobody ever triggers it by accident -- in fact, it probably qualifies as rare except in code designed specifically to demonstrate the possibility. For better or worse, however, the mere possibility of the ambiguity meant somebody had to resolve it -- and if memory serves, it was resolved by none other than Dennis Ritchie, who decreed that anything that could be interpreted as a declaration would be a declaration, even if there was also an ambiguous interpretation as a definition.

C++ added the ability to use parentheses for initialization as well as for function calls and grouping, and this moved the ambiguity from obscure to common. Changing it, however, would have required breaking the rule as it came from C. Resolving this particular ambiguity as most would expect, without creating half a dozen more that were even more surprising, would probably have been fairly non-trivial as well, unless you were willing to throw away compatibility with C entirely.

463035818_is_not_an_ai
Jerry Coffin
  • I was just doing office hours for an undergrad C++ class, and one of the students had this in his assignment :) – Aleksei Petrenko Feb 02 '21 at 06:03
  • How do you get the ambiguity in C? Even with typedefs, the usual C++ examples don't have C analogues because C doesn't have function-style cast notation. – Brian Bi Nov 16 '21 at 00:26
  • @BrianBi: https://pdos.csail.mit.edu/archive/l/c/roskind.html has a fairly good exposition of the situation. – Jerry Coffin Nov 16 '21 at 18:19
  • Those examples are not actually ambiguous once the scopes of prior declarations are taken into account; with `T(*b)[4];` for example, you just need to know whether the declaration of `T` that is visible at that point is of a *typedef-name* or not. But the C++ vexing parses are still ambiguous even when you know whether something is a type. (1/2) – Brian Bi Nov 16 '21 at 19:12
  • The C++ standard decrees by fiat that in those situations, an interpretation as a declaration is favoured over an interpretation as an expression (this also applies to the declaration/definition case, because it resolves the parenthesized comma-separated components as parameter declarations). But Stroustrup could easily have decided on the opposite rule; it would have involved changing a few words in the standard, and it would have left the C cases undisturbed. So, compatibility with C cannot be the reason why Stroustrup decided to favour declarations over expressions. (2/2) – Brian Bi Nov 16 '21 at 19:21
  • @BrianBi: Sorry, I should have read that more carefully--it doesn't actually show what I was after. I'll do some more looking to see if I can find a demonstration of the real problem. Unfortunately, when I first learned about it, it was from reading a magazine (probably Dr. Dobbs) so finding it may not be quick or easy. – Jerry Coffin Nov 16 '21 at 19:50
  • Mulling it over in my head, I think I remember the scenario. Consider code like: `int t; typedef long T; { (T)t; }` Is the `(T) t;` casting `t` to long, then discarding the result, or is it defining a local variable named `t` of type `long`, with redundant parens around the type name. And Dennis Ritchie declared by fiat that this would always be interpreted as a declaration, not an expression. – Jerry Coffin Nov 16 '21 at 19:59
  • I don't think that scenario is ambiguous either. The grammar only allows redundant parentheses in declarators, not around type specifiers. `(T)t` can only be a cast. – Brian Bi Nov 16 '21 at 20:13
  • I've been thinking about this a bit more, and I've concluded that since there's no disambiguation rule in C89, the disambiguation rule you're referring to must have something to do with pre-standard C, and the ambiguity is most likely to have been caused by syntax that uses implicit `int` (banned in C89). Does this ring a bell? – Brian Bi Nov 22 '21 at 21:52
  • Actually, I forgot that C89 still has implicit `int`. But I believe C89 banned it in some contexts (maybe parsing difficulties had something to do with this) – Brian Bi Nov 22 '21 at 22:13
  • @BrianBi: With some more thought, (and looking) I've finally found the actual rule (§6.7.5.3/11): "In a parameter declaration, a single typedef name in parentheses is taken to be an abstract declarator that specifies a function with a single parameter, not as redundant parentheses around the identifier for a declarator." I didn't remember the situation where it's allowed, but it is about a typedef name surrounded by redundant parens. Oh, and this is still present at least up through C99. I'm not sure that's the only instance, but it is one, anyway. – Jerry Coffin Nov 23 '21 at 04:54
  • Right, this covers the situation `typedef int T; void foo(int(T));` (equivalent to `void foo(int(int))`, not `void foo(int T)`). But this doesn't seem to have relevance to the other ambiguous cases that exist in C++. – Brian Bi Nov 23 '21 at 13:10
6

This is just a guess, but it may be due to the fact that with the given approach you can get both behaviors:

A a( A() ); // this is a function declaration
A a( (A()) ); // this is a variable definition

If you were to change its behavior to be a variable definition, then function declarations would be considerably more complex.

typedef A subfunction_type();

A a( A() ); // this would be a variable declaration
A a( subfunction_type ); // this would be a function declaration??
K-ballo
  • Not quite correct, `A a( A() );` is a function declaration. It may be reworded as `A a (A (*)())`, that is, a function named `a` that returns an object of type `A` and takes a parameter that is a function: `void -> A`. Compiling it is the simplest check. – gluk47 Apr 03 '16 at 10:09
6

It's a side-effect of the grammar being defined recursively.

It was not designed intentionally like that. It was discovered and documented as the most vexing parse.
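One possible reading of "defined recursively" (an assumption about what the answer has in mind, not something it states) is that a declarator may itself be wrapped in parentheses, to any depth. That is what lets a parenthesized name be read as part of a declaration. A small sketch, with the hypothetical helper name `declarator_parens_demo`:

```cpp
inline int declarator_parens_demo() {
    int (x) = 1;         // same as: int x = 1;
    int ((y)) = 2;       // parentheses around a declarator nest to any depth
    int (*p) = &x;       // same as: int *p = &x;
    int (f)(int (n));    // same as: int f(int n);  a function declaration
    return x + y + *p;   // 1 + 2 + 1
}
```

Some compilers warn about the redundant parentheses, but all four lines are valid declarations.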

Martin York
  • Can you elaborate on that? – Tanveer Badar Jul 17 '20 at 06:32
  • @TanveerBadar They did not design the language intentionally to be convoluted, it was supposed to be logical. Though because of the complexity of C++ in general a few things got past the original designer and we are now stuck with them. The "Most Vexing Parse" is just simply a name given to a common issue that was found after the language had been used for a while. We can not break backwards compatibility so we can't fix the language (not directly, though we have added to the language to make this less of an issue). – Martin York Jul 17 '20 at 06:44
2

Consider if the program were like so:

typedef struct A { int m; } A;
int main() { A a( A() ); }

This would be valid C, and there is only one possible interpretation allowed by the grammar of C: a is declared as a function. C only allows initialization using = (not parentheses), and does not allow A() to be interpreted as an expression. (Function-style casts are a C++-only feature.) This is not a "vexing parse" in C.

The grammar of C++ makes this example ambiguous, as Wikipedia points out. However, if you want C++ to give this program the same meaning as C, then, obviously, C++ compilers are going to have to interpret a as a function just like C compilers. Sure, C++ could have changed the meaning of this program, making a the definition of a variable of type A. However, incompatibilities with C were introduced into C++ only when there was a good reason to do it, and I would imagine that Stroustrup particularly wanted to avoid potentially silent breakages such as this, as they would cause great frustration for C users migrating to C++.

Thus, C++ interprets it as a function declaration too, and not a variable definition; and more generally, adopted the rule that if something that looks like a function-style cast can be interpreted as a declaration instead in its syntactic context, then it shall be. This eliminates potential for incompatibility with C for all vexing-parse situations, by ensuring that the interpretation that is not available in C (i.e. the one involving a function-style cast) is not taken.

Cfront 2.0 Selected Readings (page 1-42) mentions the C compatibility issue in the case of expression-declaration ambiguity, which is a related type of most vexing parse.
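The expression-declaration ambiguity can be sketched as follows. This is an illustrative example, not taken from the Cfront readings, and the helper name `statement_ambiguity_demo` is hypothetical. The statement `int(x);` could be a function-style cast of `x` whose result is discarded, but because it can also be parsed as a declaration, it is one:

```cpp
inline int statement_ambiguity_demo() {
    int x = 5;
    {
        int(x);   // declares a new x that shadows the outer one; it is not a cast
        x = 7;    // assigns to the inner x only
    }
    return x;     // the outer x is untouched
}
```

If `int(x);` were treated as a cast, the assignment would change the outer `x` and the function would return 7 instead of 5.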

Brian Bi
1

No particular reason, other than [possibly] the case that K-ballo identifies.

It's just legacy. There was already the int x; construction form so it never seemed like a reach to require T x; when no ctor args are in play.

In hindsight I'd imagine that if the language were designed from scratch today, then the MVP wouldn't exist... along with a ton of other C++ oddities.

Recall that C++ evolved over decades and, even now, is designed only by committee (see also: camel).

Lightness Races in Orbit