Pointer to array of unspecified size "(*p)[]" illegal in C++ but legal in C

Question

I just found out that this is illegal in C++ (but legal in C):

#include <stdio.h>
#include <stdlib.h>
#define ARRAY_LENGTH(A) (sizeof(A) / sizeof(A[0]))

int accumulate(int n, const int (*array)[])
{
    int i;
    int sum = 0;
    for (i = 0; i < n; ++i) {
        sum += (*array)[i];
    }
    return sum;
}

int main(void)
{
    int a[] = {3, 4, 2, 4, 6, 1, -40, 23, 35};
    printf("%d\n", accumulate(ARRAY_LENGTH(a), &a));
    return 0;
}

It compiles without problems using `gcc -std=c89 -pedantic` but fails to compile using `g++`. When I try to compile it using `g++`, I get these error messages:

main.cpp:5:37: error: parameter 'array' includes pointer to array of unknown bound 'int []'
 int accumulate(int n, int (*array)[])
                                     ^
main.cpp: In function 'int main()':
main.cpp:18:50: error: cannot convert 'int (*)[9]' to 'int (*)[]' for argument '2' to 'int accumulate(int, int (*)[])'
     printf("%d\n", accumulate(ARRAY_LENGTH(a), &a));

I have been using this in my C code for a long time and I had no idea that it was illegal in C++. To me this seems like a useful way to document that a function takes an array whose size is not known beforehand.

I want to know why this is legal C but invalid C++. I also wonder what made the C++ committee decide to take it away (and break this compatibility with C).

So why is this legal C code but illegal C++ code?

Did the version of C that existed when C++ split off have arrays of unspecified size? I think you had to declare them as pointers in those days, and being able to use `[]` was a later addition. — Barmar, Jan 12 '15 at 08:13
C++ was split from C89 and the example compiles without problems using `gcc -std=c89 -pedantic` so I do not think that it was a later addition. — wefwefa3, Jan 12 '15 at 08:15
Note that your code should work if you convert `n` into a template parameter (`template`) and use that in the array type (`int (*array)[n]`). Also note that it is even possible (and most of the time easier) to use a reference to array instead of pointer to array: `int (&array)[n]`. Then call it with `accumulate(&a)` (or `accumulate(a)` for the reference version) and let the compiler deduce `n` for you! ;) — leemes, Jan 12 '15 at 08:44
@juanchopanza Not quite. There is an "array of unknown length" type in C++, and you can get declarations of that type with `extern`. Given `extern int a[];`, `int (*b)[] = &a;` is perfectly valid. But it's such a rare corner case that functions taking arrays of unspecified length are almost certainly a mistake. As for `int (*array)[42]`, C does not allow that to point to anything other than an array of length 42 either. — , Jan 12 '15 at 10:36
@hvd If the last thing you said is true, then gcc and clang are not C standards compliant, even in pedantic mode (IIRC, they only emit warnings for code [like this one](http://ideone.com/Nmxb1D)) — juanchopanza, Jan 12 '15 at 10:44
@juanchopanza They indeed only emit a warning for that, and since the standard merely requires a diagnostic, the behaviour of the compilers is allowed by the standard. If you want to turn all standard-mandated diagnostics into an error, add `-pedantic-errors` to the command-line options. — , Jan 12 '15 at 10:48
@hvd Then I find it strange that both compilers emit an error in C++. — juanchopanza, Jan 12 '15 at 11:07
@juanchopanza In C++, it's more complicated to treat it as a warning, because the standard requires it to be treated as an error during template argument substitution, so it can occur in a valid program. — , Jan 12 '15 at 11:12
The normal way of specifying a 'pointer to an array of any size' as a function parameter is `accumulate(int n, int array[])`, which is legal (and has the desired effect) in both C and C++ — Chris Dodd, Jan 12 '15 at 16:29
@elias, could you explain _why_ you chose to let the function take a pointer to array as argument? As Chris Dodd commented above, it is unnecessary in this program. It is also rather unusual, and it can therefore be confusing to experienced programmers. — Thomas Padron-McCarthy, Jan 12 '15 at 21:30
@ChrisDodd: Except for clarity, you should write it as `accumulate(int n, int* array)` and probably have a `const` thrown on as well. — Ben Voigt, Jan 12 '15 at 22:55
@BenVoigt: Isn't clarity generally the most important thing? `const` depends on whether you are modifying the array or not. — Chris Dodd, Jan 14 '15 at 21:45
@ChrisDodd: Too bad English is ambiguous. Here's what I meant: "I agree with you except for one thing: `int array[]` is misleading. For improved clarity, you should write it as `accumulate(int n, int* array)`" And here the array is not being modified, which is why I also suggest `const`. — Ben Voigt, Jan 14 '15 at 21:55
@Barmar C's predecessor B, and "new B" which Ritchie retrospectively designated "Embryonic C", had *only* the bracket syntax `x[]` which implemented array *as* pointer. "Neonatal C" circa 1973 distinguished real but second-class array `x[]` which *converts* (in jargon "decays") to pointer `*x`, and relevant here is "rewritten" as function parameter, and this has remained unchanged since. See his HOPL2 paper at http://cm.bell-labs.com/who/dmr/chist.html . — dave_thompson_085, Jan 16 '15 at 22:34
@M.M You're right. Interestingly, https://www.bell-labs.com/usr/dmr/www/chist.html denigrates this syntax, describing it as a "living fossil" and saying that "it serves as much to confuse the learner as to alert the reader." — Barmar, Mar 24 '16 at 17:21

score 36 · Accepted Answer · edited Jan 12 '15 at 22:56

Dan Saks wrote about this in 1995, during the lead-up to C++ standardisation:

The committees decided that functions such as this, that accept a pointer or reference to an array with unknown bound, complicate declaration matching and overload resolution rules in C++. The committees agreed that, since such functions have little utility and are fairly uncommon, it would be simplest to just ban them. Hence, the C++ draft now states:

If the type of a parameter includes a type of the form pointer to array of unknown bound of T or reference to array of unknown bound of T, the program is ill-formed.

answered Jan 12 '15 at 08:16 by user2649908; edited Jan 12 '15 at 22:56 by Ben Voigt

The prohibition has been removed by [the resolution of CWG issue 393](http://www.open-std.org/JTC1/SC22/WG21/docs/cwg_defects.html#393), adopted at the last committee meeting. – T.C. Jan 13 '15 at 05:14

score 30 · Answer 2 · 2015-01-12T10:37:30.420

C++ doesn't have C's notion of "compatible type". In C, this is a perfectly valid redeclaration of a variable:

extern int (*a)[];
extern int (*a)[3];

In C, this is a perfectly valid redeclaration of the same function:

extern void f();
extern void f(int);

In C, this is implementation-specific, but typically a valid redeclaration of the same variable:

enum E { A, B, C };
extern enum E a;
extern unsigned int a;

C++ doesn't have any of that. In C++, two types are either the same or different, and if they are different, it matters very little how different they are.

Similarly,

int main() {
  const char array[] = "Hello";
  const char (*pointer)[] = &array;
}

is valid in C, but invalid in C++: `array`, despite the `[]`, is declared as an array of length 6. `pointer` is declared as a pointer to an array of unspecified length, which is a different type. There is no implicit conversion from `const char (*)[6]` to `const char (*)[]`.

Because of that, functions taking pointers to arrays of unspecified length are pretty much useless in C++, and almost certainly a mistake on the part of the programmer. If you start from a concrete array instance, you almost always have the size already, so you cannot take its address in order to pass it to your function, because you would have a type mismatch.

And there is no need for pointers to arrays of unspecified length in your example either: the normal way to write that in C, which happens to also be valid in C++, is

int accumulate(int n, int *array)
{
    int i;
    int sum = 0;
    for (i = 0; i < n; ++i) {
        sum += array[i];
    }
    return sum;
}

to be called as `accumulate(ARRAY_LENGTH(a), a)`.

There is an open [EWG issue](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4325.html#118) to permit the conversion. — T.C., Jan 13 '15 at 05:16
@T.C. Ah, that's nice to know. I guess that if it does become permitted, it'll probably only be permitted in one direction. An implicit conversion from `char(*)[6]` to `char(*)[]` is safe, but an implicit conversion from `char(*)[]` to `char(*)[6]` is not. Because in C, there isn't even any conversion (the types are simply compatible), you can write code like `int main() { int array[6]; int (*ptr1)[] = &array; int (*ptr2)[100] = ptr1; }` which typically does not even get any compiler warning, let alone an error. — , Jan 13 '15 at 07:56
