35

I just found out that this is illegal in C++ (but legal in C):

#include <stdio.h>
#include <stdlib.h>
#define ARRAY_LENGTH(A) (sizeof(A) / sizeof(A[0]))

int accumulate(int n, const int (*array)[])
{
    int i;
    int sum = 0;
    for (i = 0; i < n; ++i) {
        sum += (*array)[i];
    }
    return sum;
}

int main(void)
{
    int a[] = {3, 4, 2, 4, 6, 1, -40, 23, 35};
    printf("%d\n", accumulate(ARRAY_LENGTH(a), &a));
    return 0;
}

It compiles without problems using gcc -std=c89 -pedantic but fails to compile using g++. When I try to compile it using g++ I get these error messages:

main.cpp:5:37: error: parameter 'array' includes pointer to array of unknown bound 'int []'
 int accumulate(int n, int (*array)[])
                                     ^
main.cpp: In function 'int main()':
main.cpp:18:50: error: cannot convert 'int (*)[9]' to 'int (*)[]' for argument '2' to 'int accumulate(int, int (*)[])'
     printf("%d\n", accumulate(ARRAY_LENGTH(a), &a));

I have been using this in my C code for a long time and I had no idea that it was illegal in C++. To me this seems like a useful way to document that a function takes an array whose size is not known beforehand.

I want to know why this is legal C but invalid C++. I also wonder what made the C++ committee decide to take it away (and break this compatibility with C).

So why is this legal C code but illegal C++ code?

5gon12eder
wefwefa3
  • Did the version of C that existed when C++ split off have arrays of unspecified size? I think you had to declare them as pointers in those days, and being able to use `[]` was a later addition. – Barmar Jan 12 '15 at 08:13
  • C++ was split from C89 and the example compiles without problems using `gcc -std=c89 -pedantic` so I do not think that it was a later addition. – wefwefa3 Jan 12 '15 at 08:15
  • Note that your code should work if you convert `n` into a template parameter (`template`) and use that in the array type (`int (*array)[n]`). Also note that it is even possible (and most of the time easier) to use a reference to array instead of pointer to array: `int (&array)[n]`. Then call it with `accumulate(&a)` and let the compiler deduce `n` for you! ;) – leemes Jan 12 '15 at 08:44
  • @juanchopanza Not quite. There is an "array of unknown length" type in C++, and you can get declarations of that type with `extern`. Given `extern int a[];`, `int (*b)[] = &a;` is perfectly valid. But it's such a rare corner case that functions taking arrays of unspecified length are almost certainly a mistake. As for `int (*array)[42]`, C does not allow that to point to anything other than an array of length 42 either. –  Jan 12 '15 at 10:36
  • @hvd If the last thing you said is true, then gcc and clang are not C standards compliant, even in pedantic mode (IIRC, they only emit warnings for code [like this one](http://ideone.com/Nmxb1D)) – juanchopanza Jan 12 '15 at 10:44
  • @juanchopanza They indeed only emit a warning for that, and since the standard merely requires a diagnostic, the behaviour of the compilers is allowed by the standard. If you want to turn all standard-mandated diagnostics into an error, add `-pedantic-errors` to the command-line options. –  Jan 12 '15 at 10:48
  • @hvd Then I find it strange that both compilers emit an error in C++. – juanchopanza Jan 12 '15 at 11:07
  • @juanchopanza In C++, it's more complicated to treat it as a warning, because the standard requires it to be treated as an error during template argument substitution, so it can occur in a valid program. –  Jan 12 '15 at 11:12
  • The normal way of specifying a 'pointer to an array of any size' as a function parameter is `accumulate(int n, int array[])`, which is legal (and has the desired effect) in both C and C++ – Chris Dodd Jan 12 '15 at 16:29
  • @elias, could you explain _why_ you chose to let the function take a pointer to array as argument? As Chris Dodd commented above, it is unnecessary in this program. It is also rather unusual, and it can therefore be confusing to experienced programmers. – Thomas Padron-McCarthy Jan 12 '15 at 21:30
  • @ChrisDodd: Except for clarity, you should write it as `accumulate(int n, int* array)` and probably have a `const` thrown on as well. – Ben Voigt Jan 12 '15 at 22:55
  • @BenVoigt: Isn't clarity generally the most important thing? `const` depends on whether you are modifying the array or not. – Chris Dodd Jan 14 '15 at 21:45
  • @ChrisDodd: Too bad English is ambiguous. Here's what I meant: "I agree with you except for one thing: `int array[]` is misleading. For improved clarity, you should write it as `accumulate(int n, int* array)`" And here the array is not being modified, which is why I also suggest `const`. – Ben Voigt Jan 14 '15 at 21:55
  • I have added `const` to the function parameter now. – wefwefa3 Jan 15 '15 at 08:53
  • @Barmar C's predecessor B, and "new B" which Ritchie retrospectively designated "Embryonic C", had *only* the bracket syntax `x[]` which implemented array *as* pointer. "Neonatal C" circa 1973 distinguished real but second-class array `x[]` which *converts* (in jargon "decays") to pointer `*x`, and relevant here is "rewritten" as function parameter, and this has remained unchanged since. See his HOPL2 paper at http://cm.bell-labs.com/who/dmr/chist.html . – dave_thompson_085 Jan 16 '15 at 22:34
  • @Barmar C had this feature since K&R1, if not earlier – M.M Mar 24 '16 at 12:45
  • @M.M You're right. Interestingly, https://www.bell-labs.com/usr/dmr/www/chist.html denigrates this syntax, describing it as a "living fossil" and saying that "it serves as much to confuse the learner as to alert the reader." – Barmar Mar 24 '16 at 17:21

2 Answers

36

Dan Saks wrote about this in 1995, during the lead-up to C++ standardisation:

The committees decided that functions such as this, that accept a pointer or reference to an array with unknown bound, complicate declaration matching and overload resolution rules in C++. The committees agreed that, since such functions have little utility and are fairly uncommon, it would be simplest to just ban them. Hence, the C++ draft now states:

If the type of a parameter includes a type of the form pointer to array of unknown bound of T or reference to array of unknown bound of T, the program is ill-formed.

Ben Voigt
user2649908
  • The prohibition has been removed by [the resolution of CWG issue 393](http://www.open-std.org/JTC1/SC22/WG21/docs/cwg_defects.html#393), adopted at the last committee meeting. – T.C. Jan 13 '15 at 05:14
30

C++ doesn't have C's notion of "compatible type". In C, this is a perfectly valid redeclaration of a variable:

extern int (*a)[];
extern int (*a)[3];

In C, this is a perfectly valid redeclaration of the same function:

extern void f();
extern void f(int);

In C, this is implementation-defined, but typically a valid redeclaration of the same variable:

enum E { A, B, C };
extern enum E a;
extern unsigned int a;

C++ doesn't have any of that. In C++, two types are either the same or different, and if they are different, it matters very little how different they are.

Similarly,

int main() {
  const char array[] = "Hello";
  const char (*pointer)[] = &array;
}

is valid in C, but invalid in C++: `array`, despite the `[]`, is declared as an array of length 6, whereas `pointer` is declared as a pointer to an array of unspecified length, which is a different type. There is no implicit conversion from `const char (*)[6]` to `const char (*)[]`.

Because of that, functions taking pointers to arrays of unspecified length are pretty much useless in C++, and almost certainly a mistake on the part of the programmer. If you start from a concrete array instance, its type almost always includes the size already, so you cannot pass its address to such a function: the pointer types would not match.
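
As a rough illustration of the mismatch described above, the following C++ sketch checks at compile time that the two pointer types are distinct. The alias names `SizedPtr` and `UnsizedPtr` are hypothetical, introduced only for this example, and a C++11 or later compiler is assumed.

```cpp
#include <type_traits>

// Hypothetical aliases, introduced only for this illustration.
using SizedPtr   = const char (*)[6];  // pointer to an array of exactly 6 chars
using UnsizedPtr = const char (*)[];   // pointer to an array of unknown bound

// In C++ these are two distinct types, not "compatible" ones as in C.
static_assert(!std::is_same<SizedPtr, UnsizedPtr>::value,
              "pointer-to-array types with and without a bound differ");

// "Hello" needs 6 chars including the terminating '\0', so &array is a SizedPtr.
const char array[] = "Hello";
static_assert(std::is_same<decltype(&array), SizedPtr>::value,
              "the bound is part of the array's type");
```

Both assertions are evaluated by the compiler, so the sketch produces no output when run; it only shows that the bound is part of the type.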

And there is no need for pointers to arrays of unspecified length in your example either: the normal way to write that in C, which happens to also be valid in C++, is

int accumulate(int n, int *array)
{
    int i;
    int sum = 0;
    for (i = 0; i < n; ++i) {
        sum += array[i];
    }
    return sum;
}

to be called as accumulate(ARRAY_LENGTH(a), a).
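
For callers that only need C++, a comment on the question suggests letting the compiler deduce the length by taking a reference to the array through a template. A minimal sketch of that idea follows; the name `accumulate_ref` and the `std::size_t` template parameter `N` are assumptions made for this example and do not appear in the original code.

```cpp
#include <cstddef>

// Hypothetical C++-only variant of accumulate(): N is deduced from the
// argument's array type, so the caller does not pass a separate length.
template <std::size_t N>
int accumulate_ref(const int (&array)[N])
{
    int sum = 0;
    for (std::size_t i = 0; i < N; ++i) {
        sum += array[i];
    }
    return sum;
}
```

With the array `a` from the question, the call would be `accumulate_ref(a)`, with no `&` and no `ARRAY_LENGTH`. The trade-off is that a plain `int *` can no longer be passed, because the length cannot be deduced from a pointer.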

  • There is an open [EWG issue](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4325.html#118) to permit the conversion. – T.C. Jan 13 '15 at 05:16
  • @T.C. Ah, that's nice to know. I guess that if it does become permitted, it'll probably only be permitted in one direction. An implicit conversion from `char(*)[6]` to `char(*)[]` is safe, but an implicit conversion from `char(*)[]` to `char(*)[6]` is not. Because in C, there isn't even any conversion (the types are simply compatible), you can write code like `int main() { int array[6]; int (*ptr1)[] = &array; int (*ptr2)[100] = ptr1; }` which typically does not even get any compiler warning, let alone an error. –  Jan 13 '15 at 07:56