1

I am trying to figure out why the signatures of functions with multi-dimensional arrays as formal parameters leave the first dimension unsized, while the others must be sized. Actually, the answer to the second part of that statement is clear: without the dimension information, the compiler would not know how the multi-dimensional array is organized in memory. What bothers me is the inconsistency.

Why not require all dimensions to be explicitly specified (including the first dimension)?

I have come up with a theory and I want to see if that is correct or not.

The most common usage of an array is a 1D array. Given that the array name is a pointer to the first element of the array, Dennis Ritchie wanted to have the exact same signature for 1D arrays, whether the array syntax was used, or the pointer syntax was used.

With the pointer syntax it was impossible to know how many elements to process in the function, without specifying the size information as well. So Dennis Ritchie forced the same signature for array syntax also by having the array be unsized (and even have the compiler completely ignore the size information for the first dimension, if it is provided). In other words you will have one formal parameter be a pointer, or an unsized array, and the second formal parameter the array size.

Eric Postpischil
Sandeep
  • K&R "C" was supposed to give low level access, close to the hardware. So the syntax really shows the "HOW" of things. You were supposed to know what you were doing. Nowadays it's more about abstractions and we've got things like std::vector, std::array or our own wrappers around multidimensional arrays built on top of those lower level constructs. – Pepijn Kramer Sep 20 '21 at 06:39
  • _"With the pointer syntax it was impossible to know how many elements to process in the function"_ Obviously this does not apply to null-terminated C strings. – Sandeep Sep 20 '21 at 06:40
  • In C++ you _can_ require all dimensions to be explicitly specified if you don't let the array decay into a pointer. Btw, did you read [What is array to pointer decay?](https://stackoverflow.com/questions/1461432/what-is-array-to-pointer-decay) – Ted Lyngmo Sep 20 '21 at 06:40
  • @Sandeep that's just a convention :) – Pepijn Kramer Sep 20 '21 at 06:40
  • A couple of observations: (1) In K&R C, you couldn't specify a variable dimension size. If you specified the size, it had to be a constant. And (2) You don't always know the size. For example, consider `strlen`. That takes a `char *`, which is equivalent to `char []`. You don't know the size at the time of the call, so requiring it wouldn't make sense. – Tom Karzes Sep 20 '21 at 06:50
  • If the dimension should be part of the function signature in all cases, a lot of functions would be impossible to write, e.g. `memcpy`, `strcpy`, etc... – Support Ukraine Sep 20 '21 at 06:53
  • https://www.bell-labs.com/usr/dmr/www/chist.html is a good read, tl;dr array types did not exist in B, arrays created cells initialized with a pointer to the first element in the array. In B, you could assign to an array - it would only change the pointer. This semantic of array being a pointer to the first element survived in C. – KamilCuk Sep 20 '21 at 07:28

3 Answers

3

When you pass an array, it decays to a pointer to the first element. This is for convenience. Without it, passing a string literal to a function like this:

void foo(const char* str) {}

would be cumbersome:

foo("Hello world");     // the const char[12] decays into ...
foo(&"Hello world"[0]); // what would have to be written like this without the decay

If you want all dimensions to be specified, just take the address of the array and you'll get a pointer to the whole array, with all dimensions specified.

Example. This function takes a pointer to an int[2][10]:

void f2d2(int (*)[2][10]) {}
int a2d[2][10];
f2d2(&a2d);      // and you call it like this
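
As a rough C sketch of what the pointer-to-whole-array form buys you: because both dimensions are part of the parameter type, `sizeof` on the pointee sees the full array. The function names `count_elements` and `fill` are made up for illustration.

```c
#include <stddef.h>

/* Takes a pointer to a whole int[2][10]; both dimensions are part of the type. */
size_t count_elements(int (*grid)[2][10])
{
    /* *grid has type int[2][10], so sizeof sees the complete array. */
    return sizeof *grid / sizeof (*grid)[0][0];
}

/* Writes `value` into every element through the pointer-to-array. */
void fill(int (*grid)[2][10], int value)
{
    for (size_t row = 0; row < 2; ++row)
        for (size_t col = 0; col < 10; ++col)
            (*grid)[row][col] = value;
}
```

Both functions are called with `&a2d`, exactly like `f2d2` above. Passing the address of an array with different dimensions should be diagnosed by the compiler as an incompatible pointer type.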

In C++ you can prevent the array decaying into a pointer to the first element by taking the array by reference.

void f2d2(int (&)[2][10]) {}

int a2d[2][10];
f2d2(a2d);      // no decay
Ted Lyngmo
  • it's not exactly convenient, pointers to arrays would suffice to avoid copying. The true reason was that the C designers wanted to re-use as much code written in the **B** programming language (predecessor of C) as possible. And there were no arrays in B while there were pointers. So they invented this "array decay" trick, which is now confusing a third generation of programmers. – tstanisl Sep 20 '21 at 08:08
  • @tstanisl You still have pointers to arrays with `&arr`. The convenient part is to get a pointer to the first element when a function can handle arrays with different extent without having to do `&arr[0]`. That option still works though, just as `foo(&"Hello world"[0]);` does. – Ted Lyngmo Sep 20 '21 at 08:16
  • there is a decay in `&"Hello world"[0]`: `"Hello world"` is `char[12]` (`const` in C++) and it **decays** to `char*`, which is dereferenced by `[]` to yield a `char`, and `&` then turns that into a `char*` – tstanisl Sep 20 '21 at 08:24
  • @tstanisl The point I was trying to make is that _without_ decay that syntax would be the way to get a pointer to the first element - and since a pointer to the first element is common, it's convenient to have a shorter way to "get it for free". – Ted Lyngmo Sep 20 '21 at 08:32
1

There are two quotes from the C Standard that make the use of arrays clearer.

The first one is (6.3.2.1 Lvalues, arrays, and function designators)

3 Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

And the second one is (6.7.6.3 Function declarators (including prototypes))

7 A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to type’’, where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation. If the keyword static also appears within the [ and ] of the array type derivation, then for each call to the function, the value of the corresponding actual argument shall provide access to the first element of an array with at least as many elements as specified by the size expression.

What does this mean for function declarations?

If you declare a function like this, for example

void f( int a[100] );

then the compiler will adjust the function parameter in the following way

void f( int *a );

So for example these function declarations are equivalent

void f( int a[100] );
void f( int a[10] );
void f( int a[1] );
void f( int a[] );
void f( int *a );

and declare the same function. You may even include all of these declarations in your program, though the compiler may issue a message that there are redundant declarations.

Within the function the variable a has the pointer type int *.

On the other hand, you may call the function passing arrays of different sizes. Array designators will be implicitly converted by the compiler to pointers to their first element. You may even pass a scalar object through a pointer to it.
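
One way to see this is that the parameter can be incremented, which is only legal for a pointer. The following is a small sketch; the name `second_element` is invented for illustration, and the `[2]` in the parameter is deliberately arbitrary since the compiler discards it.

```c
/* Declared with array syntax, but `a` is adjusted to `int *`. */
int second_element(int a[2])
{
    ++a;        /* allowed because `a` is a pointer; a true array could not be incremented */
    return *a;  /* now refers to what the caller sees as element 1 */
}
```

The same function accepts an `int[4]`, an `int[100]`, or any other `int` array with at least two elements.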

So these calls of the function are all correct

int a[100];
f( a );

int a[10];
f( a );

int a[1];
f( a );

int a;
f( &a );

As a result, the function has no information about which array was used as an argument. So you need to declare a second function parameter that specifies the size of the passed array (unless the function relies on a sentinel value present in the array).

For example

void f( int a[], size_t n );
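
A minimal sketch of both conventions follows. The names `sum_n` and `length_until_zero` are hypothetical, and treating `0` as the sentinel is just an assumption made for the example.

```c
#include <stddef.h>

/* Explicit size: the caller says how many elements to process. */
long sum_n(const int a[], size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i)
        total += a[i];
    return total;
}

/* Sentinel: counts elements up to (not including) the first 0. */
size_t length_until_zero(const int a[])
{
    size_t n = 0;
    while (a[n] != 0)
        ++n;
    return n;
}
```

The sentinel version is the same idea `strlen` uses for strings: the array itself carries the end marker, so no size parameter is needed.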

If you have a multidimensional array like this

T a[N1][N2][N3]...[Nn];

then a pointer to its first element will have the type

T ( *a )[N2][N3]...[Nn];

So a function declared like

void f( T a[N1][N2][N3]...[Nn] );

is equivalent to

void f( T ( *a )[N2][N3]...[Nn] );
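
For a concrete instance, take T as `int`, N1 as 3 and N2 as 4. In this sketch (the names `last_element` and `as_pointer_form` are made up), a function defined with the array form is assigned to a function pointer written with the pointer form, which only compiles cleanly because the two types are the same.

```c
/* Defined with array syntax: the parameter is adjusted to int (*)[4]. */
int last_element(int a[3][4])
{
    return a[2][3];
}

/* A function pointer spelled with the pointer form accepts it unchanged. */
int (*as_pointer_form)(int (*)[4]) = last_element;
```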

If N2, N3, ..., Nn are integer constant expressions, then the array is not a variable length array. Otherwise it is a variable length array, and the function may be declared like this (when the declaration is not part of a function definition)

void f( T ( *a )[*][*]...[*] );

For such a function there is also the problem of determining the sizes of the sub-arrays. So you need to declare parameters that specify the sizes.

For example

void f( size_t n1, size_t n2, size_t n3, ..., size_t nn, T a[][n2][n3]...[nn] );
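
A two-dimensional sketch of that pattern, assuming a C99 compiler (or a later one that still supports variable length arrays); `sum_matrix` is a hypothetical name.

```c
#include <stddef.h>

/* cols must be declared before the array parameter so its type can refer to it.
   The parameter a[][cols] is adjusted to int (*a)[cols]. */
long sum_matrix(size_t rows, size_t cols, int a[][cols])
{
    long total = 0;
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            total += a[i][j];
    return total;
}
```

Note that `rows` is needed only for the loop bound; the parameter type itself depends on `cols` alone, which mirrors why the first dimension can be left empty.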

As for C++, variable length arrays are not a standard C++ feature. Also, you can declare a function parameter as having a reference type.

For example

void f( int ( &a )[10] );

Within the function, a is in this case not a pointer. It denotes an array, and the expression sizeof( a ) will yield the size of the whole array instead of the size of a pointer.

Vlad from Moscow
  • there is a subtle difference between an `array`-like parameter and a pointer parameter. A pointer can point to incomplete type. So `int foo(int (*x)[])` is valid while `int foo(int x[][])` is not – tstanisl Sep 20 '21 at 08:48
  • @tstanisl A good remark. The problem is that you may not dereference such a pointer. – Vlad from Moscow Sep 20 '21 at 08:49
0

It's not possible to pass arrays by value in C. You can only pass a pointer, and then in the function, index from the pointer to access the contents of the array.

Accordingly, in a function declaration, if you specify a parameter with array type, that parameter is adjusted to pointer type before any further analysis. For example:

  • You write: void f( int x[][5] );
  • The compiler sees: void f( int (*x)[5] );

The language would be equally functional if the first form were outlawed, i.e. you had to define the parameter with the pointer version. It is just syntactic sugar -- which as it turns out, causes a lot of confusion to newbies but that's another story.

So to answer your question -- all array dimensions after adjustment must be specified, because otherwise the compiler doesn't know how to index the memory when accessing the array via the pointer argument passed. And it should now be clear that the faux first "dimension" is irrelevant because it is syntactic sugar that is immediately discarded.
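
To make the indexing point concrete, here is a small sketch using the `int (*x)[5]` parameter from above. The function names are invented for illustration. The row length 5 is what tells the compiler how far `x + i` must move, while no first dimension is consulted anywhere.

```c
#include <stddef.h>

/* Ordinary subscripting through the adjusted pointer parameter. */
int by_subscript(int (*x)[5], size_t i, size_t j)
{
    return x[i][j];
}

/* The same access spelled out: x + i skips i whole rows of 5 ints,
   then + j moves j ints within that row. */
int by_steps(int (*x)[5], size_t i, size_t j)
{
    return *(*(x + i) + j);
}

/* The size of one row, i.e. the stride used by x + 1. */
size_t row_stride(int (*x)[5])
{
    return sizeof *x;
}
```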

M.M
  • `It is just syntactic sugar` Do you think allowing that syntax (`int f()[5];` or `void f(int [5]);` would _copy_ the whole array) and making arrays an lvalue (`int a[5]; int b[5]; a = b;`), would break any part of the language? I mean, it would not break existing programs, because the syntax is just invalid. – KamilCuk Sep 20 '21 at 07:30
  • @KamilCuk: At least in C++, with templates, that change might change the meaning of existing programs (probably a bad example, but with `template <typename T> std::size_t foo(T) { return sizeof(T); }` you currently get the size of the pointer, and you propose the size of the array instead). – Jarod42 Sep 20 '21 at 08:31