
Declare an array in your include file omitting the first dimension size:

extern float mvp[][4];

Then define the array following the previous declaration in a translation unit:

float mvp[4][4];

No problem. Until you try to get the size of that array in a file which includes the first declaration. Then you would get:

error: invalid application of 'sizeof' to an incomplete type 'float [][4]'

I understand that arrays decays into pointers to their first element when used as lvalue, and that array declarations in function prototypes are actually pointers in disguise, but that is not the case here. The first declaration does not declare a pointer; it declares an "incomplete array type", which is different from:

extern float (*mvp)[4];

When declaring variables, the compiler just references a "dummy" base address offset and the associated type, which the linker will resolve.

I wonder why this "incomplete array type", which cannot be incremented like a pointer to array but is also not fully an array since its size cannot be retrieved, is allowed to exist?

Why not implicitly convert it to a pointer (just a base address offset) or, even better, why not throw an error for omitting the size of the first dimension?


Quoting this

If expression in an array declarator is omitted, it declares an array of unknown size. Except in function parameter lists (where such arrays are transformed to pointers) and when an initializer is available, such type is an incomplete type (note that VLA of unspecified size, declared with * as the size, is a complete type)

So really, the type is incomplete and waiting to be completed by a later declaration or tentative definition.
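As a minimal sketch of that completion (assuming a C11 compiler, with both declarations placed in one translation unit only for illustration; `mvp_row_count` is a hypothetical helper that is not part of the original code):

```c
#include <assert.h>
#include <stddef.h>

/* What the header says: an array whose outer size is unknown (incomplete type). */
extern float mvp[][4];

/* sizeof mvp would be rejected at this point, because the type is incomplete.
   The element type float[4] is complete, so the size of one row is known. */
static_assert(sizeof mvp[0] == 4 * sizeof(float), "row type is complete");

/* A later declaration in the same translation unit completes the type. */
float mvp[4][4];

/* From here on, sizeof sees the complete type float[4][4]. */
static_assert(sizeof mvp == 16 * sizeof(float), "type completed above");

/* Hypothetical helper: row count computed from the completed type. */
size_t mvp_row_count(void)
{
    return sizeof mvp / sizeof mvp[0];
}
```

A translation unit that only ever sees the first declaration never reaches the completed type, which is why `sizeof mvp` fails there.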

explogx
  • "I understand that arrays decays into pointers to their first element when used as lvalue" - I don't think so. You probably meant they decay to pointers when passed as a function argument? – Michael Beer Dec 24 '18 at 19:12
  • When you use the name of an array as an lvalue, that means, you want to reference the memory storage and not the type (and you don't want to create an rvalue either) then the **array will decay into a pointer to its first element**. – explogx Dec 24 '18 at 19:15
  • @MichaelBeer: They decay to pointers in almost any situation (though not quite as the question describes). You're thinking of a different mechanism, where function parameters declared with array type are automatically defined as being of pointer type instead. – user2357112 Dec 24 '18 at 19:16
  • @user2357112 `int b; int a[]; a = &b;` - `a` is used as an lvalue here, but a conforming compiler should not allow this, imho. – Michael Beer Dec 24 '18 at 19:25
  • @MichaelBeer: Yeah, that's invalid. I'm thinking of the mechanism that makes arrays decay to pointers when you do something like `pointer = array` or `array[5]` (yes, that involves decay), while you're thinking of the mechanism that converts `int foo(int arg[5])` to `int foo(int *arg)`. – user2357112 Dec 24 '18 at 19:28
  • The conditions in which array decay happens are described in the [standard](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf) as follows: "Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue." It's not about the array being "used as lvalue". – user2357112 Dec 24 '18 at 19:31
  • @Prion I don't think this holds in general: `int a[] = {1,2}; a = (int []){3, 2};` should be proper C11, shouldn't it? And yet the array still does not decay... – Michael Beer Dec 24 '18 at 19:37
  • In the case of `int a[5]; a = &foo;`, 'a' is not a modifiable l-value and that is exactly the error you get. From the language spec: `A modifiable lvalue is an lvalue that does not have array type, ...`. Also https://stackoverflow.com/questions/45656162/why-cant-a-modifiable-lvalue-have-an-array-type – fdk1342 Dec 24 '18 at 19:39
  • The *only* case where an array expression *is* an lvalue in a conforming program is as the operand of `&`, because `&`, `_Alignof` and `sizeof` are the only cases where it does not go through value conversion, and of these `&` is the one that really cares about lvalues. – Antti Haapala -- Слава Україні Dec 24 '18 at 19:53
  • @MichaelBeer `int a[] = {1,2}; a = (int []){3, 2};` isn't proper `c` because it's not a modifiable lvalue and on `gcc -std=c11` returns a `error: assignment to expression with array type`. – fdk1342 Dec 24 '18 at 19:57
  • @Fred You are right indeed... – Michael Beer Dec 24 '18 at 20:04
  • “A modifiable lvalue is an lvalue that **does not have array type**, […]” *From the C standard* – explogx Dec 24 '18 at 20:20
  • “Except when it is the operand of `sizeof` or `&` operators, […] the type "array of type" is implicitly converted to the type "pointer to type" that points to the initial element of the array object and **that is not an lvalue**.” *From the C standard* – explogx Dec 24 '18 at 20:20

2 Answers


Using extern doesn't make things exist; it is just used to state that something may exist in a different translation unit. sizeof can only be applied to complete types. This has nothing to do with array-to-pointer decay. extern float (*mvp)[4] is a complete type: it is a pointer to an array of 4 floats. extern float mvp[][4] is incomplete: it is a 2D array of floats where the outer dimension is unspecified. These are two very different things. In either case mvp can be indexed like an array, using the correct syntax, but you can only use sizeof if the compiler can actually determine the size.

Also, float mvp[][4] is an array; it's just that its size is indeterminate. What makes it an array is that its memory is laid out like an array.
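To make the contrast concrete, here is a small sketch under a few assumptions: the names `grid`, `grid_ptr`, `via_array` and `via_pointer` are invented for illustration, and the definitions are placed in the same file only so that the example links on its own.

```c
#include <assert.h>
#include <stddef.h>

extern float grid[][4];        /* incomplete array type: outer size unknown */
extern float (*grid_ptr)[4];   /* complete type: pointer to an array of 4 floats */

/* A pointer has a known size regardless of what it points to. */
static_assert(sizeof grid_ptr == sizeof(float (*)[4]), "pointer type is complete");

/* Indexing works with both declarations, because the row type float[4] is known. */
float via_array(size_t r, size_t c)   { return grid[r][c]; }
float via_pointer(size_t r, size_t c) { return grid_ptr[r][c]; }

/* Definitions; in a real project these would live in another translation unit. */
float grid[2][4] = { {0, 1, 2, 3}, {4, 5, 6, 7} };
float (*grid_ptr)[4] = grid;   /* the array decays to a pointer to its first row */
```

Note that `grid_ptr` is a separate object that stores an address, whereas `grid` is the storage itself; the two declarations are not interchangeable for the same definition.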

fdk1342

It is possible to declare all dimensions of the extern array:

extern float mvp[4][4];

It is just an option to leave the external declaration incomplete and let the definition worry about the dimension. It is useful exactly because the size is not part of its external interface! Should the outermost size change from one compilation to another, a translation unit that merely uses the object need not be recompiled.

For this to work, there should probably be a sentinel value that ends the array / a variable that would tell how many elements there are, otherwise it is not very useful.
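One way this could look is sketched below. The names `table`, `table_rows` and `table_sum` are hypothetical, and the three parts are shown in a single file only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* ---- what the header would expose (illustrative names) ---- */
extern float table[][4];          /* outer dimension deliberately omitted */
extern const size_t table_rows;   /* row count published as data instead */

/* ---- a user of the header: it never needs sizeof(table) ---- */
float table_sum(void)
{
    float sum = 0.0f;
    for (size_t r = 0; r < table_rows; ++r)
        for (size_t c = 0; c < 4; ++c)
            sum += table[r][c];
    return sum;
}

/* ---- the defining translation unit: only this part knows the size ---- */
float table[3][4] = {
    {1, 1, 1, 1},
    {2, 2, 2, 2},
    {3, 3, 3, 3},
};
const size_t table_rows = sizeof table / sizeof table[0];
```

If a row is later added to the definition, `table_rows` changes with it and the code that uses the header does not have to be recompiled.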


Why not implicitly convert it to a pointer (just a base address offset) or even better, why not throw an error for omitting the size in the first dimension?

It cannot be converted to a pointer because the declaration is not a definition. It just tells that such an object does exist. The definition of that object exists independent of the external declaration. The actual object that is being declared here is an array, not a pointer.

It is just that in case of arrays the external declaration can declare the outermost dimension or can omit it.


As for the claim that

arrays decays into pointers to their first element when used as lvalue

that is quite wrong. An array expression is an lvalue, and when it decays it is no longer an lvalue - the only case where it stays as an lvalue is as the operand of &.
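A short sketch of those cases, assuming C11; the names `m`, `whole`, `first_row` and `same_address` are made up for this example:

```c
#include <assert.h>

float m[4][4];

/* No decay under sizeof: the operand keeps its array type. */
static_assert(sizeof m == sizeof(float[4][4]), "sizeof sees the whole array");

/* No decay under unary &: the result is a pointer to the whole array. */
float (*whole)[4][4] = &m;

/* In other expressions m is converted to a pointer to its first row. */
float (*first_row)[4] = m;

/* m = first_row;  would be rejected: an array is not a modifiable lvalue. */

/* Hypothetical check: the two pointers hold the same address but differ in type. */
int same_address(void)
{
    return (void *)whole == (void *)first_row;
}
```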

  • Then why not implicitly convert the incomplete type into a pointer, which is basically what it is: a base address pointing to the beginning of the array memory area. This incomplete array type is strange, as it behaves like an array type in the sense that you can't increment it, but you can't retrieve its size either. So why not just convert it into a pointer? – explogx Dec 24 '18 at 19:00
  • @Prion: Among other possible reasons, code compiled expecting `mvp` to be a pointer isn't compatible with a definition that defines `mvp` as an array. – user2357112 Dec 24 '18 at 19:12
  • I know. Because the types are not the same, while a pointer to an array of N elements is a complete type, the "incomplete" array type is not. The incomplete array type is just a way of telling that a complete array type exists somewhere and thus must be resolved by the linker. – explogx Dec 24 '18 at 19:19
  • @Prion A pointer to an object, even an array of indeterminate size, is a complete type. That's because the size of a pointer is always the same regardless of what it points to. What is it you expect `sizeof` to do in this situation? It seems you want it to turn into a pointer or something and report the wrong size. The only time `sizeof` can report the size of an array is in the translation unit that defines that array. – fdk1342 Dec 24 '18 at 20:13
  • @Fred "The only time sizeof can report the size of an array is in the translation unit that defines that array." - that's not correct either, it is enough to declare the array with a complete type – Antti Haapala -- Слава Україні Dec 24 '18 at 20:19
  • @AnttiHaapala You are correct that using `extern int a[5][5]` will have `sizeof(a)` return a correct value. What I was trying to say is that `sizeof` can't discern the proper size of an external array in the way @prion is suggesting it should be able to by converting it to a pointer. It's like trying to say `sizeof(int *)` should return the same size as `sizeof(int [5][5])` because arrays decay into pointers. – fdk1342 Dec 24 '18 at 20:29
  • A pointer to an array of indeterminate size (ie. a pointer to an incomplete array type, for instance `int (*)[]`) is a complete type because the pointer is just 8 bytes of memory referencing a memory address. You can dereference a pointer to an incomplete array type at index zero and it will give you the base address of the memory area containing this array of indeterminate size. But if you want to access an index different from zero, then you must cast into a complete type to give the ability to the compiler to compute the correct offset. – explogx Dec 24 '18 at 20:56
  • Also, I can't see any reasons to use a pointer to an array of indeterminate size only to use the zero index, since dereferencing the pointer will give you an array of indeterminate size, **that is not an lvalue** thus not assignable, and that will decay into a pointer to its first element. So why use `int (*)[]` and not `int*` ? – explogx Dec 24 '18 at 21:08
  • @Prion No, This isn't true. If you have a pointer to an array of indeterminate size to access a location of that array you dereference the pointer and then index the array: `(*ptr)[5]`; There isn't a whole lot of reason that I know of to use `int (*ptr)[]` but it certainly is permissible in the language. Of course there is good reason to use arrays of indeterminate size, `argv` is a good example. – fdk1342 Dec 24 '18 at 22:41
  • `char *argv[]` is not an array of indeterminate size since it decays into a pointer to pointer to character. – explogx Dec 24 '18 at 22:59
  • @Prion Then why does `argc` exist? It's to tell you the number of elements in the array of `argv`. So by definition it is an array of indeterminate size: you don't know the size of the array (neither the number of elements nor the overall size in bytes)! And if you use `sizeof(argv)` the compiler may generate a warning that `sizeof` doesn't return the size of the array but the size of a pointer. This really doesn't belong in comments any more; perhaps it belongs in chat. Here is some additional reading https://stackoverflow.com/questions/1461432/what-is-array-decaying – fdk1342 Dec 24 '18 at 23:17
  • I know. I quoted the C standard in a comment below my question. – explogx Dec 24 '18 at 23:19
  • @Prion: Addressing some of the things in the comments: “why not implicitly convert the incomplete type into a pointer”: When `int mvp[][4]` is an array (of size not known in the current translation unit), then, for expressions using it, the compiler merely has to generate a reference to `mvp` and let the linker fill it in. If `int mvp[][4]` were a pointer, the compiler would have to reserve space for a pointer. And then it would have to have the linker fill that in, and it would have to load the address from that space instead of letting the linker fill in the references. – Eric Postpischil Dec 24 '18 at 23:37
  • @Prion: Furthermore, while `mvp` would end up being a pointer in most expressions, so `mvp[i]` would work either way, the types of `&mvp` would be different. One would be a pointer to an array, the other would be a pointer to a pointer. C semantics would change. Also, there might be an initial declaration of `mvp[][4]` in a translation unit followed by a later `mvp[5][4]` that resolves the incompleteness. If `mvp[][4]` were a pointer, this would not be possible. – Eric Postpischil Dec 24 '18 at 23:38
  • @Prion: Re “But if you want to access an index different from zero, then you must cast into a complete type to give the ability to the compiler to compute the correct offset”: Given `int mvp[][4];`, both `mvp[0][4]` and `mvp[1][4]` are valid expressions. Since the size of the array of `int` is known to be 4 elements, the information needed for indexing or pointer arithmetic is present. It is the size of the array of arrays that is not known, so you could not do `&mvp+1`. Even `&mvp+0` would not be allowed. – Eric Postpischil Dec 24 '18 at 23:42
  • @Prion you cannot access zero with indexing either... https://stackoverflow.com/a/53920370/918959 – Antti Haapala -- Слава Україні Dec 25 '18 at 07:46