What does prevent a compiler from deducing the size of column?

Question

Why do we need the column size for

```
int arr[][] = { {1,2,3},{1,3,5} }; // int arr[][3]
```

In this particular example, why can't the compiler deduce from the data it has that it must pack exactly 3 elements into each row? Why is the compiler restricted from doing so?

I can understand if it was

```
int arr[][] = { 1,2,3,1,3,5 };
```

then the compiler has no information of how much data has to be packed into each row.

I read a similar question, "Why do we need to specify the column size when passing a 2D array as a parameter?", but it doesn't contain the answer.

Edit: To avoid confusion, I am talking about the exact format of data mentioned above.

C and C++ have some things in common, but they are two different languages with different rules when it comes to details. Please pick one. — 463035818_is_not_an_ai, Aug 07 '21 at 15:55
Finally, somebody decided what the compiler has to be able to do and what not. There will often be something left where you may think: Actually, the compiler could find out this as well - and if at least only under certain conditions... -- If the other (linked) Q/A wasn't sufficient to convince you, I'm uncertain what to add. — Scheff's Cat, Aug 07 '21 at 15:55
@463035818_is_not_a_number: Did as you said, but does this question make any difference between the two languages? — Sreeraj Chundayil, Aug 07 '21 at 15:55
I am not sure if there is a difference with respect to that. But as soon as someone quotes from the standard it is either one or the other, unless they do the extra work of checking both. — 463035818_is_not_an_ai, Aug 07 '21 at 15:58
_What is special about int a[] = {1,2,3,4}_ -> That `int` determines exactly the size of an element. What size has an `int[]`? — Scheff's Cat, Aug 07 '21 at 15:58
Suppose you write: `int arr[][] = { {1}, {1,3,5} };`. What is the compiler supposed to do then? — Paul Sanders, Aug 07 '21 at 15:59
@PaulSanders The OP carefully used an initializer where all "sub-initializers" have equal length. He denoted that the compiler could do it in this special case... :-) — Scheff's Cat, Aug 07 '21 at 16:00
@PaulSanders: Could have many possibilities right now. But if standard can say it has to be like this then there shouldn't be any problem. — Sreeraj Chundayil, Aug 07 '21 at 16:01
@InQusitive The reason the compiler doesn't deduce the column size is because the C language was designed for simple **single-pass** compilers that had to work on machines with **extremely limited memory**. If the compiler were to deduce the column size, it would either need two passes, or it would need to store the entire array in memory. — user3386109, Aug 07 '21 at 17:34
@user3386109: Why two passes? I have just started learning about compilers, so I may not be able to understand it. BTW, this is the only relevant comment I found here. — Sreeraj Chundayil, Aug 07 '21 at 17:38
@InQusitive I guess I'm assuming that a compiler that deduces the array size would be expected to deduce the size of `int arr[][] = { {1}, {1,3,5} };` as `arr[2][3]`. So it would need to read the entire initializer before deducing the size. In that case, it either A) needs to store the array values in memory while deducing the size, or B) needs to use one pass to deduce the size, and a second pass to copy the initializers into the executable file. — user3386109, Aug 07 '21 at 18:06

273K · Answer 1 · 2021-08-07T17:48:11.317

Because you may want the array to be larger than the size that would be deduced. For example:
```
int arr[5] = {1, 2, 3};  // Last two elements are zero.
```
If you declare the array like
```
int arr[][] = { {1, 2, 3}, {1, 3, 5, 7} };
```
what should the compiler do: report an error because it expects int arr[][3], or make the array int arr[][4]? That decision is left to the human.

2-D arrays are flat in memory. For example, int arr[3][3] and int arr[9] have the same storage layout. Thus both arrays may be initialized from a single flat initializer list, which can be thought of as placing the list's values directly into flat memory:

```
#include <stdio.h>
#include <string.h>  // memcmp

int main(void) {
  int arr2d[3][3] = {0, 1, 2, 3, 4, 5, 6, 7, 8};
  int arr1d[9] = {0, 1, 2, 3, 4, 5, 6, 7, 8};
  // Both objects occupy the same number of bytes...
  printf("size: %d\n", sizeof(arr2d) == sizeof(arr1d));
  // ...and hold the same bytes in the same order.
  printf("memcmp: %d\n", memcmp(arr2d, arr1d, sizeof(arr2d)));
  return 0;
}
// size: 1
// memcmp: 0
```

Extending the above, all 4 functions below declare and initialize the same array int arr[3][3] in different ways:

```
#include <stdio.h>

// Both dimensions given, nested initializer lists.
void arr2_list2(void) {
  int arr[3][3] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}};
  printf("int arr[3][3] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}}:\n");
  for (int i = 0; i < 3; ++i)
    printf("%d %d %d\n", arr[i][0], arr[i][1], arr[i][2]);
}

// Both dimensions given, one flat initializer list.
void arr2_list1(void) {
  int arr[3][3] = {0, 1, 2, 3, 4, 5, 6, 7, 8};
  printf("int arr[3][3] = {0, 1, 2, 3, 4, 5, 6, 7, 8}:\n");
  for (int i = 0; i < 3; ++i)
    printf("%d %d %d\n", arr[i][0], arr[i][1], arr[i][2]);
}

// Row count deduced, nested initializer lists.
void arr2open_list2(void) {
  int arr[][3] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}};
  printf("int arr[][3] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}}:\n");
  for (int i = 0; i < 3; ++i)
    printf("%d %d %d\n", arr[i][0], arr[i][1], arr[i][2]);
}

// Row count deduced, one flat initializer list.
void arr2open_list1(void) {
  int arr[][3] = {0, 1, 2, 3, 4, 5, 6, 7, 8};
  printf("int arr[][3] = {0, 1, 2, 3, 4, 5, 6, 7, 8}:\n");
  for (int i = 0; i < 3; ++i)
    printf("%d %d %d\n", arr[i][0], arr[i][1], arr[i][2]);
}

int main(void) {
  arr2_list2();
  arr2_list1();
  arr2open_list2();
  arr2open_list1();
  return 0;
}
// int arr[3][3] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}}:
// 0 1 2
// 3 4 5
// 6 7 8
// int arr[3][3] = {0, 1, 2, 3, 4, 5, 6, 7, 8}:
// 0 1 2
// 3 4 5
// 6 7 8
// int arr[][3] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}}:
// 0 1 2
// 3 4 5
// 6 7 8
// int arr[][3] = {0, 1, 2, 3, 4, 5, 6, 7, 8}:
// 0 1 2
// 3 4 5
// 6 7 8
```

Imagine that int arr[][] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}}; were allowed. Then why would int arr[][] = {0, 1, 2, 3, 4, 5, 6, 7, 8}; not be allowed as well? But how could the compiler decide what a human means by the statement

int arr[][] = {0, 1, 2, 3, 4, 5, 6, 7, 8};

int arr[1][9], or int arr[3][3], or int arr[9][1]?

Support Ukraine · Accepted Answer · 2021-08-07T17:27:32.433

4

What does prevent a compiler from deducing the size of column?

The C standard.

From the standard:

6.7.9 Initialization

The type of the entity to be initialized shall be an array of unknown size or a complete object type that is not a variable length array type.

So you can initialize "an array of unknown size" but you can't initialize "an array of unknown size of array of unknown size".

edited Aug 07 '21 at 17:27

answered Aug 07 '21 at 16:45


True :), but why can't standard have this definition? – Sreeraj Chundayil Aug 07 '21 at 17:07
@InQusitive It could but it doesn't. What prevents a compiler from making `1/2` equal `0.5` ? Again the C standard. It's as easy as that. You can find 1000 things that C compilers **could** do but if it violates the standard, compilers won't. – Support Ukraine Aug 07 '21 at 17:09

score 1 · Answer 3 · answered Aug 07 '21 at 18:24

Simple answer - in a declaration like

int arr[][] = { ... };

the element type is int [], which is an incomplete type, and you cannot declare an array where the element type is incomplete. The presence or absence of an initializer doesn’t change that. All an initializer can tell you is how many elements you have of the given element type; it can’t tell you what that element type is.

By contrast, in the declaration

int arr[] = { ... };

the element type is int, which is a complete type. You still need something in the initializer to determine the number of elements, but it’s not telling you how big each element needs to be.

score -1 · Answer 4 · answered Aug 07 '21 at 16:48

That's because although the [][] syntax looks like an array of arrays, in memory it is laid out like a 1-D array. If we imagine a 3x3 chessboard, it would be laid out as {1A, 2A, 3A, 1B, 2B, 3B, 1C, 2C, 3C}. The only way to access it as 2-D is to address it as [row * columnsize + column].

Now it's clear that you MUST know the columnsize in order to even start thinking about accessing the elements of the array. Where is this columnsize parameter stored? In the type: an array declared as int arr[][3] has element type int [3], and it decays to a pointer of type int (*)[3]. Columnsize is an integral part of the type as much as "t" is a part of int.

What you are asking for is an auto type. The compiler would have to determine the entire type on its own. It's the equivalent of wanting auto arr = 1; or auto arr = "foo";. A C compiler doesn't deal in such tricks.

Notice that columnsize is an entirely different creature from the length of an array. int arr[] = {1} and int arr[] = {1,2,3} have the same element type, int; they differ only in how many elements there are, and the compiler gets that number simply by counting the initializers. The length plays no part in computing the address of an element, and not going out of bounds is entirely your responsibility. That's why one dimension can be left unspecified: the compiler doesn't need it to lay out or index the elements.

int arr[][2] = {1,2,3,4,5,6} is of an entirely different type from int arr[][3] = {1,2,3,4,5,6}, even though they're the same size in memory. To the compiler, they're as different as int[1] and char[4].

The {{},{}} is just syntactic sugar to help YOU, not the compiler.
