Why don't we need the number of columns when passing a dynamic 2D array?

Question

Let's say that I have two arrays and I am passing them to a function:

#include <stdlib.h>

void func(int arr1[][4], int **arr2) { // <- I need to give n in only one, why?
...
}
int main() {
    int n = 5, m = 4;
    int arr1[n][m];
    int **arr2 = (int**)malloc(n * sizeof(int*));
    for (int i = 0; i < n; i++)
        arr2[i] = (int*)malloc(m * sizeof(int));
    func(arr1, arr2);
    return 0;
}

Why can't both arrays be passed in the same way?

Edit : There was an error in the code.

`arr2` isn't an array, but a pointer, in fact a pointer to a pointer. — alk, Jun 15 '17 at 06:34
In either case the function needs to know the size of the pointed at data somehow. There's no way around that no matter syntax. So ideally you would write something like `void func(size_t x, size_t y, int arr[x][y])`. — Lundin, Jun 15 '17 at 06:46
Also see [Correctly allocating multi-dimensional arrays](https://stackoverflow.com/questions/42094465/correctly-allocating-multi-dimensional-arrays) to unlearn your misunderstanding about dynamic 2d arrays. — Lundin, Jun 15 '17 at 06:48
Note that `arr1` in `main` can't be passed to `func`; it expects an `int arr1[][5]` and yet you have, in effect, `int arr1[5][4];` in `main()` (except it is a variable-length array, not a regular array of fixed size). A world of pain awaits you. – Jonathan Leffler, Jun 15 '17 at 06:50
Also, your code is strictly C code (C99 or C11 code). C++ compilers are not required to allow your array in `main()`. If `n` and `m` were const-qualified, the issue would be different. And GCC/G++ allows variable-length arrays unless you specify `-pedantic`, but there are other C++ compilers in this world. – Jonathan Leffler, Jun 15 '17 at 06:53

Andre Kampling · Accepted Answer · 2017-06-15T07:35:14.650

Actually, the opposite of what you're saying is the case: you don't have to pass the number of rows. Assume that array indices work like this:

int arr[MAX_ROW][MAX_COL]; /* with both 3 */

           col
     --------------->
    | 0,0   0,1   0,2
row | 1,0   1,1   1,2
    V 2,0   2,1   2,2

When you pass int arr[][MAX_COL], the compiler knows where the next row begins when you address an element with arr[row][col], for example.

If you did that manually with a pointer, the address would be computed as &arr[0][0] + row * MAX_COL + col. In that case you also have to know the column count MAX_COL of the array to find where the next row starts.
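The manual formula above can be checked with a small sketch. The helper names `get_manual` and `manual_matches_subscript` are made up for this illustration, and the array contents are arbitrary. Walking across row boundaries from `&arr[0][0]` is the technique the answer describes and behaves as expected on mainstream compilers, though how strictly the standard permits it is debated.

```c
#include <stddef.h>

#define MAX_ROW 3
#define MAX_COL 3

/* Hypothetical helper: read arr[row][col] by computing the offset by hand. */
int get_manual(int arr[][MAX_COL], size_t row, size_t col)
{
    int *base = &arr[0][0];                /* address of the very first int */
    return *(base + row * MAX_COL + col);  /* skip `row` full rows, then `col` ints */
}

/* Returns 1 if the manual formula agrees with arr[row][col] for every element. */
int manual_matches_subscript(void)
{
    int arr[MAX_ROW][MAX_COL] = { {0, 1, 2}, {10, 11, 12}, {20, 21, 22} };

    for (size_t r = 0; r < MAX_ROW; r++)
        for (size_t c = 0; c < MAX_COL; c++)
            if (get_manual(arr, r, c) != arr[r][c])
                return 0;
    return 1;
}
```

Note that `MAX_ROW` never appears in `get_manual`; only `MAX_COL` is needed to locate an element.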

The reason for this is that an array is contiguous in memory. The above array is laid out in memory like this:

|     row = 0     |     row = 1     |     row = 2     |
| 0,0   0,1   0,2 | 1,0   1,1   1,2 | 2,0   2,1   2,2 |

The compiler also has to know the row length because an array decays to a pointer when it is passed to a function. When you pass an array declared as int arr[MAX_SIZE] to the function void foo(int arr[]), it decays into a pointer to the first element of the array, int *arr. An array of arrays (a 2D array) decays to a pointer to its first element as well, and that element is a whole row, so the result is a pointer to a single array: int (*arr)[MAX_COL].

In short: with int arr[][MAX_COL] the compiler has all the information it needs to address the array with arr[row][col].
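As a rough illustration of that point, here is a sketch of a function that accepts arrays with any number of rows, as long as each row has MAX_COL elements. The names `sum_grid` and `sum_example` and the sample values are assumptions made for this example; the row count is passed as a separate argument because it is not part of the parameter's type.

```c
#include <stddef.h>

#define MAX_COL 3

/* Hypothetical helper: the row count is not encoded in the parameter type,
 * so the caller supplies it explicitly. */
long sum_grid(int arr[][MAX_COL], size_t rows)
{
    long total = 0;

    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < MAX_COL; c++)
            total += arr[r][c];
    return total;
}

/* The same function works for arrays with different row counts. */
long sum_example(void)
{
    int small[2][MAX_COL] = { {1, 2, 3}, {4, 5, 6} };
    int large[4][MAX_COL] = { {1, 1, 1}, {1, 1, 1}, {1, 1, 1}, {1, 1, 1} };

    return sum_grid(small, 2) + sum_grid(large, 4);  /* 21 + 12 */
}
```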

edited Jun 15 '17 at 07:35

answered Jun 15 '17 at 06:28


I understood your answer and it certainly helped. If I may suggest a better "in short", it would be: "Put simply, int **arr2 is a 2-d grid of ints formed using pointers, while int arr1[][] is a 1-d data structure of ints with the information of where each row ends." If I understood it wrong, then please correct me. – as2d3 Jun 15 '17 at 07:50
@AbhishekAgrawal: I don't know what you mean by 2-d grid and 1-d data structure. Actually all arrays, 1D, 2D, 3D, etc., are stored contiguously in memory. – Andre Kampling Jun 15 '17 at 08:02
Is the memory behind a pointer to pointer contiguous? Each row is allocated with a separate malloc call in each iteration, so shouldn't there be a jump between the address of the last element of the first row and the first element of the second row? – as2d3 Jun 15 '17 at 10:53
Yes, you are right! Remember: **`int** arr != int arr[row][col]`**. A pointer to pointer is not a fixed-size 2D array, [see here](https://stackoverflow.com/questions/8203700/conversion-of-2d-array-to-pointer-to-pointer). A fixed-size 2D array passed to a function is, as already said, equivalent to `int (*arr)[MAX_COL]`, which is a pointer to a single array. The rows behind a pointer to a pointer `int** arr` are not guaranteed to be contiguous because, as you said, they are allocated dynamically. – Andre Kampling Jun 15 '17 at 10:58

Sourav Ghosh · Answer 2 · 2017-06-15T06:47:05.020

Actually it's the opposite: you can omit only one of the dimensions (in the case of a multi-dimensional array), the outermost (first) one.

This is because arrays, when passed as function arguments, decay to a pointer to the first element. Quoting C11, chapter §6.3.2.1

Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. [...]

thus, a notation like

 void func(int arr1[][5], int **arr2)    // array of arrays of 5 ints, outer size omitted

and

 void func(int (*arr1)[5], int **arr2)   // pointer to an array of 5 ints

are equivalent.
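One way to see the equivalence is that a compiler accepts both spellings as declarations of the same function. The sketch below assumes a hypothetical function `first_of_row` and sample data invented for this illustration.

```c
#include <stddef.h>

/* Both prototypes declare the same function type, so they can coexist. */
int first_of_row(int arr1[][5], size_t row);
int first_of_row(int (*arr1)[5], size_t row);

/* Hypothetical helper: return the first element of the given row. */
int first_of_row(int (*arr1)[5], size_t row)
{
    return arr1[row][0];
}

int decay_example(void)
{
    int grid[2][5] = { {1, 2, 3, 4, 5}, {6, 7, 8, 9, 10} };
    int (*p)[5] = grid;  /* grid decays to a pointer to its first row */

    /* Passing the array or the explicit pointer reaches the same data. */
    return first_of_row(grid, 1) + first_of_row(p, 0);  /* 6 + 1 */
}
```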

edited Jun 15 '17 at 06:47

answered Jun 15 '17 at 06:23


Please explain this a bit more : 'decays to a pointer to the first element'. – as2d3 Jun 15 '17 at 06:26
Isn't it called the *outermost* one (the omittable)? I believed it referred to the memory layout, not the position in code between tokens. – Bob__ Jun 15 '17 at 06:27
@AbhishekAgrawal Actually a function array *parameter* is "adjusted" to a pointer to the type of element of the array. So `int[42]` is *adjusted* to `int*`. Array *decay* is what happens when you call such a function with an array as *argument*. And a "2D array" is an array of arrays. – juanchopanza Jun 15 '17 at 06:28
@cmaster Thanks, that was a typo. :) – Sourav Ghosh Jun 15 '17 at 06:46

Stephan Lechner · Answer 3 · 2017-06-15T07:03:06.133

You actually have only one array of ints (i.e. int arr1[][5]) and one pointer to a pointer to an int, i.e. int **arr2. Even though an array like arr1[10][5], when passed as an argument to a function, decays to a pointer to the beginning of the memory where the elements reside, there is a (big) difference in memory layout and in how a compiler treats access through these pointers.

BTW, in main it should be int n=5,m=4;int arr1[m][n], not int n=5,m=4;int arr1[n][m].

Concerning memory layout:

A 2D integer array of the form int [10][5] is represented as 10 consecutive "rows", each comprising 5 "columns" (i.e. integral values). The size of this array is 10 * 5 * sizeof(int), and the size of one "row" is 5 * sizeof(int).

A pointer to a pointer to int, int **p, is just a single pointer; its size is sizeof(int**), even if you have "malloced" a sequence of pointers to int like p = malloc(10 * sizeof(int*)). Note the "*" in sizeof(int*): you create a sequence of pointers to integers, not a sequence of integers. That's the main difference in memory layout: it's not a 2D array of integers, but a 1D array of pointers to integers. And if one had actually allocated 10 "rows" of 5 integers each, each row could be located in a different portion of memory. The space needed for managing such a scattered set of 10x5 integral values is 10*sizeof(int*) + 10*5*sizeof(int).
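The size figures from both layout descriptions can be checked with sizeof. This is only a sketch; the function names are invented for the illustration, and the actual byte counts depend on the platform's sizes for int and pointers.

```c
#include <stddef.h>

/* Size in bytes of a true 2D array: all 10 * 5 ints live in one object. */
size_t bytes_of_array(void)
{
    int a[10][5];
    return sizeof a;
}

/* Size in bytes of one row of that array. */
size_t bytes_of_row(void)
{
    int a[10][5];
    return sizeof a[0];
}

/* Size in bytes of the int ** variable itself, whatever it points at. */
size_t bytes_of_pointer(void)
{
    int **p = NULL;
    return sizeof p;
}

/* Bytes requested for 10 separately allocated rows of 5 ints,
 * plus the table of 10 row pointers. */
size_t bytes_of_scattered_grid(void)
{
    return 10 * sizeof(int *) + 10 * 5 * sizeof(int);
}
```

The pointer-based layout always costs more than the plain array, by the size of the row-pointer table (and any allocator overhead, which is not counted here).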

Concerning access:

Let's assume a variable of type int arr[][5], a 2D array of integers in which each row holds 5 elements and the number of rows is left unspecified. Informally, an access like int x = arr[3][4] is translated into an access to the element at offset 3*5 + 4 from the start of the array, i.e. "row times row size plus column". Note that, based on this formula, the compiler does not need to know how many rows the array actually has.

In contrast, let's assume a variable of type int **p. You can think of an access like x = p[3][4] as being equivalent to int *r = p[3]; int x = r[4]; Note that r is of type int *, i.e. it is a pointer, and r[4] then dereferences this pointer and returns an integral value.
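The two-step access can be written out explicitly. The following is a minimal sketch; `alloc_rows`, `free_rows` and `read_two_step` are hypothetical helper names, and the allocation strategy (one calloc per row) is just one common way to build such a structure.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate `rows` separate, zero-filled rows of `cols`
 * ints each. Returns NULL if any allocation fails. */
int **alloc_rows(size_t rows, size_t cols)
{
    int **p = malloc(rows * sizeof *p);  /* table of row pointers */
    if (p == NULL)
        return NULL;

    for (size_t i = 0; i < rows; i++) {
        p[i] = calloc(cols, sizeof *p[i]);  /* each row is its own block */
        if (p[i] == NULL) {
            while (i > 0)
                free(p[--i]);  /* undo the rows allocated so far */
            free(p);
            return NULL;
        }
    }
    return p;
}

/* Release every row, then the table of row pointers. */
void free_rows(int **p, size_t rows)
{
    for (size_t i = 0; i < rows; i++)
        free(p[i]);
    free(p);
}

/* Reads p[row][col] in the two explicit steps described above. */
int read_two_step(int **p, size_t row, size_t col)
{
    int *r = p[row];  /* first load: fetch the row pointer */
    return r[col];    /* second load: fetch the int it points at */
}
```

No column count appears in `read_two_step`: the row pointer stored in the table already says where the row starts, which is why `int **` needs no dimension in its type.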

This is described rather informally. Yet the main issue is that the memory layout of an arr[][5] contains consecutive integral values only, whereas an int **arr points to a sequence of pointers (or even just one such pointer), each of them probably pointing to a sequence of integral values (or just one integral value).
