7

Why can't you declare a 2D array argument in a function as you do with a normal array?

 void F(int bar[]){} //Ok
 void Fo(int bar[][]) //Not ok
 void Foo(int bar[][SIZE]) //Ok

Why is it needed to declare the size for the column?

Carlj901
  • 1,329
  • 4
  • 24
  • 39

6 Answers6

19

Static Arrays:

You seem not to have got the point completely. I thought to try to explain it somewhat. As some of the above answers describe, a 2D Array in C++ is stored in memory as a 1D Array.

int arr[3][4] ;   //consider numbers starting from zero are stored in it

Looks somewhat like this in Memory.

1000    //ignore this for some moments       1011  
^                                             ^
^                                             ^

0   1   2   3   4   5   6   7   8   9   10   11
|------------|  |-----------|   |-------------|
  First Array    Second Array     Third Array

|----------------------------------------------|    
                Larger 2D Array

Consider that here, the Bigger 2D Array is stored as contiguous memory units. It consists of total 12 elements, from 0 to 11. Rows are 3 and columns are 4. If you want to access the third array, you need to skip the whole first and second arrays. That is, you need to skip elements equal to the number of your cols multiplied by how many arrays you want skip. It comes out to be cols * 2.

Now when you specify the dimensions to access any single index of the array, you need to tell the compiler beforehand exactly how much elements to skip. So you give it the exact number of cols to perform the rest of the calculation.

So how does it perform the calculation? Let us say it works on the column major order, that is, it needs to know the number of columns to skip. When you specify one element of this array as...

arr[i][j] ;

Compiler performs this calculation automatically.

Base Address + (i * cols + j) ;

Let us try the formula for one index to test its veracity. We want to access the 3rd element of the 2nd Array. We would do it like this...

arr[1][2] ;   //access third element of second array

We put it in the formula...

  1000 + ( 1 * 4 + 2 )
= 1000 + ( 6 )
= 1006   //destination address

And we reach at the address 1006 where 6 is located. In a nutshell, we need to tell the compiler the number of cols for this calculation. So we send it as a parameter in a function.

If we are working on a 3D Array, like this...

int arr[ROWS][COLS][HEIGHT] ;

We would have to send it the last two dimensions of the array in a function.

void myFunction (int arr[][COLS][HEIGHT]) ;

The formula now would become this..

Base Address + ( (i * cols * height) + (j * height) + k )  ;

To access it like this...

arr[i][j][k] ;

COLS tell the compiler to skip the number of 2D Array, and HEIGHT tells it to skip the number of 1D Arrays. And so on and so forth for any dimension.

Dynamic Arrays:

As you ask about different behavior in case of dynamic arrays which are declared thus..

int ** arr ;

Compiler treats them differently, because each index of a Dynamic 2D Array consists of an address to another 1D Array. They may or may not be present on contiguous locations on heap. Their elements are accessed by their respective pointers. The dynamic counterpart of our static array above would look somewhat like this.

1000  //2D Pointer
^
^
2000       2001     2002
^          ^        ^
^          ^        ^
0          4        8
1          5        9
2          6        10
3          7        11

1st ptr  2nd ptr   3rd ptr

Suppose this is the situation. Here the 2D Pointer or Array on the location 1000. It hold the address to 2000 which itself holds address of a memory location. Here pointer arithmetic is done by the compiler by virtue of which it judges the correct location of an element.

To allocate memory to 2D Pointer, we do it..

arr = new int *[3] ;

And to allocate memory to each of its index pointer, this way..

for (auto i = 0 ; i < 3 ; ++i)
  arr[i] = new int [4] ;

At the end, each ptr of the 2D Array is itself an array. To access an element you do...

arr[i][j] ;

Compiler does this...

*( *(arr + i) + j ) ;
   |---------|
     1st step
|------------------|
      2nd step

In the first step, the 2D Array gets dereferenced to its appropriate 1D Array and in the second step, the 1D Array gets dereferenced to reach at the appropriate index. That is the reason why Dynamic 2D Arrays are sent to the function without any mention of their row or column.

Note: Many details have been ignored and many things supposed in the description, especially the memory mapping just to give you an idea.

Coding Mash
  • 3,338
  • 5
  • 24
  • 45
  • Explained well... "Let us say it works on the column major order". What does this mean 'col major order' – Sasha Sep 29 '12 at 14:29
  • It means compiler requires cols reach an index of the 2d array. In row major order, it needs number of rows. – Coding Mash Sep 29 '12 at 14:36
  • What would become the formula then? – Sasha Sep 29 '12 at 15:11
  • Technically `i*cols + j` indexing is row-major order. https://stackoverflow.com/questions/5991837/row-major-order-indices. Row-major and column-major refer to how to access memory in a 1D linearized fashion. Here row-major order refers to the memory being unrolled on a per-row basis. Column-major is when the memory is unrolled on a per-column basis. For completeness, column-major linear indexing would be `j*rows + i`. Though you do mention what column-major order is in your post, it may confuse it from the actual definition. I'd recommend editing your post, but otherwise a good answer. – rayryeng Aug 24 '18 at 12:25
6

You can't write void Foo(int bar[][]), because bar decays to a pointer. Imagine following code:

void Foo(int bar[][]) // pseudocode
{
    bar++; // compiler can't know by how much increase the pointer
    // as it doesn't know size of *bar
}

So, compiler must know size of *bar, therefore size of rightmost array must be provided.

Griwes
  • 8,805
  • 2
  • 43
  • 70
  • If it decays into a pointer, shouldn't it be possible to write void Foo(int *bar) and pass a 2D-array to that function? – Carlj901 Sep 29 '12 at 12:55
  • 1
    It is, although you have to do the arithmetic yourself. – Puppy Sep 29 '12 at 13:03
  • @Carlj901, n-dimensional array decays to pointer to (n-1)-dimensional array, not (n-x)-dimensional array. And what you probably just need is vector of vectors... – Griwes Sep 29 '12 at 13:11
1

Because when you pass an array, it decays to a pointer, so excluding the outer-most dimension is ok and that's the only dimension you can exclude.

 void Foo(int bar[][SIZE]) 

is equivalent to:

 void Foo(int (*bar)[SIZE]) 
P.P
  • 117,907
  • 20
  • 175
  • 238
  • I still don't understand _why_ you can't write void Foo(int bar[][]) – Carlj901 Sep 29 '12 at 12:20
  • Since array is just a pointer to the first element when passed to a function, you need to tell the compiler about the size except the outer-most dimension. Or you can pass it like: `void Foo(int **bar)`. – P.P Sep 29 '12 at 12:23
  • You wont be able to pass a true 2D array into a `void Foo(int **bar)`. – fredoverflow Sep 29 '12 at 17:03
1

The compiler needs to know how long the second dimension is to calculate the offsets. A 2D array is in fact stored as a 1D array. If you want to send an array with no known dimensions, consider using pointer to pointers and some sort of way to know the dimension yourself.

This is different from e.g. java, because in java the datatype also contains the dimension.

hamon
  • 1,006
  • 6
  • 7
  • Good point, but one may ask, why does int** work? No size is specified with that notation. – Lews Therin Sep 29 '12 at 12:32
  • because int[N][M] is different from int**. int[][] is a static allocated array and so it is in fact a 1D array with dimension N*M. and int** is in fact an array of arrays. So the first array contains pointers to the second. The static allocated doesn't contain those pointers and if you try to cast from int[N][M] int** you will get wierd results. – hamon Sep 29 '12 at 12:48
1

Since static 2D arrays are like 1D arrays with some sugar to better access data, you have to think about the arithmetic of pointers.
When the compiler tries to access element array[x][y], it has to calculate the address memory of the element, that is array+x*NUM_COLS+y. So it needs to know the length of a row (how many elements it contains).
If you need more information I suggest this link.

Simone-Cu
  • 1,109
  • 8
  • 21
  • I think it needs to know length of the column. By knowing the length it can infer how much row there is. But why does int** work? You specify no size. – Lews Therin Sep 29 '12 at 12:35
  • 2D arrays are located in memory by rows. And the length of the row is the number of columns (you can think it like a matrix). PS:the arrays is declared as array[NUM_ROWS][NUM_COLS]. – Simone-Cu Sep 29 '12 at 12:40
  • Um, I think not. 2D array is actually an array of 1d arrays. We use the matrix concept as an abstraction. I believe there are no *rows* as we know it. – Lews Therin Sep 29 '12 at 12:42
  • Ok, maybe there is a misunderstanding and it can be my fault (forgive my bad English). I agree that static 2D arrays are like 1D arrays as said in my answer. And int** can't work cause static 2D arrays are not pointer of pointers. The matrix vision is to understand why the compiler needs the NUM_COLS: to convert the abstraction of 2D arrays (array[x][y]) in real memory address. – Simone-Cu Sep 29 '12 at 12:47
  • @DeadMG Yes, but if you create a structure like this: `int** M=malloc(NUM_ROWS*sizeof(int*)); for (i=0;i – Simone-Cu Sep 29 '12 at 13:43
1

there are basically three ways to allocate a 2d array in C/C++

allocate on heap as a 2d array

you can allocate a 2d array on the heap using malloc such as:

const int row = 5;
const int col = 10;
int **bar = (int**)malloc(row * sizeof(int*));
for (size_t i = 0; i < row; ++i)
{
    bar[i] = (int*)malloc(col * sizeof(int));
}

this is actually stored as an array of arrays therefore isn't necessarily contiguous in memory. note that this also means there will be a pointer for each array costing yout extra memory usage (5 pointers in this example, 10 pointers if you allocate it the other way around). you can pass this array to a function with the signature:

void foo(int **baz)

allocate on heap as 1d array

for various reasons (cache optimizations, memory usage etc.) it may be desirable to store the 2d array as a 1d array:

const int row = 5;
const int col = 10;
int *bar = (int*)malloc(row * col * sizeof(int));

knowing second dimension you can access the elements using:

bar[1 + 2 * col]  // corresponds semantically to bar[2][1]

some people use preprocessor magic (or method overloading of () in C++) to handle this automatically such as:

#define BAR(i,j) bar[(j) + (i) * col]
..
BAR(2,1) // is actually bar[1 + 2 * col]

you need to have the function signature:

void foo(int *baz)

in order to pass this array to a function.

allocate on stack

you can allocate a 2d array on stack using something like:

int bar[5][10];

this is allocated as a 1d array on the stack therefore compiler needs to know the second dimension to reach the element you need just like we did in the second example, therefore the following is also true:

bar[2][1] == (*bar)[1 + 2 * 10]

function signature for this array should be:

void foo(int baz[][10])

you need to provide the second dimension so that compiler would know where to reach in memory. you don't have to give the first dimension since C/C++ is not a safe language in this respect.

let me know if I mixed up rows and columns somewhere..

none
  • 11,793
  • 9
  • 51
  • 87