In C, does the meaning of A[i][j] depend on how A is declared?

Question

Suppose I have a two-dimensional array grid declared as double grid[5][5]. It is my understanding that the following statements are true:

when grid is declared, a contiguous block of memory is allocated for 5*5 doubles, no more no less;
when an element of the array is accessed with the notation grid[i][j], this piece of code is actually interpreted as *(grid+(i*5+j)).

On the other hand, I know I can also store the same matrix as an array of pointers, by writing something like:

double ** grid;
grid = (double**)malloc(sizeof(double*)*5);
for (i=0; i<5; i++)
   grid[i] = (double*)malloc(sizeof(double)*5);

and I actually have a code which does that. The problem is that it then proceeds to access the elements of grid just like before, with the double subscript notation. Is this case different? In this case, is grid[i][j] converted to *(*(grid+i)+j), i.e. to a double dereferenciation? That's the only way I can see it happen correctly.

(This question probably stems from my (lack of) understanding of the relationship between pointer and array types in C...)

EDIT:

Ok, let's see if I got this straight:

grid[i][j] is always converted to *(*(grid+i)+j);
this expression is indeed calculated differently in the two cases, because, as Jim states in his answer, pointer arithmetic takes into account the size of the type pointed to; nevertheless, the correct element is fetched in both cases;
if and only if "grid" is a 2D array, this expression is further optimized to *( (double*)grid + (i*5+j) ), which is possible because the compiler knows that any grid[i] is actually an array starting at location grid+i*5.

But this leaves me with an inescapable conclusion: for a 2D array, if I set i=j=0, then I have that **grid == *((double*)grid). Is this correct?

@haccks Are you _really sure_? What do you think would be the correct equivalent to `grid[i][j]`?! — stefan, Mar 27 '14 at 18:23
@haccks ok, thanks. I'll remove the comment. `*(grid+i)` is the `i`th pointer in the array of pointers. Then dereference that pointer offset by `j` to get the double at `[i][j]`. So his interpretation is correct. — user3386109, Mar 27 '14 at 18:25
@haccks: I believe pointer arythmetics automatically takes into account the size of the pointed type, so you should not specify a `sizeof` but only the number of elements you wish to offset by. I'm pretty sure about that. — sp00n, Mar 27 '14 at 18:34
@haccks I don't get what you mean, I'm not really a C programmer. Do you mean something like line 15 of this: http://ideone.com/5yiv5d ? Because that would be wrong. — stefan, Mar 27 '14 at 18:34
@stefan; No no no. That was not what I meant. I meant that how the expression `grid[i][j]` is unwrapped by compiler to find the address and then defreference that address. I am sorry if you misinterpret my intention. — haccks, Mar 27 '14 at 18:38
@haccks Well manual unwrapping should result in the same as the compiler does. Just write the full version please. — stefan, Mar 27 '14 at 18:40
@stefan; OK. If I would ask you that why in `int *p, a[10]; p = a; p++`, `p` is incremented to `a[1]` then you will simply answer that because the arithmetic is done as `p + 1*sizeof(int)` instead of `p + 1`. But we do not need to specify this `sizeof`. Compiler do this implicitly. — haccks, Mar 27 '14 at 18:44
@haccks Ok, so you actually agree that your "equivalent" statement isn't equivalent at all. You're still lacking a dereference though. — stefan, Mar 27 '14 at 18:47
@stefan; Where I am lacking a dereference ? Are you talking about `grid[5][5]` or `double ** grid` ? — haccks, Mar 27 '14 at 18:53
@user3386109 Since I added a few statements, your first comment here might be misunderstood by a future reader. Would you like to edit it to specify which part of the question you were referring to? (I hope that it applies to the last part as well, of course... ;) ) — sp00n, Apr 11 '14 at 18:56
@sp00n I removed the comment (edits are only allowed for a limited time). I'll review later, and let you know if I see anything interesting. — user3386109, Apr 12 '14 at 00:20

score 6 · Answer 1 · answered Mar 27 '14 at 22:49

6

when an element of the array is accessed woth the notation grid[i][j], this piece of code is actually interpreted as *(grid+(i*5+j)).

No. (Not sure why so many answers said yes to this). *(grid+(i*5+j)) is the same as grid[i*5+j], which is an out-of-bounds access for some values of i and j. Also, this specifies an array, not an int.

The following two expressions are exactly equivalent in all cases: A[i][j] *(*(grid+i)+j).

You never "gain" anything by converting between the two different dereferencing notations. It is just two equivalent forms of syntax for an lvalue expression which designates a particular object.

For the rest of my answer I will use the [ ] syntax as I find it clearer.

Perhaps you meant to ask something like "with int A[5][5];, then is A[i][j] equivalent to offsetting by i+5*j from the start of A?"

The other answers muddle around a bit because the term "offsetting by N" is ambiguous. N bytes, or N ints, or N arrays of int?

If you imagine in your head that A were a 1-D array of length 25 (let's call this B), then A[i][j] designates the same object as B[i*5+j].

To express this in code, you would write: int *B = (int *)&A. This aliases the 2-D array of int as a 1-D array.

NB. It would be wrong to write int *B = (int *)A, because A decays to &A[0] which only has five ints in it, so B[6] is still an out-of-bounds access (undefined behaviour). If you've not got bounds-checking turned on in your compiler it's likely you'll not notice anything though.

answered Mar 27 '14 at 22:49

M.M

138,810
21
208
365

We said yes to them being "interpred as" because `*(grid+(i*5+j))` is effectively what gets generated at the assembly level. You're right that at the _type_ level, it is incorrect. +1. – Mooing Duck Mar 28 '14 at 20:40
1

@MooingDuck Matt is exactly right -- the answer to the question is No". `*(grid+(i*5+j)` is C syntax, it is not equivalent to `grid[i][j]`, and it is not what gets generated in asm. What is equivalent is `*((double*)grid + i*5+j)` which is equivalent to `*(double*)((char*)grid + sizeof(double)*(i*5+j))` which resembles what gets generated in asm. – Jim Balter Mar 28 '14 at 22:28
1

"The other answers muddle around a bit" -- My answer doesn't, because it refers to `sizeof`, which is necessarily a byte offset. – Jim Balter Mar 28 '14 at 22:32
@Matt: thanks. But I don't undestand the second part of your answer. I see why `int* B = (int*)A; B[6]` may be an out-of-bounds access, but how can `(int*)&A` be correct, if A itself decays into a pointer to A[0]? Wouldn't `&A` become `&(&A[0])` i.e. a pointer to a pointer to an array? – sp00n Mar 30 '14 at 12:08
2

`A` does not decay when it is the operand of `&` or `sizeof`. – M.M Mar 30 '14 at 20:06
Oh, I see... So `&A` has the same value as `A` but a different type! Thanks. – sp00n Apr 11 '14 at 18:53

score 5 · Answer 2 · answered Mar 27 '14 at 19:25

x[i][j] is always exactly equivalent to *(*(x+i)+j) ... but you have to keep in mind that pointer arithmetic takes into account the size of the type pointed to. In

double ary[NROWS][NCOLS];

the size of ary[i] (i.e., *(ary+i)) is NCOLS*sizeof(double). In

double** ptr;

the size of ptr[i] (i.e., *(ptr+i)) is sizeof(double*).

In both cases, ary[i][j] or ptr[i][j], the correct element is fetched.

score 1 · Answer 3 · edited May 23 '17 at 10:31

Yes. You are going right way but the declaration

double grid[5][5]

and

double **grid;

are different. First is declaring a 2D array of 25 elements of double type while second is declaring a pointer to pointer to a double type. Both are different. Note that arrays are not pointers.

In first case memory allocation is on stack and it is continuous and hence compiler optimizes grid[i][j] to *(*(grid + i) + j) which is further optimized to *(*grid + (i*5 + j)).

In second case memory is allocated on heap and malloc won't create contiguous memory. In this case compiler optimizes grid[i][j] to *(*(grid + i) + j) but it doesn't further optimized it to *(*grid + (i*5 + j)).

Thanks for your answer. Did you mean `*(grid + (i*5 + j))` (without the second asterisk)? — sp00n, Mar 30 '14 at 11:38

score 0 · Answer 4 · edited May 23 '17 at 11:55

Yes, the address A[i][j] refers to does depend on how A is declared.

The pointer arithmetic used to evaluate array_variable[i][j] depends on the number of columns (on the second dimension): This is because all the elements for each row (first dimension) are stored together, and the number of columns impacts where the data for the second, third, ETC. row is stored.

This URL states that if you have a 2-dimensional array, declared as:

int [NUMBER_OF_ROWS][NUMBER_OF_COLUMNS]

Then:

x = y[a][b]

is equivalent to:

x = *((int *)y + a * NUMBER_OF_COLUMNS + b);

URL: How to use pointer expressions to access elements of a two-dimensional array in C?

This URL states that elements of (a 2 dimensional array) are stored in row-major order: E.G. all the elements of row 1 are stored first.

http://www.cse.msu.edu/~cse251/lecture11.pdf

This URL has a similar calculation, dependent on the size of the second dimension:

http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/BitOp/pointer.html

addr( & arr[ row ][ col ] ) = addr( arr ) + [ sizeof( int ) * COLS * row ]
                                          + [ sizeof( int ) * col ]

score 0 · Answer 5 · answered Mar 27 '14 at 18:29

Additionally to other answers:

when grid is declared, a contiguous block of memory is allocated for 5*5 doubles, no more no less;

It's not totally true (the part with less will always be true), but sometimes you will find, that even though you are accessing memory after the end of an array the program seems to work (does not segfault instantly). This means that sometimes malloc can allocate more memory than you are asking. Of course it does not mean, that you should access this excessive memory.