14

Suppose I have a two-dimensional array grid declared as double grid[5][5]. It is my understanding that the following statements are true:

  • when grid is declared, a contiguous block of memory is allocated for 5*5 doubles, no more no less;
  • when an element of the array is accessed with the notation grid[i][j], this piece of code is actually interpreted as *(grid+(i*5+j)).

On the other hand, I know I can also store the same matrix as an array of pointers, by writing something like:

double ** grid;
grid = (double**)malloc(sizeof(double*)*5);
for (i=0; i<5; i++)
   grid[i] = (double*)malloc(sizeof(double)*5);

and I actually have a code which does that. The problem is that it then proceeds to access the elements of grid just like before, with the double subscript notation. Is this case different? In this case, is grid[i][j] converted to *(*(grid+i)+j), i.e. to a double dereferenciation? That's the only way I can see it happen correctly.

(This question probably stems from my (lack of) understanding of the relationship between pointer and array types in C...)

EDIT:

Ok, let's see if I got this straight:

  • grid[i][j] is always converted to *(*(grid+i)+j);
  • this expression is indeed calculated differently in the two cases, because, as Jim states in his answer, pointer arithmetic takes into account the size of the type pointed to; nevertheless, the correct element is fetched in both cases;
  • if and only if "grid" is a 2D array, this expression is further optimized to *( (double*)grid + (i*5+j) ), which is possible because the compiler knows that any grid[i] is actually an array starting at location grid+i*5.

But this leaves me with an inescapable conclusion: for a 2D array, if I set i=j=0, then I have that **grid == *((double*)grid). Is this correct?

sp00n
  • 181
  • 1
  • 11
  • @haccks Are you _really sure_? What do you think would be the correct equivalent to `grid[i][j]`?! – stefan Mar 27 '14 at 18:23
  • 1
    @haccks ok, thanks. I'll remove the comment. `*(grid+i)` is the `i`th pointer in the array of pointers. Then dereference that pointer offset by `j` to get the double at `[i][j]`. So his interpretation is correct. – user3386109 Mar 27 '14 at 18:25
  • @haccks I doubt it, but what's `row`? – stefan Mar 27 '14 at 18:31
  • @haccks: I believe pointer arythmetics automatically takes into account the size of the pointed type, so you should not specify a `sizeof` but only the number of elements you wish to offset by. I'm pretty sure about that. – sp00n Mar 27 '14 at 18:34
  • @haccks I don't get what you mean, I'm not really a C programmer. Do you mean something like line 15 of this: http://ideone.com/5yiv5d ? Because that would be wrong. – stefan Mar 27 '14 at 18:34
  • @stefan; No no no. That was not what I meant. I meant that how the expression `grid[i][j]` is unwrapped by compiler to find the address and then defreference that address. I am sorry if you misinterpret my intention. – haccks Mar 27 '14 at 18:38
  • @haccks Well manual unwrapping should result in the same as the compiler does. Just write the full version please. – stefan Mar 27 '14 at 18:40
  • @stefan; OK. If I would ask you that why in `int *p, a[10]; p = a; p++`, `p` is incremented to `a[1]` then you will simply answer that because the arithmetic is done as `p + 1*sizeof(int)` instead of `p + 1`. But we do not need to specify this `sizeof`. Compiler do this implicitly. – haccks Mar 27 '14 at 18:44
  • @haccks Ok, so you actually agree that your "equivalent" statement isn't equivalent at all. You're still lacking a dereference though. – stefan Mar 27 '14 at 18:47
  • @stefan; Where I am lacking a dereference ? Are you talking about `grid[5][5]` or `double ** grid` ? – haccks Mar 27 '14 at 18:53
  • @haccks the dynamic one: `double** grid`.. – stefan Mar 27 '14 at 18:54
  • @user3386109 Since I added a few statements, your first comment here might be misunderstood by a future reader. Would you like to edit it to specify which part of the question you were referring to? (I hope that it applies to the last part as well, of course... ;) ) – sp00n Apr 11 '14 at 18:56
  • @sp00n I removed the comment (edits are only allowed for a limited time). I'll review later, and let you know if I see anything interesting. – user3386109 Apr 12 '14 at 00:20

6 Answers6

6

when an element of the array is accessed woth the notation grid[i][j], this piece of code is actually interpreted as *(grid+(i*5+j)).

No. (Not sure why so many answers said yes to this). *(grid+(i*5+j)) is the same as grid[i*5+j], which is an out-of-bounds access for some values of i and j. Also, this specifies an array, not an int.

The following two expressions are exactly equivalent in all cases: A[i][j] *(*(grid+i)+j).

You never "gain" anything by converting between the two different dereferencing notations. It is just two equivalent forms of syntax for an lvalue expression which designates a particular object.

For the rest of my answer I will use the [ ] syntax as I find it clearer.

Perhaps you meant to ask something like "with int A[5][5];, then is A[i][j] equivalent to offsetting by i+5*j from the start of A?"

The other answers muddle around a bit because the term "offsetting by N" is ambiguous. N bytes, or N ints, or N arrays of int?

If you imagine in your head that A were a 1-D array of length 25 (let's call this B), then A[i][j] designates the same object as B[i*5+j].

To express this in code, you would write: int *B = (int *)&A. This aliases the 2-D array of int as a 1-D array.

NB. It would be wrong to write int *B = (int *)A, because A decays to &A[0] which only has five ints in it, so B[6] is still an out-of-bounds access (undefined behaviour). If you've not got bounds-checking turned on in your compiler it's likely you'll not notice anything though.

M.M
  • 138,810
  • 21
  • 208
  • 365
  • We said yes to them being "interpred as" because `*(grid+(i*5+j))` is effectively what gets generated at the assembly level. You're right that at the _type_ level, it is incorrect. +1. – Mooing Duck Mar 28 '14 at 20:40
  • 1
    @MooingDuck Matt is exactly right -- the answer to the question is No". `*(grid+(i*5+j)` is C syntax, it is not equivalent to `grid[i][j]`, and it is not what gets generated in asm. What is equivalent is `*((double*)grid + i*5+j)` which is equivalent to `*(double*)((char*)grid + sizeof(double)*(i*5+j))` which resembles what gets generated in asm. – Jim Balter Mar 28 '14 at 22:28
  • 1
    "The other answers muddle around a bit" -- My answer doesn't, because it refers to `sizeof`, which is necessarily a byte offset. – Jim Balter Mar 28 '14 at 22:32
  • @Matt: thanks. But I don't undestand the second part of your answer. I see why `int* B = (int*)A; B[6]` may be an out-of-bounds access, but how can `(int*)&A` be correct, if A itself decays into a pointer to A[0]? Wouldn't `&A` become `&(&A[0])` i.e. a pointer to a pointer to an array? – sp00n Mar 30 '14 at 12:08
  • 2
    `A` does not decay when it is the operand of `&` or `sizeof`. – M.M Mar 30 '14 at 20:06
  • Oh, I see... So `&A` has the same value as `A` but a different type! Thanks. – sp00n Apr 11 '14 at 18:53
5

x[i][j] is always exactly equivalent to *(*(x+i)+j) ... but you have to keep in mind that pointer arithmetic takes into account the size of the type pointed to. In

double ary[NROWS][NCOLS];

the size of ary[i] (i.e., *(ary+i)) is NCOLS*sizeof(double). In

double** ptr;

the size of ptr[i] (i.e., *(ptr+i)) is sizeof(double*).

In both cases, ary[i][j] or ptr[i][j], the correct element is fetched.

Jim Balter
  • 16,163
  • 3
  • 43
  • 66
1

Yes. You are going right way but the declaration

double grid[5][5]

and

double **grid; 

are different. First is declaring a 2D array of 25 elements of double type while second is declaring a pointer to pointer to a double type. Both are different. Note that arrays are not pointers.

In first case memory allocation is on stack and it is continuous and hence compiler optimizes grid[i][j] to *(*(grid + i) + j) which is further optimized to *(*grid + (i*5 + j)).

In second case memory is allocated on heap and malloc won't create contiguous memory. In this case compiler optimizes grid[i][j] to *(*(grid + i) + j) but it doesn't further optimized it to *(*grid + (i*5 + j)).

Community
  • 1
  • 1
haccks
  • 104,019
  • 25
  • 176
  • 264
  • Thanks for your answer. Did you mean `*(grid + (i*5 + j))` (without the second asterisk)? – sp00n Mar 30 '14 at 11:38
0

Yes, the address A[i][j] refers to does depend on how A is declared.

The pointer arithmetic used to evaluate array_variable[i][j] depends on the number of columns (on the second dimension): This is because all the elements for each row (first dimension) are stored together, and the number of columns impacts where the data for the second, third, ETC. row is stored.

This URL states that if you have a 2-dimensional array, declared as:

int [NUMBER_OF_ROWS][NUMBER_OF_COLUMNS]

Then:

x = y[a][b]

is equivalent to:

x = *((int *)y + a * NUMBER_OF_COLUMNS + b);

URL: How to use pointer expressions to access elements of a two-dimensional array in C?

This URL states that elements of (a 2 dimensional array) are stored in row-major order: E.G. all the elements of row 1 are stored first.

http://www.cse.msu.edu/~cse251/lecture11.pdf

This URL has a similar calculation, dependent on the size of the second dimension:

http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/BitOp/pointer.html

addr( & arr[ row ][ col ] ) = addr( arr ) + [ sizeof( int ) * COLS * row ]
                                          + [ sizeof( int ) * col ]
Community
  • 1
  • 1
A B
  • 4,068
  • 1
  • 20
  • 23
0

Additionally to other answers:

when grid is declared, a contiguous block of memory is allocated for 5*5 doubles, no more no less;

It's not totally true (the part with less will always be true), but sometimes you will find, that even though you are accessing memory after the end of an array the program seems to work (does not segfault instantly). This means that sometimes malloc can allocate more memory than you are asking. Of course it does not mean, that you should access this excessive memory.

zoska
  • 1,684
  • 11
  • 23
  • 3
    To phrase it a little more strongly. Accessing elements beyond the end of the array is `undefined behavior`, i.e. don't do that. – user3386109 Mar 27 '14 at 18:34
0

Q In C, does the meaning of A[i][j] depend on how A is declared?

A No, it does not.

If you have,

int a[3][3] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

then, a[i] is evaluated as *(a+i).

If you have

int ** create2DArray(int d1, int d2)
{
   int i = 0;
   int ** a = malloc(sizeof(*a)*d1);
   for ( ; i != d1; ++i )
   {
      a[i] = malloc(sizeof(int)*d2);
   }
   return a;
}

int **a = create2DArray(3, 3);

then, a[i] is still evaluated as *(a+i).

No matter how they are declared, a[i][j] is evaluated as *(a[i] + j).

You can still call a function like:

void func(int* p, int size)
{
   int i = 0;
   for ( ; i != size; ++i )
   {
      printf("%d ", p[i]);
   }
   printf("\n");
}

using

func(a[i]);

no matter how a was defined.

The main difference between the first definition of a and the second definition of a is how memory is allocated for it. In the first case, memory is allocated on the stack, and it is contiguous. In the second case, memory is allocated on the heap, and it is most likely not contiguous. However, how you use it and how the compiler treats it are exactly same.

R Sahu
  • 204,454
  • 14
  • 159
  • 270
  • *In the second case, memory is allocated on the heap, and it is most likely not contiguous*: [**No**](http://stackoverflow.com/q/625270/2455888). – haccks Mar 27 '14 at 20:23
  • 1
    @haccks memory allocated in a function like `create2DArray` is not likely to be contiguous since there are `d1+1` calls to `malloc`. Don't you agree with that? – R Sahu Mar 27 '14 at 20:36
  • 1
    It is correct that `a[i][j]` is always identical to `*(a[i] + j)`, but that does NOT mean that its meaning does not depend on the type of `a`. The meaning of `a[i][j]` (and equivalently `*(a[i] + j)`) depends very much on the type of `a`. – newacct Mar 28 '14 at 08:54
  • @newacct Do you really think the meaning depends on whether it is defined as `int a[3][3]` or `int **a`? Please elaborate. – R Sahu Mar 28 '14 at 15:18
  • 1
    No, the meaning is not dependent on the type, only the implementation is. The meaning is the semantics given in the C standard, and that semantics is type-independent. Consider that a[i][j] is also equivalent to i[a][j] and j[i[a]], regardless of the types of a, i, and j. – Jim Balter Mar 28 '14 at 22:48
  • @RSahu, @JimBalter: Yes, the semantics very much depends on type. `a[i]` is always equivalent to `*(a + i)`. Consider the expression `a + i`. If `a` is an expression of array type, then there is an implicit conversion to an rvalue of pointer type, with value being the address of the first element of the array. Then the addition is done on the resulting pointer value. If `a` is an expression of pointer type, there is no such conversion. That is the semantic difference. – newacct Mar 29 '14 at 03:01