I have the following C program:

#include <stdio.h>

int main(){
    int a[2][2] = {1, 2, 3, 4};
    printf("a:%p, &a:%p, *a:%p \n", a, &a, *a);
    printf("a[0]:%p, &a[0]:%p \n", a[0], &a[0]);
    printf("&a[0][0]:%p \n", &a[0][0]);
    return 0;
}

It gives the following output:

a:0028FEAC, &a:0028FEAC, *a:0028FEAC
a[0]:0028FEAC, &a[0]:0028FEAC
&a[0][0]:0028FEAC

I am not able to understand why &a, a, and *a are all identical. The same goes for a[0], &a[0] and &a[0][0].

EDIT:

Thanks to the answers, I've understood the reason why these values are coming out to be equal. This line from the book by Kernighan & Ritchie turned out to be the key to my question:

 the name of an array is a synonym for the location of the initial element.

So, by this, we get

a = &a[0], and

a[0] = &a[0][0] (considering a as an array of arrays)

Intuitively, now the reason is clear behind the output. But, considering how pointers are implemented in C, I can't understand how a and &a are equal. I am assuming that there is a variable a in memory which points to the array(and the starting address of this array-memory-block would be the value of this variable a).

But, when we do &a, doesn't that mean taking the address of the memory location where the variable a was stored? Why are these values equal then?

haccks
Koderok
  • 4
    `&a[0]` and `&a[0][0]` are not identical. – haccks Aug 21 '13 at 15:09
  • 3
    You should use `%p` to print pointers, and cast to `void *`. – unwind Aug 21 '13 at 15:14
  • 1
    Possible duplicate of [Why is the same value outputted for A[0], &A, and *A](http://stackoverflow.com/q/17623556/2455888). – haccks Aug 21 '13 at 15:15
  • @haccks: but then why do I get the same values on printing them? – Koderok Aug 21 '13 at 15:47
  • @PulkitYadav; Read my answer. I have explained it in detail. – haccks Aug 21 '13 at 15:55
  • 3
    @Pulkit Yadav: When you are asking a question about something working that way and not the other way, you have to explain why you find it strange. There's nothing really unexpected in your output, so your question really looks like "why 2+2 is 4". How do you expect people to answer such a question? State what specific issues you have with that output and, more importantly, why you perceive them as "issues" and we'll address them. – AnT stands with Russia Aug 21 '13 at 16:00

8 Answers

17

They're not identical pointers. They're pointers of distinct types that all point to the same memory location. Same value (sort of), different types.

A 2-dimensional array in C is nothing more or less than an array of arrays.

The object a is of type int[2][2], or 2-element array of 2-element array of int.

Any expression of array type is, in most but not all contexts, implicitly converted to ("decays" to) a pointer to the array object's first element. So the expression a, unless it's the operand of unary & or sizeof, is of type int(*)[2], and is equivalent to &a[0] (or &(a[0]) if that's clearer). It becomes a pointer to row 0 of the 2-dimensional array. It's important to remember that this is a pointer value (or equivalently an address), not a pointer object; there is no pointer object here unless you explicitly create one.

So looking at the several expressions you asked about:

  • &a is the address of the entire array object; it's a pointer expression of type int(*)[2][2].
  • a is the name of the array. As discussed above, it "decays" to a pointer to the first element (row) of the array object. It's a pointer expression of type int(*)[2].
  • *a dereferences the pointer expression a. Since a (after it decays) is a pointer to an array of 2 ints, *a is an array of 2 ints. Since that's an array type, it decays (in most but not all contexts) to a pointer to the first element of the array object. So it's of type int*. *a is equivalent to &a[0][0].
  • &a[0] is the address of the first (0th) row of the array object. It's of type int(*)[2]. a[0] is an array object; it doesn't decay to a pointer because it's the direct operand of unary &.
  • &a[0][0] is the address of element 0 of row 0 of the array object. It's of type int*.

All of these pointer expressions refer to the same location in memory. That location is the beginning of the array object a; it's also the beginning of the array object a[0] and of the int object a[0][0].

The correct way to print a pointer value is to use the "%p" format and to convert the pointer value to void*:

printf("&a = %p\n", (void*)&a);
printf("a  = %p\n", (void*)a);
printf("*a = %p\n", (void*)*a);
/* and so forth */

This conversion to void* yields a "raw" address that specifies only a location in memory, not what type of object is at that location. So if you have multiple pointers of different types that point to objects that begin at the same memory location, converting them all to void* yields the same value.

(I've glossed over the inner workings of the [] indexing operator. The expression x[y] is by definition equivalent to *(x+y), where x is a pointer (possibly the result of the implicit conversion of an array) and y is an integer. Or vice versa, but that's ugly; arr[0] and 0[arr] are equivalent, but that's useful only if you're writing deliberately obfuscated code. If we account for that equivalence, it takes a paragraph or so to describe what a[0][0] means, and this answer is probably already too long.)

For the sake of completeness the three contexts in which an expression of array type is not implicitly converted to a pointer to the array's first element are:

  • When it's the operand of unary &, so &arr yields the address of the entire array object;
  • When it's the operand of sizeof, so sizeof arr yields the size in bytes of the array object, not the size of a pointer; and
  • When it's a string literal in an initializer used to initialize an array (sub-)object, so char s[6] = "hello"; copies the array value into s rather than nonsensically initializing an array object with a pointer value. This last exception doesn't apply to the code you're asking about.

(The N1570 draft of the 2011 ISO C standard states that _Alignof is a fourth exception; this is incorrect, since _Alignof can only be applied to a parenthesized type name, not to an expression. The error is corrected in the final C11 standard.)

Recommended reading: Section 6 of the comp.lang.c FAQ.

Keith Thompson
  • I read that `a[i]` gives the address of `ith` row. Now I am thinking that `&a[i]` is identical to `a[i]`. Isn't it? – haccks Aug 21 '13 at 18:50
  • 2
    @haccks: `a[i]` is an array object of type `int[2]`, which is row `i` of `a`. In most contexts the expression `a[i]` decays to a pointer to the first element of that array, i.e., a pointer to the `int` object `a[i][0]`. But in `&a[i]`, since `a[i]` is the operand of unary `&`, it doesn't decay, and `&a[i]` is the address of the array object `a[i]`, and is of type `int(*)[2]`. No, `&a[i]` and `a[i]` are not identical; the former is the address of row `i` of `a`, and `a[i]` is either that row or the address of *the first element of* that row, depending on the context. – Keith Thompson Aug 21 '13 at 19:13
  • 1
    I made a diagram many years ago of the difference between these three pointers (or ones similar enough), which is [here](http://web.torek.net/torek/c/pa.html). Another way to look at this is that they point to the same *base address*, but to a different "size unit" at that base-address. – torek Sep 01 '13 at 05:49
  • OK. Let's continue this again. I have two questions: 1. What do you mean by *`a[i]` is either that row or the address of the first element of that row, **depending on the context***? 2. You said `a` is *equivalent* to `&a[0]` and `*a` is *equivalent* to `&a[0][0]`, can I replace the word *equivalent* by **identical**? – haccks Sep 14 '13 at 16:52
  • 1
    @haccks: `a[i]` is an expression of array type (specifically, it's of type `int[2]`). If `a[i]` is the operand of unary `&` or `sizeof`, it refers to that array object; in any other context, it's implicitly converted to the address of the first element of that array object. 2. I suppose so; they're expressions of the same type and value, but they're written differently. I think "equivalent" works a little better. Is `(2+2)` *identical* to `4`? – Keith Thompson Sep 14 '13 at 18:35
  • @haccks: I see the smiley, but I don't get the joke. Why should I delete that answer, and why did you post here rather than there? – Keith Thompson Feb 08 '14 at 21:28
6

Because all expressions are pointing to the beginning of the array:

a = {{a00, a01}, {a10, a11}}

The expression a decays to a pointer to the first element of the array, so a == &a[0]

and &a[0][0] is positioned at the first cell of the 2D array, which is at that same address.

Jonathan Leffler
Tomer W
6

It is printing out the same values because they all are pointing to the same location.

Having said that,

&a[i][j] is of type int * which is a pointer to an integer.

a and &a[0] have the type int(*)[2] which indicates a pointer to an array of 2 ints.

&a has the type of int(*)[2][2] which indicates a pointer to a 2-D array or a pointer to an array of two elements in which each element is an array of 2-ints.

So, all of them are of different type and behave differently if you start doing pointer arithmetic on them.

(&a[0][0] + 1) points to the next integer element in the 2-D array, i.e. to a[0][1].

&a[0] + 1 points to the next array of integers, i.e. to the row a[1] (which starts at a[1][0]).

&a + 1 points to the next 2-D array, which is non-existent in this case; it is the address where a[2][0] would be if present.

Uchia Itachi
  • 2
    +1 for pointing out that they are three different types which happen to have the same value. – Jonathan Leffler Aug 21 '13 at 15:53
  • Isn't `a` of type `int **`? Dereferencing `a` by `*a` would print the value of `a[0]`, which is of the type `int *`. On the other hand, dereferencing it twice using `**a` would print the value of `a[0][0]`(an integer). – Koderok Aug 21 '13 at 17:35
  • 1
    The *object* `a` is of type `int[2][2]`, which is an array type, not a pointer type. The *expression* `a`, unless it's the operand of unary `&` or `sizeof`, is, after the implicit conversion, of type `int(*)[2]`, or pointer to 2-element array of `int`. `a`, after conversion, is equivalent to `&a[0]` (or `&(a[0])` if that's clearer). @PulkitYadav: No, it's not of type `int**`; there is no `int*` object for it to point to. – Keith Thompson Aug 21 '13 at 17:41
  • @KeithThompson; I think `a` has type `int (*) [2]` (pointer to an integer array of length 2), isn't it? – haccks Aug 21 '13 at 17:47
  • @haccks: Yes, that's what I wrote. I know I edited my comment, but I don't remember exactly how; were you responding to an earlier version that was less clear, or is it still unclear? – Keith Thompson Aug 21 '13 at 18:06
  • @KeithThompson; Still unclear. I think type `int (*)[2]` is different from array type. Am I wrong? – haccks Aug 21 '13 at 18:13
  • 2
    @haccks: `int (*)[2]` is a pointer type, so yes, it's different from the array type `int[2][2]`. The *object* `a` is of type `int[2][2]`. The *expression* `a` is implicitly converted, in most but not all contexts, to a pointer to the array object's first element, yielding an expression of type `int(*)[2]`. But `sizeof a`, for example, is the same as `sizeof (int[2][2])`. – Keith Thompson Aug 21 '13 at 18:17
  • @KeithThompson; Got it. In place of *`a` has type `int (*) [2]`*, it should be *when used as a pointer,`a` has type `int (*) [2]`*. Am I right? – haccks Aug 21 '13 at 18:24
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/35939/discussion-between-keith-thompson-and-haccks) – Keith Thompson Aug 21 '13 at 18:36
  • 1
    (Hit the chat button accidentally.) Rather than "when used as a pointer", I'd say "when used in any context that converts it to a pointer". (I suppose "used as a pointer" is implied by that.) – Keith Thompson Aug 21 '13 at 18:38
5
 +------------------------------+
 | a[0][0]   <--   a[0] <--   a | // <--&a, a,*a, &a[0],&a[0][0] 
 |_a[0][1]_                     |
 | a[1][0]   <--   a[1]         |
 | a[1][1]                      |
 +------------------------------+
Lidong Guo
3

You know that a is the address of the first element of your array and according to the C standard, a[X] is equal to *(a + X).

So:

&a[0] == a because &a[0] is the same as &(*(a + 0)) = &(*a) = a.

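&a[0][0] has the same address as a because &a[0][0] is the same as &(*(*(a + 0) + 0)) = &(**a) = *a, and the row *a starts at the same address as a (only the types differ: int * versus int (*)[2]).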

nouney
  • You were right the first time - `a[X]` is equal to `*(a + X)`. The pointer arithmetic already takes into account the size of the type it points to. – Timothy Shields Aug 21 '13 at 16:46
  • "`&a[0][0]` is the same as `&(*(a + 0 + 0))`" No. They don't even have the same type. `&a[0][0]` is the same as `&(*(*(a + 0) + 0)))` – newacct Aug 22 '13 at 10:49
  • @newacct I was not sure about it. At first I've written `&(*(*(a + 0) + 0)))`, but I don't know how to prove that `&(*(*(a + 0) + 0)))` = `a` actually ... How can you do that ? – nouney Aug 22 '13 at 12:03
  • @nouney: Well, they are *not* the same, because they have different types. However, their addresses are the same. First of all, `a` (an array expression) is implicitly converted to `&a[0]` in most contexts, so that's why the first one is true. `&a[0]` has the same address value as `&a[0][0]`, because `a[0]` is an array, and it is obvious from how arrays work in C that this must be true. (Note: if `a[0]` were a pointer this would not be true. So the fact that it's an array is important.) – newacct Aug 22 '13 at 19:10
  • @newacct Oh yes I know they have different types, I just wanted to prove in a "mathematical way" (which seems impossible) that the address of both was the same :) Thanks for the explanation ! – nouney Aug 22 '13 at 20:06
3

A 2D array in C is treated as a 1D array whose elements are 1D arrays (the rows).
For example, a 4x3 array of T (where "T" is some data type) may be declared by: T a[4][3], and described by the following scheme:

                       +-----+-----+-----+
  a ==     a[0]   ---> | a00 | a01 | a02 |
                       +-----+-----+-----+
                       +-----+-----+-----+
           a[1]   ---> | a10 | a11 | a12 |
                       +-----+-----+-----+
                       +-----+-----+-----+
           a[2]   ---> | a20 | a21 | a22 |
                       +-----+-----+-----+
                       +-----+-----+-----+
           a[3]   ---> | a30 | a31 | a32 |
                       +-----+-----+-----+

Also the array elements are stored in memory row after row.
Reading the declaration T a[4][3] from the inside out: a[4] says that a is an array of 4 elements, and the remaining T ...[3] says that each of those elements is an array of 3 elements of type T. Hence we have an array of 4 arrays of 3 elements each.
Now it is clear that a, when it decays, points to the first element (a[0]) of that array of 4. Likewise, &a[0] gives the address of the first element a[0], that is, the address of the 0th row (a00 | a01 | a02) of array a[4][3], while &a[0][0] gives the address of the first element (a00) of that row. &a gives the address of the whole 2D array a[4][3]. *a is the row a[0], which decays to a pointer to a[0][0].
Note that a is not a pointer to a[0][0]; instead it is a pointer to a[0].
Hence

  • G1: a and &a[0] are equivalent.
  • G2: *a, a[0] and &a[0][0] are equivalent.
  • G3: &a (gives the address of the 2D array a[4][3]).
    The groups G1, G2 and G3 are not identical to each other, although they all give the same address (for the reasons explained above).
haccks
  • Sorry, the picture is still not clear. Are you using `mat` and `a` interchangeably? You said `mat` is a pointer to `mat[0]`, but in the figure, you have shown `mat == mat[0]`(doesn't this mean they are equal?). Also, this doesn't seem quite right: "mat[0][0] will give the address of 0th row of array mat[4][3]". Doesn't `mat[0][0]` give the contents rather than an address? – Koderok Aug 21 '13 at 16:34
  • ok, thanks! Could you please also answer my other queries in the previous comment. – Koderok Aug 21 '13 at 16:40
  • Yes. You are right `mat[0][0]` will give content not the address. It was typing error. My bad. – haccks Aug 21 '13 at 16:42
  • If you think 2D array as 1D i.e `a[3][4]` as `a[4]`: having 4 elements each of which are an array of 3 elements then `a` will point to the first element of the array `a[4]`. You can say that now that `a`, `&a` and `&a[0]` are identical but are different from `a[0]` and `&a[0][0]`. `a[0]` is identical to `a[0][0]`. – haccks Aug 21 '13 at 16:51
  • Thanks! It's pretty clear. Only one thing that I need to understand now - for any general array `a`(say, 1-D array), why are `a` and `&a` identical? – Koderok Aug 21 '13 at 17:17
  • Always remember that **Each variable in the program occupies one or more bytes of memory; the address of the first byte is said to be the address of the variable** and also **the name of the array is a pointer to the first element**. `a` is pointer to first element (`a[0]`) while `&a` is pointer to array `a[]` and since *address of the first byte (`a[0]`) is address of variable `a[]`*, both seem to be identical but they are **not identical**! – haccks Aug 21 '13 at 17:42
2

This also means that in C arrays have no overhead. In some other languages the structure of arrays is

&a     -->  overhead
            more overhead
&a[0]  -->  element 0
            element 1
            element 2
            ...

and &a != &a[0]

Mario Rossi
2

Intuitively, now the reason is clear behind the output. But, considering how pointers are implemented in C, I can't understand how a and &a are equal. I am assuming that there is a variable a in memory which points to the array(and the starting address of this array-memory-block would be the value of this variable a).

Well, no. There is no such thing as an address stored anywhere in memory. There is only memory allocated for the raw data, and that's it. What happens is, when you use a naked a, it immediately decays into a pointer to the first element, giving the impression that the 'value' of a were the address, but the only value of a is the raw array storage.

As a matter of fact, a and &a are different, but only in type, not in value. Let's make it a bit easier by using 1D arrays to clarify this point:

#include <assert.h>
#include <stdbool.h>

bool foo(int (*a)[2]) {    //a function expecting a pointer to an array of two elements
    return (*a)[0] == (*a)[1];    //a pointer to an array needs to be dereferenced to access its elements
}
bool bar(int (*a)[3]);    //a function expecting a pointer to an array of three elements (declaration only)
bool baz(int *a) {    //a function expecting a pointer to an integer, which is typically used to access arrays.
    return a[0] == a[1];    //this uses pointer arithmetic to access the elements
}

int main(void) {
    int z[2] = {0, 0};    //initialized so that foo and baz read defined values
    assert((void *)z == (void *)&z);    //the value of both is the address of the first element.
    foo(&z);     //This works, we pass a pointer to an array of two elements.
    //bar(&z);   //Error, bar expects a pointer to an array of three elements.
    //baz(&z);   //Error, baz expects a pointer to an int

    //foo(z);    //Error, foo expects a pointer to an array
    //bar(z);    //Error, bar expects a pointer to an array
    baz(z);      //Ok, the name of an array easily decays into a pointer to its first element.
    return 0;
}

As you see, a and &a behave very differently, even though they share the same value.

cmaster - reinstate monica