17

So, I always knew that the array "objects" that are passed around in C/C++ just contained the address of the first object in the array.

How can the pointer to the array "object" and its contained value be the same?

Could someone point me towards more information about how all of that works, perhaps at the assembly level?

Jainendra
Jules G.M.
  • By the way you described it, you might not have the whole picture. Arrays decay into a pointer to the first element in specific cases. Being passed into a function is one of those cases. – chris Nov 26 '12 at 02:08
  • The first object in the array is located in the space occupied by the array. At the lowest memory address. So of course the addresses of the array and its first element are the same. – Daniel Fischer Nov 26 '12 at 02:10
  • If you pass an array to a function, would you perform memberwise copy of all the elements within? Surely not. So just the pointer to the first index is copied and access is given to the whole array. – Coding Mash Nov 26 '12 at 02:11
  • Arrays in C/C++ are not *"objects"*. In memory, they're represented merely by their elements. And in code, only conceptual rules and few simple syntaxes hold them together as what we recognize as arrays. – antak Nov 26 '12 at 02:24
  • @Julius: I think what you are thinking of as being called an "array" is more like `std::vector` in C++, not arrays. – user541686 Nov 26 '12 at 02:27
  • @antak Arrays are objects. They don't support assignment by value but everything else works fine. – Potatoswatter Nov 26 '12 at 03:14
  • There are only lvalues of array types, no rvalues, hence the whole confusion. (Trying to make an rvalue out of an array, obviously, leaves you with a pointer to the first member.) – Kos Nov 26 '12 at 09:46

6 Answers

17

Short answer: A pointer to an array is defined to have the same value as a pointer to the first element of the array. That's how arrays in C and C++ work.

Pedantic answer:

C and C++ have rvalue and lvalue expressions. An lvalue is something to which the & operator may be applied. They also have implicit conversions. An object may be converted to another type before being used. (For example, if you call sqrt( 9 ) then 9 is converted to double because sqrt( int ) is not defined.)

An lvalue of array type implicitly converts to a pointer. The implicit conversion changes array to &array[0]. This may also be written out explicitly as static_cast< int * >( array ), in C++.
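As a rough illustration of the conversion described above, the following sketch compares the implicitly converted array with `&array[0]` and with the explicit `static_cast`. The function names `decays_to_first_element` and `run_decay_checks` are made up for this example and do not come from the answer.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: true if the implicit conversion yields &array[0].
bool decays_to_first_element() {
    int array[4] = {1, 2, 3, 4};

    int *implicit = array;                       // array-to-pointer conversion
    int *spelled_out = &array[0];                // what the conversion produces
    int *with_cast = static_cast<int *>(array);  // same conversion, written out (C++ only)

    // The name `array` still has array type; only the converted value is a pointer.
    static_assert(std::is_same<decltype(array), int[4]>::value,
                  "array keeps its array type");

    return implicit == spelled_out && with_cast == spelled_out;
}

// Hypothetical usage: call this from main() to exercise the check.
void run_decay_checks() {
    assert(decays_to_first_element());
}
```

The `static_assert` is there to make the point that nothing about the declared type changes; the pointer only appears when the array is used in an expression that needs one.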

Doing that is OK. Casting to void* is another story. void* is a bit ugly. And casting with the parentheses as (void*)array is also ugly. So please, avoid (void*) a in actual code.

Potatoswatter
  • The `static_cast` and "casting with the parentheses" parts apply to C++, not to C. – aschepler Nov 26 '12 at 02:22
  • *"A pointer to an array is defined to have the same value as a pointer to the first element of the array"* Where did you get this from? I can't find it in the standard, although I only looked briefly. (I have a feeling it can be derived from knowing `sizeof` behavior of arrays and the fact that arrays are standard layout) – Pubby Nov 26 '12 at 02:23
  • @aschepler For `static_cast`, I mentioned that it's C++ only. Casting with parentheses in C is still not as good as `&array[0]`, and `void` still needs a good justification in C as it's also a source of problems. – Potatoswatter Nov 26 '12 at 02:24
  • Also, is comparing two `voids*` of differing types defined behavior? – Pubby Nov 26 '12 at 02:24
  • @Pubby Do `void*` really have "types"? – Etienne de Martel Nov 26 '12 at 02:32
  • @EtiennedeMartel I meant the types they were cast from. For instance, what if you were to compare a function pointer cast to `void*` and an `int*` cast on a Harvard architecture? – Pubby Nov 26 '12 at 02:38
  • the point was to compare the place where the information is stored in each case, regardless of their underlying types. – Jules G.M. Nov 26 '12 at 02:41
  • @NikB. Eh… it's missing standard quotes, Pubby's questions are legitimate but I don't have time right now. And now I notice that all arrays are lvalues so that's actually an irrelevant detail. Rvalue/lvalue only matters because there are no array rvalues (defect in the C standard) so pointer rvalues are used instead. – Potatoswatter Nov 26 '12 at 03:02
  • @Pubby: For question 1, For C, the reason that pointer-to-array and pointer-to-first-element are the same is because "When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object." Elsewhere it says that `void *` and `char *` are basically the same thing. – Dietrich Epp Nov 26 '12 at 03:36
  • @Pubby: For question 2, It is never undefined behavior to compare two valid `void *` pointers. The results are unspecified unless the pointers are equal, or both point to a location in the same block (including the "end" location). – Dietrich Epp Nov 26 '12 at 03:37
  • @Pubby: For question 3, that is a bit different. The C standard does not say that function pointer types can be converted to `void *` at all. It only says that object pointer types can be converted to `void *`. So when you wonder about a function pointer being cast to `void *`, you have already violated the standard before you even get to the comparison. – Dietrich Epp Nov 26 '12 at 03:39
  • @Potatoswatter: "An lvalue of array type implicitly converts to a pointer..." Why "specifically lvalue" ? Actually, both lvalues and rvalues of array type implicitly convert to pointers. The only language that didn't allow *rvalues* of array type to decay to pointers was C89/90 (a rather obscure difference from C++), but it was changed in C99. – AnT stands with Russia Nov 26 '12 at 15:39
12

You are mixing two unrelated (and, actually, mutually exclusive) things, which creates more confusion.

Firstly, you are correctly stating that "array objects that are passed around in C/C++ just contained the address of the first object in the array". The key words here are "passed around". In reality, arrays cannot be passed around as array objects. Arrays are not copyable. Whenever you use an array-style declaration in a function parameter list, it is actually interpreted as a pointer declaration, i.e. it is a pointer that you are "passing around", not the array. However, in such situations your equality does not hold:

void foo(int a[]) {
  assert((void *) &a == (void *) a); // FAIL!!!
}

The above assertion is guaranteed to fail - the equality does not hold. So, within the context of this question you have to forget about arrays that you "pass around" (at least for the syntax used in the above example). Your equality does not hold for arrays that have been replaced by pointer objects.

Secondly, actual array objects are not pointers. And there's no need to put the term object in quotation marks. Arrays are full-fledged objects, albeit with some peculiar properties. The equality in question does indeed hold for actual arrays that have not lost their "arrayness", i.e. array objects that have not been replaced by pointer objects. For example:

int a[10];
assert((void *) &a == (void *) a); // Never fails

What it means is that numerically the address of the entire array is the same as the address of its first element. Nothing unusual here. In fact, the very same (in nature) equality can be observed with struct types in C/C++

struct S { int x; } a;
assert((void *) &a == (void *) &a.x); // Never fails

I.e. the address of the entire struct object is the same as the address of its first field.
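The three snippets above can be put side by side in one compilable sketch. The function names (`parameter_case`, `array_case`, `struct_case`, `run_address_checks`) are invented for this illustration; the comparisons themselves are the ones from the answer.

```cpp
#include <cassert>
#include <type_traits>

struct S { int x; };

// Parameter declared with array syntax; its type is adjusted to int *.
bool parameter_case(int a[]) {
    static_assert(std::is_same<decltype(a), int *>::value,
                  "int a[] is really int *");
    // &a is where the local pointer is stored; a is the address stored in it.
    return static_cast<void *>(&a) == static_cast<void *>(a);
}

// A real array object: its address and its first element's address coincide.
bool array_case() {
    int a[10] = {};
    return static_cast<void *>(&a) == static_cast<void *>(a);
}

// A struct object: its address and its first member's address coincide.
bool struct_case() {
    S s = {0};
    return static_cast<void *>(&s) == static_cast<void *>(&s.x);
}

// Hypothetical usage: call this from main() to exercise all three cases.
void run_address_checks() {
    int buf[10] = {};
    assert(!parameter_case(buf));  // pointer parameter: equality does not hold
    assert(array_case());          // true array: equality holds
    assert(struct_case());         // struct and first member: equality holds
}
```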

AnT stands with Russia
  • @Potatoswatter: Well, that's the whole point of the first part of my answer: to explain that arrays that are "passed around" (OP's wording) are not really arrays. The OP seems to believe that "arrays are pointers when they are passed around". That is true (in some sense), so I wanted to get it out of the way first. – AnT stands with Russia Nov 26 '12 at 03:10
3

How can the pointer to the array "object" and its contained value be the same?

An array is a contiguous block of memory which stores several elements.
Obviously, the first element in the array is located at some address.
There's no data "in between" the first element and the beginning of the actual array.
Therefore, the first element has the same address as the array.
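One way to check this layout, sketched under the assumption that inspecting addresses as byte pointers is acceptable for illustration: each element should sit exactly `i * sizeof(int)` bytes past the start of the array, and the array's size should be exactly the sum of its elements. The names `elements_are_contiguous` and `run_layout_checks` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: verifies that element i starts i * sizeof(int) bytes
// past the start of the array object.
bool elements_are_contiguous() {
    int a[4] = {10, 20, 30, 40};
    const unsigned char *base = reinterpret_cast<const unsigned char *>(&a);

    for (std::size_t i = 0; i < 4; ++i) {
        const unsigned char *elem =
            reinterpret_cast<const unsigned char *>(&a[i]);
        if (elem != base + i * sizeof(int)) {
            return false;
        }
    }
    // The array occupies exactly the space of its four elements.
    return sizeof(a) == 4 * sizeof(int);
}

// Hypothetical usage: call this from main().
void run_layout_checks() {
    assert(elements_are_contiguous());
}
```

For `i == 0` the loop compares the address of the first element with the address of the array itself, which is the equality the question asks about.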

user541686
  • ok, this helps me understand this array-decay thing. so the lvalue of an array is always &a[0], and it gets decayed to a full on pointer when it's passed to a function. I thought that the memory for the array was created somewhere, and that the pointer to the first element was also created elsewhere at the same time to be passed around. What actually happens is that the array object creates the pointer to its first object when something tries to pass it(the array) around. – Jules G.M. Nov 26 '12 at 02:30
  • @Julius: Yes, you have most of the idea, but there's a tiny mistake in your comment: neither arrays nor their addresses are "l-values". They're both r-values. (You can't say `&a[0] = foo`, and you can't even say `a = foo`, so they can't go on the left-hand side of an expression. Therefore they're not *l*-values.) – user541686 Nov 26 '12 at 02:43
  • *There's no data "in between" the first element and the beginning of the actual array.* True, but this is not a good a priori assumption. – Potatoswatter Nov 26 '12 at 03:09
  • @Mehrdad Who says there's no space there? It's not something that you should assume. This answer seems to be using it as evidence to draw the conclusion that the first element has the same address as the array. If anything the logic is the other way around, although I couldn't find the text in the standard just now. – Potatoswatter Nov 26 '12 at 03:12
  • @Potatoswatter: You tell me -- how can you possibly put any space before an array and still make this work in C++? `int a[2][3] = { }; int *p = a; p += 2; p += 2; printf("%d", *p);` – user541686 Nov 26 '12 at 03:19
  • @Mehrdad I agree that it's true. However not every chain of true statements is a valid logical argument. – Potatoswatter Nov 26 '12 at 03:20
  • @Potatoswatter: I understand what you're saying, but no, I'm **not** saying it *happens* to be true. I'm certainly *not* drawing that conclusion from the evidence. Rather, I'm saying it's *impossible* given the well-definedness of the multidimensional pointer arithmetic in the C++ example in my last comment. – user541686 Nov 26 '12 at 03:21
  • @Mehrdad See en.wikipedia.org/wiki/Inference . And that code is not well defined, you used an uninitialized integral value. – Potatoswatter Nov 26 '12 at 03:23
  • @Potatoswatter: Yes, I know what inference means, thank you. You should see http://stackoverflow.com/a/8428835/541686 ... and the uninitializedness of the array was obviously a typo and irrelevant to the point; I fixed it for your reading pleasure though. – user541686 Nov 26 '12 at 03:23
2

Please read the following thread

http://www.cplusplus.com/forum/beginner/29595/

It basically explains that &a and a are not interchangeable because of the type difference (&a yields a pointer to the whole array, while a converts to a pointer to the first element), even though they both point to the same address.

Since you are casting them both to (void*), only the address value is compared and found to be equal, meaning that ((void*) a == (void*)&a), as you've stated. This makes sense, since the array's address has to be the same as the first element's.
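The type difference shows up as soon as pointer arithmetic is involved. The sketch below, with made-up function names, measures how many bytes each kind of pointer advances when 1 is added to it; the array size of 4 is an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstddef>

// Bytes advanced by adding 1 to a pointer to the whole array.
std::ptrdiff_t step_of_array_pointer() {
    int a[4] = {};
    int (*whole)[4] = &a;  // type: pointer to int[4]
    return reinterpret_cast<char *>(whole + 1) -
           reinterpret_cast<char *>(whole);
}

// Bytes advanced by adding 1 to a pointer to the first element.
std::ptrdiff_t step_of_element_pointer() {
    int a[4] = {};
    int *first = a;        // type: pointer to int
    return reinterpret_cast<char *>(first + 1) -
           reinterpret_cast<char *>(first);
}

// Hypothetical usage: call this from main().
void run_step_checks() {
    assert(step_of_array_pointer() ==
           static_cast<std::ptrdiff_t>(4 * sizeof(int)));
    assert(step_of_element_pointer() ==
           static_cast<std::ptrdiff_t>(sizeof(int)));
}
```

Both pointers start at the same address, but `whole + 1` skips the entire array while `first + 1` skips a single `int`.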

nonsensickle
1

Let's look at these two declarations:

int a[4];
int * b;

Both a and b can be used wherever an int * is expected (a is an array that converts to int * implicitly); for example, either can be passed as an argument to a function expecting int *:

void f(int * p);
f(a); // OK
f(b); // OK

In the case of a, the compiler allocates space for 4 int values. When you use the name a, such as when calling f(a), the compiler simply substitutes the address of the first of those int values, which it already knows.

In the case of b, the compiler allocates space for one pointer. When you use the name b, such as when calling f(b), the compiler generates code for retrieving the pointer value from the allocated storage.

When it comes to &, that's when the difference between a and b becomes apparent. & always means the address of the storage the compiler has allocated for your variable: &a is the address of those four int values (therefore coinciding with just a), while &b is the address of the pointer value. They have different types, too.

&a is not exactly the same as a, though, even though they compare as equal. They have a different type: &a is a pointer and a is an array. You can notice the difference, for example, if you apply the sizeof operator to these expressions: sizeof(a) will evaluate to the size of four int values, while sizeof(&a) is the size of a pointer.
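The `sizeof` and `&` observations can be checked with a small sketch. The struct `Sizes` and the functions `measure`, `pointer_is_separate_object`, and `run_size_checks` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record of the three sizes discussed above.
struct Sizes {
    std::size_t of_array;          // sizeof(a)
    std::size_t of_array_address;  // sizeof(&a)
    std::size_t of_pointer;        // sizeof(b)
};

Sizes measure() {
    int a[4];
    int *b = a;
    return Sizes{sizeof(a), sizeof(&a), sizeof(b)};
}

// &b is the address of the pointer's own storage, not of the ints it points to,
// whereas &a coincides with the address of the first int.
bool pointer_is_separate_object() {
    int a[4] = {};
    int *b = a;
    return static_cast<void *>(&b) != static_cast<void *>(b) &&
           static_cast<void *>(&a) == static_cast<void *>(b);
}

// Hypothetical usage: call this from main().
void run_size_checks() {
    Sizes s = measure();
    assert(s.of_array == 4 * sizeof(int));
    assert(s.of_array_address == sizeof(int (*)[4]));
    assert(s.of_pointer == sizeof(int *));
    assert(pointer_is_separate_object());
}
```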

Alexey Feldgendler
0

OK, so what I thought happened is that when you created an array, you allocated space for the array somewhere and you created a pointer to its first object somewhere else, and what you passed around in your code was the pointer.

This is actually the behavior of what happens when you create an array with new in C++ or with malloc in C/C++. As such,

int * a = new int[SIZE];
assert((void*)&a==(void*)a); // Always fails

What I learned is that for arrays declared in the style of int a[SIZE];, a pointer to the first element is created when you try to pass the array to a function (this is called array-pointer decay). It's interesting to note that, indeed, as AndreyT writes,

void foo(int a[]) {
     assert((void *) &a == (void *) a); // Always fails
}

This shows that it's only when you try to pass arrays around that a pointer is created for arrays in the style of int a[SIZE];.
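A minimal sketch of the heap case, assuming an arbitrary `SIZE` of 8: with `new`, the pointer variable and the array it points to are two separate objects, so the comparison fails. The function names are made up for the example.

```cpp
#include <cassert>

// Hypothetical helper: compares the address of the pointer variable with
// the address it holds.
bool heap_pointer_address_equals_value() {
    const int SIZE = 8;       // arbitrary size chosen for the example
    int *a = new int[SIZE];   // a is a pointer object; the array is on the heap
    bool same = static_cast<void *>(&a) == static_cast<void *>(a);
    delete[] a;               // release the array before returning
    return same;
}

// Hypothetical usage: call this from main().
void run_heap_checks() {
    assert(!heap_pointer_address_equals_value());
}
```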

Jules G.M.