What's the difference between int array[] and int* array and where is address of array stored (C)?

Question

Suppose we have a program like this

#include <stdio.h>

int main(void) {
    int array[3] = { 1, 2, 3 };
    int* ptr = array; // <--- Didn't have to use a "&"
    printf("%d\n", *array);
    printf("%d\n", *ptr);
    
    return 0;
}

We would expect to get:

1
1

My questions are

I read here that the "array" is not "lvalue". What does this mean?
Is the "array" just a name for a block of memory? If it is, where is the address of that block stored? int* ptr = array implies that the address of the "array" block has to be stored in the "array", right?
How is it different from something like this? Isn't the "point" also a name for a block of memory?

#include <stdio.h>

struct Point { int x; int y; };
int main(void) {
    struct Point point = { 1, 2 };
    struct Point* ptr = &point; // <--- Have to use a "&"
    printf("%d\n", point.x);
    printf("%d\n", ptr->x);

    return 0;
}

score 2 · Answer 1 · answered Oct 03 '22 at 19:57

While the whole concept of "lvalue" is complicated, in this case it mainly means that you can't assign to it. You can't do array = something;. But you can do ptr = something;, because ptr is an lvalue.
The details of data storage are implementation-dependent, but usually an automatic array is stored in the stack frame, just like any other automatic variable.
The difference is that in many contexts, an array "decays" into a pointer to its first element. So when you write

int *ptr = array;

it's equivalent to

int *ptr = &array[0];

Barmar

Is the "decaying" process something that the compiler does or is it a runtime "feature"? I assume that I should learn assembly to fully understand that? – Dusan Djordjic Oct 03 '22 at 20:12
It's something the compiler does. It basically just treats `array` as if you wrote `&array[0]`. – Barmar Oct 03 '22 at 20:15
This should be explained in the chapter on arrays in any C textbook or tutorial. – Barmar Oct 03 '22 at 20:16

John Bode · Accepted Answer · 2022-10-04T14:14:05.297

An lvalue is an expression of object type other than void that potentially designates an object (a chunk of memory that can potentially store values), such that the object may be read or modified. Lvalues may include variable names like x, array subscript expressions like a[i], member selection expressions like foo.bar, pointer dereferences like *p, etc. A good rule of thumb is that if it can be the target of the = operator, then it's an lvalue.

Arrays are weird. An array expression is an lvalue, but it's a non-modifiable lvalue; it designates an object, but it cannot be the target of an assignment. When you declare an array in C like

int a[N];

what you get in memory looks something like this:

   +---+
a: |   | a[0]
   +---+
   |   | a[1]
   +---+
   |   | a[2]
   +---+
    ...

There's no object a that's separate from the individual array elements; there's nothing to assign to that's named a. a represents the whole array, but C doesn't define the = operator to work on a whole array.

Brief history lesson - C was derived from an earlier language named B, and when you declared an array in B:

auto a[N];

you got something like this:

   +---+
a: |   | -------------+
   +---+              |
    ...               |
   +---+              |
   |   | a[0] <-------+
   +---+
   |   | a[1]
   +---+
   |   | a[2]
   +---+
    ...

In B, a was a separate object that stored an offset to the first element of the array. The array subscript operation a[i] was defined as *(a + i) - given a starting address stored in a, offset i words¹ from that address and dereference the result.

When he was designing C, Ritchie wanted to keep B's array behavior (a[i] == *(a + i)), but he didn't want to keep the explicit pointer that behavior required. Instead, he created a rule that any time an array expression isn't the operand of the sizeof or unary & operators (and isn't a string literal used to initialize an array), it is converted, or "decays", from type "N-element array of T" to "pointer to T" and the value of the expression is the address of the first element.

The identity a[i] == *(a + i) works the same as it did in B, but instead of storing the address of the first element in a, we compute that address as we need it (this is done during translation, not at runtime). But it means you can use the [] subscript operator with pointers as well, so ptr[i] does the same thing:

   +---+                            +---+
a: |   | a[0] (ptr[0]) <------ ptr: |   |
   +---+                            +---+
   |   | a[1] (ptr[1])
   +---+
   |   | a[2] (ptr[2])
   +---+
    ...

And this is why a cannot be the target of an assignment - under most circumstances, it "decays" to a pointer value equivalent to &a[0], and values cannot be the target of an assignment.

You cannot change the address of something - you can only change the value stored at a given address.

¹ B was a typeless language - everything was stored as a word.

That's exactly what confused me; I was imagining it to work as it did in B. Thank you very much. – Dusan Djordjic Oct 04 '22 at 08:29

score 0 · Answer 3 · answered Oct 03 '22 at 21:13

I read here that the "array" is not "lvalue". What does this mean?

Presumably the author meant that C does not define behavior for whole-array assignment. That is, this does not conform to the language specification:

    int array1[3] = { 1, 2, 3 };
    int array2[3] = array1;      // NOT ALLOWED

    array2 = array1;             // NOT ALLOWED

HOWEVER, that is not consistent with the definition of the term "lvalue" used by the language spec:

An lvalue is an expression (with an object type other than void) that potentially designates an object [...]

The name “lvalue” comes originally from the assignment expression E1 = E2, in which the left operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an object “locator value”.

(C17, paragraph 6.3.2.1/1 and footnote 65)

In terms of that definition, array is an lvalue. But it is not a modifiable lvalue.

Is the "array" just a name for a block of memory?

Yes, that's a reasonable way to look at it. And quite in line with the above definition of "lvalue".

If it is, where is the address of that block stored?

Why does the address need to be stored anywhere?

int* ptr = array implies that the address of the "array" block has to be stored in the "array", right?

No. It implies that the compiler has to have a way to associate the name array with the storage it represents, so that the compiled program behaves correctly at runtime.

In practice, yes, there needs to be some representation of the location of the array inside the compiled program, but that representation is not part of the C semantics of the program. It is not accessible as a variable, and certainly not from the storage attributed to the array itself. For example, it might exist only as a numeric operand to certain machine instructions.

How is it different from [a variable of struct type]? Isn't the "point" also a name for a block of memory?

Yes, "point" is also a name for a block of memory. And in the terminology of the C specs, both your array and your point, where in scope, are lvalues. An array is not particularly different in this regard from an object of any other type. Every object can be regarded as a block of storage, and thus, every variable's identifier can be regarded as a name for a block of storage.
