
Below are the low-level differences I've found between a one-dimensional array and a pointer that refers to what would be the equivalent of a one-dimensional array. Here is the Compiler Explorer showing these differences, with the code below:

#include <stdio.h>

void function(void) {

    // 1. array has elements stored in contiguous memory
    //    whereas ptr is a single address value (8 bytes on typical 64-bit targets)
    int nums[] = {1,2,3};
    int* ptr_nums = &nums[0];

    // 2. compile-time sizeof is different
    //    ptr is typically 8 bytes on 64-bit targets, arr = unit_size * size
    printf("Sizeof: %zu\n", sizeof(nums));
    printf("Sizeof: %zu\n", sizeof(ptr_nums));

    // 3. assignment-to-ptr is valid, but not to array
    //    by extension, in-place arithmetic (+=, ++) on ptr is allowed, not on array
    // nums += 1;   // invalid
    ptr_nums = ptr_nums + 2;

    // 4. string-literal initialization is 'literal' rodata value
    //    for pointer, but contiguous chars in memory for array
    char name[] = "ABC"; // something like: mov $6513249, -24(%rbp)
    char* name_ptr = &name[0]; // will not create string literal
    char* name_ptr2 = "QCI"; // pointer to rodata string literal

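    // 5. address-of operator
    // &array yields the address of the whole array (type int (*)[3]),
    // which has the same value as the address of its first element
    // &ptr returns the address of the pointer object itself
    // (which *would not* be the same as the first element of the array if it pointed to that)
    printf("%p\n", (void*)&nums);
    printf("%p\n", (void*)&ptr_nums);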

}

Are there any other differences that I may be missing?

David542
  • @EricPostpischil You are right, I am deleting my comment to avoid confusing other readers. I guess I have a gap in my understanding of how memory is allocated for pointers. A similar initialization for an `int` would not work. – Zois Tasoulas Feb 02 '21 at 03:07

1 Answer


I am puzzled by the purpose of the question. It’s like asking “what is the difference between an int and a struct” - it seems entirely arbitrary and the answer is of little use. Arrays are not pointers. That’s all. The decay doesn’t somehow link them inseparably, it’s just a convenience: it lets you use the name of the array in place of a pointer to the first element of the array, in many contexts where a pointer would fit.

Obviously, such a “decayed” pointer is not an lvalue, so you can’t modify it: it’s phantom. Your question seems to be more about “how do lvalues and rvalues differ, and how can I tell” - and you have clearly answered that. Trying array += 1 and seeing it fail is equivalent to trying 5 += 1. You can’t expect anything else, it’d make no sense. In C, an array is not a modifiable lvalue, it’s a sort of a bastard, since once you have it in scope, you can’t use it for much: only sizeof and & of the array itself. For everything else, it decays to an rvalue pointer. Note: not a pointer to rvalue, for you can’t have one. The pointer itself is an rvalue. E.g. &(foo[1]) first decays the array, since it has no other use, and then does pointer arithmetic as if foo were a pointer. Rvalues are immutable and have no storage, i.e. you can’t take their address, among other things.
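As a minimal sketch of the decay described above (the function name is made up for illustration and is not from the question), the asserts below check that indexing is just pointer arithmetic on the decayed pointer, and that only a real pointer object can be modified:

```c
#include <assert.h>

/* Hypothetical helper for this sketch: returns 1 when indexing and
   pointer arithmetic on a decayed array agree. */
int decay_matches_pointer_arithmetic(void) {
    int foo[3] = {10, 20, 30};

    int *first = foo;              /* foo decays to &foo[0] */
    assert(first == &foo[0]);

    /* foo[1] is defined as *(foo + 1), so these name the same element. */
    assert(&foo[1] == foo + 1);
    assert(foo[1] == *(foo + 1));

    /* foo += 1;  would not compile: the decayed pointer is not an lvalue */
    first += 1;                    /* fine: first is an ordinary pointer object */

    return *first == 20;
}
```

Calling `decay_matches_pointer_arithmetic()` from `main` should return 1 without tripping any assert.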

Again: an array is not an rvalue. An array is a value with storage, but there’s very little syntax that can actually operate on it. C helps out and decays the array when you attempt to use it as if it were a pointer, but that pointer does not exist as an lvalue that you could change. It’s only an rvalue, synthesized on the fly, just as integer literals synthesize rvalues on the fly: you can use them, but only to the extent that rvalues can be used.
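The two contexts where the array does not decay, sizeof and &, can be sketched roughly like this. The helper name is an assumption made for the example; the point is that `&arr` has type `int (*)[4]`, so stepping it by one skips the whole array:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for this sketch: returns how many ints lie
   between &arr and &arr + 1. */
size_t whole_array_step(void) {
    int arr[4] = {0};

    /* sizeof sees the array itself, not a decayed pointer. */
    assert(sizeof(arr) == 4 * sizeof(int));

    /* &arr points to the whole array: same address as &arr[0], different type. */
    int (*whole)[4] = &arr;
    assert((void *)whole == (void *)&arr[0]);

    /* Stepping by one moves past all four elements, not just one. */
    return (size_t)((char *)(whole + 1) - (char *)whole) / sizeof(int);
}
```

Here `whole_array_step()` returns 4, whereas the same computation on `&arr[0]` would give 1.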

This fundamental difference between rvalues and lvalues is among the foundations of the language, and it’s very hard to make much sense of C without a firm and absolute grasp of that concept.

To further confuse things, the array definition syntax doesn’t always define an array. For example:

#include <assert.h>

void foo(int notAnArray[10]) {
  int anArray[10];
  assert(sizeof(notAnArray) != sizeof(anArray));
}

void bar(int *notAnArray) {
  int anArray[10];
  assert(sizeof(notAnArray) != sizeof(anArray));
}

C’s semantics dictate that foo and bar are identical (other than their name): the two are just different syntaxes that have identical meaning. Worse yet, there are cases where the first syntax may arguably have some self-documenting uses, even though it’s otherwise completely bonkers.
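One way to see that the bracketed parameter really is a pointer, sketched with made-up function names: the parameter can be incremented like any pointer object, and the function is assignable to an `int (*)(int *)` without a cast.

```c
#include <assert.h>

/* Hypothetical helpers, named for this sketch only. */

/* Despite the brackets, values has type int *. */
int sum_until_zero(int values[10]) {
    int sum = 0;
    for (; *values != 0; ++values)   /* ++values is legal only because values is a pointer */
        sum += *values;
    return sum;
}

int params_are_pointers(void) {
    int data[10] = {1, 2, 3, 0};     /* remaining elements are zero-initialized */

    /* The function's type is int (int *), so no cast is needed here. */
    int (*fn)(int *) = sum_until_zero;

    assert(sum_until_zero(data) == 6);
    return fn(data);
}
```

Here `params_are_pointers()` returns 6. Inside `sum_until_zero` the `[10]` is not enforced in any way, which is why the loop relies on a zero sentinel rather than on a length.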

Kuba hasn't forgotten Monica
  • thanks, could you please clarify what you mean by `“decayed” pointer is not an lvalue` -- you can modify the pointer address itself though, right? Or do you mean the array now isn't an lvalue? – David542 Feb 02 '21 at 02:58
  • @David542: `ptr = array + 2` is legal because it uses `array` as an rvalue in the right hand side (doing pointer math exactly equivalent to `&array[2]` - that's literally how the `[]` operator is defined), but `array = array+2` or `array += 2` aren't legal because `array` isn't an lvalue. – Peter Cordes Feb 02 '21 at 03:02
  • lvalues and rvalues are fundamental concepts - look them up. “You can modify the pointer address itself though” - no. You can never do that. You can change the value of a pointer, not its address. A value of a pointer is the address of whatever it points to. The address of a pointer is the immutable storage address where that pointer resides. You can’t change it, just as you can’t change the address of an array, or of any other variable in fact. In C and C++ world, references are immutable, and the name of a variable is a reference. It can’t be made to refer to something else. Ever. – Kuba hasn't forgotten Monica Feb 02 '21 at 03:02
  • @Kubahasn'tforgottenMonica "*such a “decayed” pointer is not an lvalue*" This could use some clarification for "*such*", since "*decayed pointers*" can mean different things in different contexts. Technically when calling `strlen("hello")` what gets passed as the argument is a "*decayed pointer*", though that's a full fledged pointer in its own. – dxiv Feb 02 '21 at 03:08
  • @Kubahasn'tforgottenMonica I see, thanks for the clarification. What I meant wasn't clear but being able to advance the address a pointer refers to, such as by doing `int * start; ... start++` to advance the pointer to the next (array) element. – David542 Feb 02 '21 at 03:11
  • @dxiv Not quite. First you get an rvalue array literal, then it decays to a pointer rvalue, and that is then assigned to the parameter of `strlen`. The pointer lvalue only exists inside `strlen`, the same as the integer parameter when calling `abs(1)`: whatever object you deal with inside of `abs` is *not* the same object that was passed as a parameter: an rvalue-to-lvalue assignment is done behind your back to make it all work and be practical ;) – Kuba hasn't forgotten Monica Feb 02 '21 at 03:18
  • @David542 The pointer is an lvalue so you can change it. The fact that it happens to point to some array element is completely irrelevant. It behaves like an lvalue: you can change it. The name of an array is interpreted as one of two things: an rvalue pointer to the first element of the array, or a means of taking the size and address of the array. To further the confusion, the array syntax when used in function parameter lists **does not declare any arrays**. E.g. `void myfun(char foo[10])` does not involve arrays in any way whatsoever. It’s a redundant syntax for `char *`. Nothing else. – Kuba hasn't forgotten Monica Feb 02 '21 at 03:23
  • Basically, you need to understand how rvalues differ from lvalues, and in which contexts they arise, and then just remember that in C the name of the array never acts like an lvalue, but unfortunately array syntax doesn’t always declare an array (it declares an lvalue pointer in function parameter list, but an array when it’s a usual variable declaration). C’s syntax is arcane, and a result of the early evolution of the language and convenience of compiler writers. Arrays are very limited types in C and nothing much is done to them once they begin their life. In context it makes sense. – Kuba hasn't forgotten Monica Feb 02 '21 at 03:30
  • You may wish to comment on the fact that `int arr[]` for a function argument is slightly different [it is treated as a pointer]: `int func(int arr[]) { int count = 0; for (; *arr != 0; ++arr) count += 1; return count; }`. And, `for (; arr[0] != 0; ++arr)` also works. – Craig Estey Feb 02 '21 at 03:33
  • On the other hand, there are languages where they got tired of rvalues being on a pedestal, so there they are objects with storage, and lvalues too. The literals then become just references to those global objects. Yeah, some languages let you set the value of the global object `42`, for example. You can make `42` refer to zero or something else. Fun, eh? If you investigate this you’ll learn that those languages are not esoteric at all. Mainstream, even. But you should enjoy figuring it out on your own! :) – Kuba hasn't forgotten Monica Feb 02 '21 at 03:35
  • @Kubahasn'tforgottenMonica I understand where you are coming from, and your use of "*decay*" is probably closest to the strict sense of the standard quoted for example in [this answer](https://stackoverflow.com/a/1462103). However, in less formal speak, "*pointer decay*" is also used to refer to an actual pointer that the array decayed *into*, not just the "*phantom*" abstraction. Common cases are direct assignment `char *p = "hello"` or function calls `strlen("hello")`. In that other thread, most answers focus on this looser sense. That's why "*could use some clarification*", nothing more ;-) – dxiv Feb 02 '21 at 03:36
  • @dxiv The rvalue pointer :). That’s the distinction that gets people. And also the reason for it: it hides the fact that C doesn’t support array types other than letting you allocate some storage and initialize it and determine how big the storage is as long as the original name is in scope. Other than that, C has no support for arrays whatsoever. The term pointer decay is all fine, but it somewhat hides the fact that C knows pointer arithmetic only. Ergo, the pointer decay is the only way to use arrays, but that rvalue pointer is just like any other rvalue pointer, say `((char*)0x01234567)`. – Kuba hasn't forgotten Monica Feb 02 '21 at 03:53
  • C (unlike C++) is a bit more powerful than your comment might indicate. e.g. you can write `void foo(int n, float arr2d[n][n])` because C99 allows VLAs, and `arr2d` has type pointer-to-*array*, where the size of the pointed-to array is `n`, so `sizeof(arr2d[0])` "works" even though `sizeof(arr2d)` only gives you `sizeof(float*)`. C support for arrays is quite *limited*, for sure though. – Peter Cordes Feb 02 '21 at 04:19
  • @Kubahasn'tforgottenMonica thank you I agree. Out of curiosity, are there any sources/books/articles you can recommend for a deeper understanding of lvalue vs. rvalues? The one beginner C book I've read basically just touched on it in the first few chapters saying something more-or-less like "an l-value is an assignable storage unit, and r-value is a value...so `int x (l) = 4 (r);` – David542 Feb 02 '21 at 04:57
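
For reference, the variable-length array parameter mentioned in the comments above can be sketched as follows. The function names are invented for the example, and it assumes a compiler that supports variably modified types (introduced in C99, optional in C11). The parameter `arr2d` is adjusted to a pointer to a row of `n` floats, so `sizeof(arr2d[0])` still reports the row size:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, named for this sketch only. */

/* arr2d is adjusted to float (*)[n]: a pointer to a row of n floats. */
size_t row_bytes(int n, float arr2d[n][n]) {
    return sizeof(arr2d[0]);       /* evaluated at run time: n * sizeof(float) */
}

size_t row_bytes_demo(void) {
    float grid[3][3] = {{0}};
    return row_bytes(3, grid);     /* grid decays to float (*)[3] */
}
```

Here `row_bytes_demo()` returns `3 * sizeof(float)`, while `sizeof(arr2d)` inside `row_bytes` would only give the size of a pointer.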