
This program works in C:

#include <stdio.h>


int main(void) {
    char a[10] = "Hello";
    char *b = a;
    
    printf("%s",b);
}

There are two things I would expect to be different. The first is the second line of main, where I would expect to write "char *b = &a", so that the program looks like this:

#include <stdio.h>


int main(void) {
    char a[10] = "Hello";
    char *b = &a;
    
    printf("%s",b);
}

But this does not work. Why is that? Isn't this the correct way to initialize a pointer with an address?

The second thing is the last line, where I would expect to write printf("%s",*b), so that the program looks like this:

#include <stdio.h>


int main(void) {
    char a[10] = "Hello";
    char *b = a;
    
    printf("%s",*b);
}

But this gives a segmentation fault. Why does this not work? Aren't we supposed to write "*" in front of a pointer to get its value?

user394334
  • 1
    `char` arrays are already pointers to an array of char. `char a[]` is more or less equivalent to `char* a`, for our purposes here. Consequently, you don't have to take the address of `a` in the first example, and you don't have to dereference `b` in the printf, in the second example. Remove the `&` from `a` in the first example, and remove `*` from `b` in the second example, and your code should work. – Robert Harvey May 27 '22 at 18:36
  • 1
    As for the last snippet - `*b` is the same as `b[0]`, which is the same as `a[0]`: its type is `char` and its value is `'H'`. Passing it where the `%s` specifier expects a pointer argument produces the expected memory violation. – Eugene Sh. May 27 '22 at 18:36
  • `&a;` gives you a pointer to the variable `a` but you need a pointer to the beginning of the char array which is the content of `a`; `&a` is a `char**`. If you did `*b` in the `printf` call it would probably work. The second example doesn't work because `*b` is just the letter `H` and you're passing this as a *memory address of a `char*` (a string)* to `printf`. – Luatic May 27 '22 at 18:37
  • `a` will decompose to a pointer of type `char *`. `&a` is a pointer to a character array of length 10, or `char (*)[10]` – Christian Gibbons May 27 '22 at 18:38
  • 2
    Does this answer your question? [What is array to pointer decay?](https://stackoverflow.com/questions/1461432/what-is-array-to-pointer-decay) – Karl Knechtel May 27 '22 at 18:48
  • @LMD I read that ** means pointer to a pointer, so can you please tell me which pointer &a points to? – user394334 May 27 '22 at 22:16
  • @user394334 `a` is not a pointer. So `&a` does not point to a pointer. `&a` points to an array. – Steve Summit May 28 '22 at 11:34
  • @user394334 You might also be interested in [this question](https://stackoverflow.com/questions/72406136) and [its answer](https://stackoverflow.com/questions/72406136/why-is-type-char-arr-incompatible-with-char-arr-and-how-to-fix-it/72406449#72406449). – Steve Summit May 28 '22 at 11:37

2 Answers

3

There is a special rule in C. When you write

char *b = a;

you get the same effect as if you had written

char *b = &a[0];

That is, you automatically get a pointer to the array's first element. This happens any time you try to take the "value" of an array.

Aren't we supposed to write "*" in front of a pointer to get its value?

Yes, and if you wanted to get the single character pointed to by b, you would therefore need the *. This code

printf("first char: %c\n", *b);

would print the first character of the string. But when you write

printf("whole string: %s\n", b);

you get the whole string. %s prints multiple characters, and it expects a pointer. Down inside printf, when you use %s, it starts at the address you passed and prints one character after another until it reaches the terminating '\0'.

Steve Summit
  • Thank you very much! I just have one follow up question. If b is only pointing to the first element, how does the computer know that it should print all the elements in the last example? – user394334 May 27 '22 at 18:54
  • 1
    @user394334 Because down inside `printf`, when it's doing `%s`, there's basically this loop: `while(*p != '\0') putchar(*p++);`. – Steve Summit May 27 '22 at 18:59
  • 1
    That is, `printf("%s", p)` is roughly equivalent to `for(int i = 0; i < strlen(p); i++) printf("%c", p[i]);`. – Steve Summit May 27 '22 at 19:00
  • Just one more question (if you have time). Does "&a" give us the address of the array as a whole, or what does it give us the address of? – user394334 May 27 '22 at 19:10
  • 1
    @user394334: yes, it gives the address of the array as a whole. That will be the same value as the address of the 0th element, but it has a different type. – Chris Dodd May 27 '22 at 19:11
  • @ChrisDodd In the comments above it says that &a is a char**, do you agree with this, if yes, what pointer does it point to? – user394334 May 28 '22 at 11:32
  • @user394334 See questions [6.12](http://c-faq.com/aryptr/aryvsadr.html) and [6.13](http://c-faq.com/aryptr/ptrtoarray.html) in the [C FAQ list](http://c-faq.com/). – Steve Summit May 28 '22 at 11:35
  • 1
    @user394334 - no, `&a` is a `char (*)[10]` -- a pointer to the whole array. The comment above is wrong. – Chris Dodd May 28 '22 at 19:33
2

Expanding on Steve's answer (which is the correct one to accept)...

This is the special rule he's talking about:

6.3.2.1 Lvalues, arrays, and function designators
...
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
C 2011 Prepublication Draft

Arrays are weird and don't behave like other types. You don't get this "decay to a pointer to the first element" behavior in other aggregate types like struct types. You can't assign the contents of an entire array with the = operator like you can with struct types; for example, you can't do something like

int a[5] = {1, 2, 3, 4, 5};
int b[5];
...
b = a; // not allowed; that's what "is not an lvalue" means

Why are arrays weird?

C was derived from an earlier language named B, and when you declared an array in B:

auto arr[5];

the compiler set aside an extra word to point to the first element of the array:

     +---+
arr: |   | ----------+
     +---+           |
      ...            |
     +---+           |
     |   | arr[0] <--+
     +---+
     |   | arr[1]
     +---+
     |   | arr[2]
     +---+
     |   | arr[3]
     +---+
     |   | arr[4]
     +---+

The array subscript operation arr[i] was defined as *(arr + i) - given the starting address stored in arr, offset i elements from that address and dereference the result. This also meant that &arr would yield a different value from &arr[0].

When he was designing C, Ritchie wanted to keep B's array subscripting behavior, but he didn't want to set aside storage for the separate pointer that behavior required. So instead of storing a separate pointer, he created the "decay" rule. When you declare an array in C:

int arr[5];

the only storage set aside is for the array elements themselves:

     +---+
arr: |   | arr[0]
     +---+ 
     |   | arr[1]
     +---+
     |   | arr[2]
     +---+
     |   | arr[3]
     +---+
     |   | arr[4]
     +---+

The subscript operation arr[i] is still defined as *(arr + i), but instead of storing a pointer value in arr, a pointer value is computed from the expression arr. This means &arr and &arr[0] will yield the same address value, but the types of the expressions will be different (int (*)[5] vs int *, respectively).

One practical effect of this rule is that you can use the [] operator on pointer expressions as well as array expressions - given your code you can write b[i] and it will behave exactly like a[i].

Another practical effect is that when you pass an array expression as an argument to a function, what the function actually receives is a pointer to the first element. This is why you often have to pass the array size as a separate parameter, because a pointer only points to a single object of the specified type; there's no way to know from the pointer value itself whether you're pointing to the first element of an array, how many elements are in the array, etc.

Arrays carry no metadata around, so there's no way to query an array for its size, or type, or anything else at runtime. The sizeof operator is evaluated at compile time, not runtime (the one exception is a variable-length array, whose size is computed at runtime).

John Bode
  • *The sizeof operator is computed at compile time, not runtime.* And so it was until VLA's came along... – Steve Summit May 27 '22 at 22:51
  • @SteveSummit: gah. Yes. Was wondering what part of my answer I would screw up. – John Bode May 27 '22 at 23:06
  • 1
    Well, I'd say it's about an equally likely proposition that your answer is screwed up, or that VLA's are screwed up! :-) (I mean, VLA's shouldn't exist, because it's true, arrays carry no metadata around, and `sizeof` is computed at compile time, and...) – Steve Summit May 27 '22 at 23:16