2
#include <stdio.h>
#include <string.h>

int main(int argc, const char * argv[]) {
    char str[] = "hello";
    printf("%s, %p", str, str);
    return 0;
}

The code above gives the output

hello, 0x7fff5fbff7aa

What confuses me is this: how can str be a string and a pointer at the same time? I know that a string is a pointer to char, so I think str is just a pointer.

But how does the compiler know that %s gives the string that str points to?

Is this how the compiler works?

P.S

I suppose the same thing happens when we use %c and %i on a char variable and get different output.

Yu Hao
jinglei
  • Strings are not pointers but arrays of characters with terminating null-characters at the end of them. – MikeCAT Aug 16 '16 at 03:19
  • `str` in this code is not a pointer but an array. Arrays are automatically converted to pointers in expressions, with a few exceptions such as the operands of the unary `&` (address-of) and `sizeof` operators. – MikeCAT Aug 16 '16 at 03:21
  • "I know that" in a question here is always followed by a false statement – M.M Aug 16 '16 at 03:23
  • `%s` is not handled by the compiler but by `printf` in the runtime library. It *assumes* you are passing the correct type, it doesn't know. Pass the wrong type to `printf` and bad things happen (try it). – cdarke Aug 16 '16 at 07:00
  • You might also consider the difference to your code of replacing `char str[] = "hello";` with `const char * str = "hello";` – cdarke Aug 16 '16 at 07:03

5 Answers

4

In this code str is an array. Arrays and pointers are different. You can make a pointer that points to an element of an array.

In the code printf("%s, %p", str, str); both usages of str actually request a pointer that points to the first element of the array. You could write &str[0] to mean the same thing, but it was a design decision from the start in C that writing the name of an array in most situations would actually request such a pointer.

The printf function is defined so that if it sees %s then it follows (dereferences) the corresponding pointer and prints out characters until it reaches a null terminator. If it sees %p then it prints out some sort of representation of the pointer itself (not what the pointer is pointing to).

M.M
  • Now I understand the `printf()` part. But I still mix up arrays and pointers. I thought `str` stores the address of the first character of the string, so it's a pointer variable. – jinglei Aug 16 '16 at 03:49
  • @penguin-penpen In many situations, evaluating `str` causes it to *decay* into the address of the first character. But the variable itself is an array, not an address/pointer. See http://stackoverflow.com/q/1335786/1530508 – ApproachingDarknessFish Aug 16 '16 at 04:09
  • @penguin-penpen no it doesn't. You thought that by mistake because you didn't know about the rule that such a pointer can be formed to the array `str`. – M.M Aug 16 '16 at 04:17
2

In

char str[] = "hello";

the identifier str names an array of characters.

Arrays and pointers behave differently. For example:

sizeof(array);
// gives sizeof(element type) * number of elements in the array
sizeof(pointer);
// gives just the size of a pointer on your system, say 8 bytes

But an array passed to a function decays to a pointer to its first element, as in

printf("%s, %p", str, str);
// same as printf("%s, %p", &str[0], &str[0]);

Here str, though it is an array, is treated as a pointer to the first element of the array, i.e. &str[0].

You get different results simply because you used different format specifiers, %s and %p, which decide how the argument is printed.

sjsam
0

It is the printf function that prints different values from the same identifier str.

%s and %p are format specifiers. For %s, printf will print a string starting at the address given by str. For %p, printf will print the memory address given by str.

The compiler plays no special role here. However, it can warn you about a type mismatch between a format specifier and the corresponding argument. For example, given printf("%s",10); it can warn that printf expects a char * for %s but you are passing an int.

P.S.: Note that str is a char array, which is different from a char pointer, but when you pass it to a function it becomes a char pointer (char *) holding the address of its first element.

sps
0

This is an area where new C programmers struggle with three questions: what is an array, what is a pointer, and how is an array transformed to a pointer when it is passed as an argument in a function call? All are easily understood once you know the few simple rules that apply.

To begin with, take any variable, e.g. int a = 5;. The name a labels a location in memory, and the bytes that make up that location hold the immediate value 5. So when you make the assignment a = 5;, you are setting the memory labeled by a to the value 5. The key here is that, in this sense, every variable may be thought of as referring to somewhere in memory. The difference lies in what is stored there: the memory of a normal variable contains some immediate value (5 here), while the memory of a pointer variable contains the address of something else (i.e. a pointer is simply a variable that holds the address of an object of some type instead of an immediate value).

The classic example is worth looking at again

int a = 5;    /* 'a' names a memory location that holds the value 5              */

int *pa = &a; /* pointer-to-a 'pa' stores the memory address of 'a' as its value */

Applied to an array, the address of the first element is the starting address of the entire array. You can think of it this way:

int a[] = { 1, 2, 3, 4 }; /* where &a[0] is the memory address for the array,
                             a[i] = *(a + i), thus &(*(a + 0)), is simply 'a' */

So in your case with char str[] = "hello";, the character array is initialized with the characters given between the opening and closing " of the initializer, followed by a nul terminator (the same holds for any char array[] = "stuff";). The address of its first element is therefore the address of the first character of a nul-terminated sequence of characters.

The first element of the array, which sits at the address of the entire array, can be referenced by array[0], which as we have seen above is equivalent to *(array + 0), or just *array. Applying the unary & operator to get its address, &(*array), gives simply array.

That is why you can pass str (which becomes the address of the first element) for the format specifier %s to print it as a character string, and why you can pass str for the format specifier %p to print the pointer value.

Hopefully, your question is answered at this point, but...

That also goes a long way toward understanding why, when the character array str is passed as an argument in a function call, str (or any array type) is passed as a pointer (i.e. the first level of indirection in any array is converted to a pointer when passed as an argument to a function). The historic reason arrays are passed as pointers simply has to do with saving memory: instead of passing a copy of all elements in an array, only the address of the first element is required.

David C. Rankin
0

In C, a string is a sequence of character values followed by a 0-valued terminator. For example, the character sequence {'H', 'e', 'l', 'l', 'o', 0} is a string, but {'H', 'e', 'l', 'l', 'o'} is not - that 0 terminator makes the difference.

Strings (including string literals) are stored as arrays of char. Given the declaration

char str[] = "Hello";

you get something like

     +---+
str: |'H'| str[0]
     +---+ 
     |'e'| str[1]
     +---+ 
     |'l'| str[2]
     +---+
     |'l'| str[3]
     +---+
     |'o'| str[4]
     +---+
     | 0 | str[5]
     +---+

in memory. Note that no storage is set aside for a pointer to the first element of the array.

Under most circumstances, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T" and the value of the expression will be the address of the first element of the array. The exceptions to this rule are when the array expression is the operand of the sizeof or unary & operator, or when the expression is a string literal used to initialize an array in a declaration.

So, let's take the following code:

char str[] = "Hello";
char *ptr  = "World";

printf( "%s, %s\n", str, ptr );

The string literals "Hello", "World", and "%s, %s\n" are stored as arrays of char such that they are allocated at program startup and available over the lifetime of the program.

"Hello", "World", "%s, %s\n", and str are all array expressions (they all have type "N-element array of char"). In the declaration of ptr, the "World" array expression is not the operand of the sizeof or unary & operators, nor is it being used to initialize an array of char, so the expression is converted ("decays") to type "pointer to char", and the value of the expression is the address of the first element of the array, so ptr winds up pointing to the first character of "World".

Similarly, in the printf call, the array expressions "%s, %s\n" and str are not the operands of the sizeof or unary & operators, so they too are converted to pointer expressions, and those pointer values are actually what get passed to printf.

However, in the declaration of str, the "Hello" string literal is being used to initialize an array of char, so it is not converted to a pointer expression; instead, str is initialized with the contents of the string literal, and its size is determined by the size of the literal as well.

Here's a concrete memory map for the code above that I generated on my system:

       Item        Address   00   01   02   03
       ----        -------   --   --   --   --
    "Hello"       0x400b91   48   65   6c   6c    Hell
                  0x400b95   6f   00   30   30    o.00

    "World"       0x400b60   57   6f   72   6c    Worl
                  0x400b64   64   00   25   73    d.%s

 "%s, %s\n"       0x400b66   25   73   2c   20    %s,.
                  0x400b6a   25   73   0a   00    %s..

        str 0x7fff7cec1a50   48   65   6c   6c    Hell
            0x7fff7cec1a54   6f   00   00   00    o...

        ptr 0x7fff7cec1a48   60   0b   40   00    `.@.
            0x7fff7cec1a4c   00   00   00   00    ....

The string literal "Hello" is stored starting at address 0x400b91, "World" is stored starting at address 0x400b60, and the format string "%s, %s\n" is stored starting at address 0x400b66 (for whatever reason, the compiler put "World" and "%s, %s\n" right next to each other).

The array str is stored starting at address 0x7fff7cec1a50, and it contains a copy of the contents of the string literal "Hello". The pointer ptr is stored starting at address 0x7fff7cec1a48 and contains the address of the string literal "World" (x86 stores multi-byte values like pointers in little-endian order).

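The printf call will receive the pointer values 0x400b66, 0x7fff7cec1a50, and 0x400b60: the address of the format string, the address of the first element of str, and the value stored in ptr (the address of "World"), respectively. The %s conversion specifier in the format string says "print the sequence of characters starting at this address and continue until you see the 0 terminator".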

John Bode