16

Say I have a simple function that returns a C string this way:

const char * getString()
{
  const char * ptr = "blah blah";
  return ptr; 
}

and I call getString() from main() this way:

  const char * s = getString();

1) According to gdb, the variable ptr is stored on the stack, but the string pointed to by ptr is not:

(gdb) p &ptr
$1 = (const char **) 0x7fffffffe688

(gdb) p ptr
$2 = 0x4009fc "blah blah"

Does this mean that "blah blah" is not a local variable inside getString()?

I guess that if it were a local variable, I would not be able to pass it to my main() function... But if it's not, where is it stored? On the heap? Is that a "kind of" dynamic memory allocation performed by the OS every time it encounters a string, or what?

2) If I use an array instead of a pointer, this way:

const char *getString2()
{
  const char a[] = "blah blah blah";
  return a;
}

the compiler warns me that:

warning: address of local variable ‘a’ returned

(and of course the program compiles, but it doesn't work).

Actually, if I ask gdb, I get

(gdb) p &a
$2 = (const char (*)[15]) 0x7fffffffe690

But I thought that const char * ptr and const char a[] were basically the same thing. Looks like they're not.

Am I wrong? What exactly is the difference between the two versions?

Thank you!

laurids

5 Answers

12

When you write

const char *ptr = "blah blah";

then the following happens: the compiler generates a string (an array of type char [], which must nevertheless not be modified) with the contents "blah blah" and stores it somewhere in the data segment of the executable, typically in a read-only part. It has static storage duration, just like variables declared using the static keyword.

Then, the address of this string, which is valid throughout the lifetime of the program, is stored in the ptr pointer, which is then returned. All is fine.
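A minimal sketch of what "valid throughout the lifetime of the program" means in practice. The names get_string and literal_survives_call are made up for this illustration; the only assumption is a hosted C implementation with <string.h>. Because the literal is a single object with static storage duration, every call hands back the same address, and the characters are still readable after the function has returned.

```c
#include <string.h>

/* Same shape as getString() from the question (name is hypothetical). */
const char *get_string(void)
{
    const char *ptr = "blah blah";  /* ptr is local, the literal is not */
    return ptr;                     /* returns the literal's address, not ptr's */
}

/* Hypothetical check: returns 1 if two calls yield the same address
   and the contents are intact after get_string() has returned. */
int literal_survives_call(void)
{
    const char *first = get_string();
    const char *second = get_string();
    return first == second && strcmp(first, "blah blah") == 0;
}
```

Calling literal_survives_call() from main() should yield 1, since both pointers refer to the one array that the compiler created for the literal.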

Does this mean that "blah blah" is not a local variable inside getString()?

Let me respond with a broken English sentence: yes, it isn't.

However, when you declare an array, as in

const char a[] = "blah blah";

then the compiler doesn't generate a static string. (Indeed, this is a somewhat special case when initializing strings.) It then generates code that will allocate a big enough piece of stack memory for the a array (it's not a pointer!) and will fill it with the bytes of the string. Here a is actually a local variable and returning its address results in undefined behavior.

So...

But I thought that const char *ptr and const char a[] were basically the same thing.

No, not at all, because arrays are not pointers.
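One quick way to see the difference is sizeof. The sketch below uses two made-up helper functions, array_size and pointer_size, purely for illustration: applied to the array, sizeof gives the size of the whole array including the terminating '\0', while applied to the pointer it gives only the size of a pointer, whatever that is on the platform.

```c
#include <stddef.h>

/* Hypothetical helper: size of a local array initialized from a literal. */
size_t array_size(void)
{
    const char a[] = "blah blah";   /* 9 characters + terminating '\0' */
    return sizeof a;                /* size of the whole array: 10 bytes */
}

/* Hypothetical helper: size of a pointer to the same literal. */
size_t pointer_size(void)
{
    const char *ptr = "blah blah";
    return sizeof ptr;              /* size of the pointer itself */
}
```

This matches what gdb printed in the question: &a has type const char (*)[15] for the 14-character string, whereas &ptr has type const char **.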

Daniel Fischer
  • String literals have type `char []`, they are not `const`-qualified, though it would be nice. – effeffe Dec 22 '12 at 16:16
  • @effeffe You're the second person stating this. It must be true. (What a brainless inconsistency!) –  Dec 22 '12 at 16:20
  • It is true. It's a remnant of the early days when there was no `const`. Before the introduction of `const`, the only type string literals _could_ have was `char[N]` (for appropriate `N`). Changing that later would probably have been considered a too breaking change. – Daniel Fischer Dec 22 '12 at 16:24
  • Bah, what a silly typo, thanks for the fix @DanielFischer. (Yes, I see the reasoning, but we had two major revisions of the Standard since K&R, C89 and C99, I don't understand why this change could not be made.) –  Dec 22 '12 at 16:25
  • Quoting C99 rationale: _"However, string literals do not have the type array of const char in order to avoid the problems of pointer type checking, particularly with library functions, since assigning a pointer to const char to a plain pointer to char is not valid."_ – effeffe Dec 22 '12 at 16:26
  • @H2CO3: Beautifully explained. Thank you! :) – laurids Dec 22 '12 at 16:27
  • @effeffe Now I don't get that. Who on Earth writes `char *ptr = "string";`? Is it just me being ignorant, or does the Standard have a flawed logic? –  Dec 22 '12 at 16:36
  • @H2CO3 The point is more `int libfun(char *input);` and `if (libfun("literal") > 2) something();`. Now, if `libfun` doesn't change the string it should have been declared `int libfun(const char *input);`, but it was written before `const` existed. So changing string literals to be `const char[N]` breaks old code. Too much to be done in the three standards so far. Maybe in the next, who knows. – Daniel Fischer Dec 22 '12 at 22:19
  • @DanielFischer couldn't this be simply fixed by changing the signature of `libfun()` to use `const char *`? –  Dec 22 '12 at 22:20
  • @H2CO3 Sure, but get all the old library writers to fix their code ;) It has been done for the functions in the standard library, but that's a small minority of the library functions. – Daniel Fischer Dec 22 '12 at 22:23
  • @DanielFischer Yep, that's something realistic, but it's unfortunate that a bad concept has been introduced in the Standard to cope with technicalities... –  Dec 22 '12 at 22:27
  • Backward compatibility is a good thing. If you standardise a language that has been around (and successful) for as long as C was when the first standard was made, you must make a lot of compromises, or nobody gives a hoot about the standard. You can adhere to pure principles when creating a new language (but if your adherence is too strong, it will be stillborn), but when dealing with an older one, you have to respect "die normative Kraft des Faktischen". – Daniel Fischer Dec 22 '12 at 22:41
6

I guess that if it were a local variable, I would not be able to pass it to my main() function... But if it's not, where is it stored?

String literals are usually stored in a read-only data section (.rodata). The C standard only says that they have static storage duration. Therefore you can return a pointer to such a literal, but the same is not true of a local array.

In the following example, the object pointed to by p1 has static storage duration, whereas the array p2 has automatic storage duration.

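const char *f(void)
{
    const char *p1 = "hello, world"; /* points to a literal with static storage duration */
    char p2[] = "hello, world";      /* local array with automatic storage duration */

    return p1;    /* allowed: the literal outlives the call */
    /* return p2;    forbidden: p2 ceases to exist when f returns */
}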
md5
  • Thank you too for the details ;) – laurids Dec 22 '12 at 16:30
  • Both `p1` and `p2` have automatic storage duration, while the object pointed to by `p1` has static storage duration. (and maybe the function shouldn't be `void` :P) – effeffe Dec 22 '12 at 16:30
2

In your function, the scope of the a[] array is limited to the function getString2(). It is a local array variable.

const char *getString2()
{
  const char a[] = "blah blah blah";
  return a;
}  

In the above case, the string "blah blah blah" is first copied into a[], and with the statement return a you are trying to return that local array, not the constant string.

In the first version, getString(), by contrast, ptr = "blah blah"; makes ptr point to memory that stays valid for the whole run of the program.

const char * getString()
{
  const char * ptr = "blah blah";
  return ptr; 
}

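In this case you return the address of the constant string "blah blah", which is legal.

So it is actually a scope problem; more precisely, a lifetime problem: the array a ceases to exist when getString2() returns, while the string literal exists for the whole run of the program.

It is helpful to learn about the Memory Layout of C Programs and Variable Scope in C.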

Grijesh Chauhan
2

You are right that they are not the same thing. char a[] is an array created on the stack and then filled with "blah..."; inside the function, you essentially have `char a[15]; strcpy(a, "blah blah blah");`

The const char *ptr = "blah blah blah"; on the other hand is simply a pointer (the pointer itself is on the stack), and the pointer points to the string "blah blah blah", which is stored somewhere else [in "read only data" most likely].

You will notice a big difference if you try to alter something (with the const qualifiers removed so that it compiles), e.g. a[2] = 'e'; vs ptr[2] = 'e';. The first one will succeed, because you are modifying a value on the stack, whereas the second will probably fail, because you are modifying a read-only piece of memory, which of course should not work (formally, modifying a string literal is undefined behavior).
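As a rough sketch of the array case, assuming a non-const array and a caller-supplied buffer of at least 15 bytes (the function name edit_copy is invented for this example): the write goes to the private stack copy, and the result is copied out before the array disappears.

```c
#include <string.h>

/* Hypothetical helper: edits a local array copy of a literal and copies
   the result into the caller's buffer `out` (at least 15 bytes). */
void edit_copy(char out[15])
{
    char a[] = "blah blah blah";  /* non-const array: a private, writable copy */

    a[2] = 'e';                   /* fine: modifies the stack copy only */
    memcpy(out, a, sizeof a);     /* hand the result back before a goes away */

    /* The pointer case is deliberately not executed here:
       char *ptr = "blah blah blah"; ptr[2] = 'e';
       is undefined behavior and often crashes on a read-only section. */
}
```

After calling edit_copy(buf) with a char buf[15], buf should hold "bleh blah blah", while the literal itself is untouched.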

Mats Petersson
1

They are not the same.

The first is a pointer to a string literal. The pointer itself is in automatic storage. The string is in static, read-only memory. It's immutable.

The second is an automatic (stack) char array (and that return is, as the warning says, not legal).

Luchian Grigore