
How are string literals compiled in C? As I understand it, in test1 the compiler puts the string "hello" in the data segment, and the second line assigns that hard-coded virtual address to p. Is this correct? And is it also correct that there is no fundamental difference between how test1 and test2 work?

Some code:

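#include <stdio.h>
#include <string.h>   /* strcpy */

void test1(void);
void test2(void);
void test3(void);

int main(void)
{
    test1();
    test2();
    //test3();
    return 0;
}

void test1(void)
{
    char *p;
    p="hello";          /* assignment: p points at the literal */
}

void test2(void)
{
    char *p="hello";    /* initialization: p points at the literal */
}

void test3(void)
{
    char *p;            /* uninitialized: p does not point at any storage */
    strcpy(p,"hello");  /* writes through an invalid pointer */
}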

Any reference to the C standard would be greatly appreciated, so that I can understand this in depth from the compiler's point of view.

Shahbaz
xyz
  • in both test1 and test2, p should be a `const char *`. test3 crashes because p doesn't point at anything – τεκ Jul 13 '11 at 14:50
  • The most current draft of the C99 Standard is at http://www.open-std.org/JTC1/sc22/wg14/www/docs/n1256.pdf see section 6.4.5 for "string literals" – pmg Jul 13 '11 at 14:55
  • Why don't you look at the output of the compiler, e.g. with `gcc -S`? – Edgar Bonet Jul 13 '11 at 19:57
  • possible duplicate of [C String literals: Where do they go?](http://stackoverflow.com/questions/2589949/c-string-literals-where-do-they-go) – outis May 11 '12 at 07:35

2 Answers


From the C standard's point of view, there is no particular requirement about where a string literal is placed. About the only requirements on the storage of string literals are in C99 6.4.5/5 and 6.4.5/6, "String literals":

  • "an array of static storage duration and length just sufficient to contain the sequence", which means that the literal's lifetime is the entire execution of the program.
  • "It is unspecified whether these arrays are distinct provided their elements have the appropriate values", which means the various "hello" literals in your example may or may not have the same address. You can't count on either behavior.
  • "If the program attempts to modify such an array, the behavior is undefined", which means that you must not modify the string literal. On many platforms this is enforced (the program crashes if you try). On others, the change may appear to work, so you can't count on the bug being readily evident.
Michael Burr

Your understanding is correct: the data for "hello" will typically be put in a read-only (RO) segment, and its relative virtual address will be assigned to the pointers in the testX() functions.

However, these are compiler-specific details; the C standard doesn't care about them.

EDIT: Regarding test3(), see τεκ's comment.

Assaf Levy
  • Could you please elaborate a bit on "relative virtual address"? How is this different from "virtual address", and what are these two anyway? Some examples would help. I think this touches upon some fundamental things. – xyz Jul 13 '11 at 14:59
  • Here's a nicely laid-out article, almost 10 years old but still relevant and informative: [Win32 Executable Format](http://msdn.microsoft.com/en-us/magazine/bb985992.aspx). It's not a long read and will give you an important understanding of such things. – Assaf Levy Jul 13 '11 at 15:20