I'm currently in a C class, and am very confused as to where string literals are stored. I understand that a string is just an array of chars, so something like

char c[5] = {'A','B','C','D', 0};

is equivalent to

char* c = "ABCD";

EDIT: Follow up question:

char c[5] = {'A','B','C','D', 0};

If I now say c+1, is this a pointer to the character 'B'? And is 'B' on the stack, or in the code section of memory?

But I always get confused as to what it means when we say that STRING LITERALS ARE STORED IN THE CODE SECTION OF MEMORY.

I understand the difference between the stack and the heap, but I can't grasp the idea of the code section of memory.

For instance, I get that, in the examples above, c is just a stack variable. Fine. But what if I said c[0]? Is this stored in the code section of memory? Or, in the second example I gave, with char * c = "ABCD", c itself is a stack variable, but it points to chars in the code section of memory?

I am thoroughly confused, and any insight would be greatly appreciated.

Thanks

Evan
  • Is this article helpful? [c - Why do I get a segmentation fault when writing to a string initialized with "char *s" but not "char s\[\]"? - Stack Overflow](https://stackoverflow.com/questions/164194/why-do-i-get-a-segmentation-fault-when-writing-to-a-string-initialized-with-cha) – MikeCAT Dec 13 '19 at 17:40
  • The two examples are _not_ equivalent, contrary to your claim: the second involves a string _constant_ (likely to be stored in the code section), while the first is an initialized array. – Ctx Dec 13 '19 at 17:41
  • If you declare `char c[5] = {'A','B','C','D', 0};` inside a function (e.g. main()), it will be on the stack. With `char* c = "ABCD";`, "ABCD" is stored in the compiled program itself, but the address of "ABCD" is stored on the stack in the variable `c` (if you declare this variable inside a function). – Volodymyr Dec 13 '19 at 17:42
  • Hmm Vladimir yes that makes sense, so what if I declare char c[5] = {'A', 'B', 'C', 'D', 0}; what would c+1 be pointing to? Is the 'B' that it's pointing to in the stack? or in the code? – Evan Dec 13 '19 at 17:44
  • `c` is stored on the stack. `c+1` gives you the address of the second item, 'B'. That address is on the stack because the array `c` is on the stack. – Volodymyr Dec 13 '19 at 17:46

2 Answers

is equivalent to

Functionally equivalent in that both will be stored in memory somewhere, but the types are not the same (one is an array, the other is a pointer), and where they end up is (typically) not the same place.

But what if I said c[0]? Is this stored in the code section of memory?

c[0] is just an expression that refers to the first character in the array (or, for the pointer, the first character it points to). The expression itself is not stored anywhere; the character it designates is. If the array ends up on the stack, for instance, then c[0] refers to a character on the stack.

what it means when we say that STRING LITERALS ARE STORED IN THE CODE SECTION OF MEMORY.

A binary is (typically) composed of several sections. One of them is the one that contains the code (the instructions your CPU will run). There are other sections that can be used to store string literals. How all that works depends on your architecture and your operating system.

Acorn
  • Thanks for the response. So, to rephrase my question, if I have char c[5] = {'A','B','C','D', 0}; is the character that c[0] represents stored in the code section of memory? Or on the stack? – Evan Dec 13 '19 at 17:47
  • @Evan You're welcome! As the answer explains, it depends. If `c` is placed in the stack, for instance, then `c[0]` is a byte in the stack (the first one reserved for the `c` array). – Acorn Dec 13 '19 at 17:49
  • Hmm, well, what do you mean by "if c is placed in the stack"? When would c not be placed on the stack? – Evan Dec 13 '19 at 17:55
  • https://stackoverflow.com/questions/14588767/where-in-memory-are-my-variables-stored-in-c/14588866 – Volodymyr Dec 13 '19 at 17:56
  • @Evan Well, if it is a global, it won't be in the stack. But even that is a detail of your compiler, not of C itself. – Acorn Dec 13 '19 at 17:58

… something like

char c[5] = {'A','B','C','D', 0};

is equivalent to

char* c = "ABCD";

No. The first defines c to be an array of 5 char which will be initialized with the values shown. If this declaration appears outside a function, c will have static storage duration. It will be initialized once when the program begins and will exist for the entire execution of the program. In common C implementations, it will be stored in an initialized-data section. If this declaration appears inside a function, c will have automatic storage duration. It will be initialized each time execution reaches the declaration. How the initial values are stored is up to the C implementation—they might be built into instructions that initialize the array, or they might be stored in a constant-data section so the program can copy them from there to the new instance of c whenever one is created.

The second defines c to be a pointer to a char. The string literal nominally defines an array of static storage duration. In common C implementations, that array, if it is actually needed (optimization could make it unnecessary), will be stored in a constant-data section. c is initialized to point to the first character of that array. If this declaration appears outside a function, c has static storage duration, so it is initialized once when the program starts. If it appears inside a function, c has automatic storage duration, and it is created and initialized each time execution reaches the declaration. In either case, it is initialized to point to the first character of the array defined by the string literal.

EDIT: Follow up question:

char c[5] = {'A','B','C','D', 0};

if I now say c+1, this is a pointer to the character 'B'? But is 'B' in the stack? or in the code section of memory?

c+1 points to c[1]. If c is defined outside of any function, it is, in common C implementations, in some section the compiler/program uses for static data, so c+1 points into that section. If c is defined inside a function, it is, in common C implementations, on the stack, so c+1 points into the stack. (Note that optimization may make it unnecessary to store all of c or to maintain c+1 as a pointer at all, depending on the context.)

But I always get confused as to what it means when we say that STRING LITERALS ARE STORED IN THE CODE SECTION OF MEMORY.

A string literal defines an array of static storage duration. In common C implementations, it will be stored in a constant-data section. This is different from a code section. Both are read-only, but a code section is executable. Some computer architectures do not have means of distinguishing them, and it is certainly possible to store read-only data in a code section, but it is generally better for there to be separate sections.

As with all things in C, the nominal meaning of source code is just how it must behave in an abstract machine the C standard defines. Compilers are allowed to optimize as long as the resulting program has the same behavior in terms of observable effects, which include visible input and output. Optimization might result in a string that has static storage duration in the abstract machine being very different in the actual program.

Eric Postpischil