
I've been reading The C Programming Language, Second Edition, by Kernighan and Ritchie (K&R). In Appendix A of the book, the reference manual for C, they say:

"A string literal, also called a string constant, is a sequence of characters surrounded by double quotes as in "...". A string has type ``array of characters'' and storage class static and is initialized with the given characters."

This means that any sequence of characters between double quotes has static storage.

Now, in another part of the book, they say:

"Character arrays are a special case of initialization; a string may be used instead of the braces and commas notation: `char pattern[] = "ould";` is a shorthand for the longer but equivalent `char pattern[] = { 'o', 'u', 'l', 'd', '\0' };` In this case, the array size is five (four characters plus the terminating `'\0'`)."

In the shorthand statement, does the string "ould" have static storage or not? Or is it just an array of characters, as they make it out to be?
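As a rough check of the equivalence the book describes, the sketch below builds the array both ways and compares the results. The function name `forms_are_identical` is made up for illustration. It only shows that the two arrays end up identical; it says nothing about whether a separate static copy of "ould" exists anywhere.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if both initializer forms yield identical arrays. */
int forms_are_identical(void) {
    char from_literal[] = "ould";
    char from_braces[]  = { 'o', 'u', 'l', 'd', '\0' };

    /* Both arrays hold four letters plus the terminating '\0'. */
    assert(sizeof from_literal == 5);
    assert(sizeof from_braces == 5);

    /* Compare all five bytes, including the terminator. */
    return memcmp(from_literal, from_braces, sizeof from_literal) == 0;
}
```

Calling `forms_are_identical()` should return 1 on a conforming implementation.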

Later in the book, they also say:

"There is an important difference between these definitions: `char amessage[] = "now is the time"; /* an array */` and `char *pmessage = "now is the time"; /* a pointer */` `amessage` is an array, just big enough to hold the sequence of characters and `'\0'` that initializes it. Individual characters within the array may be changed but `amessage` will always refer to the same storage. On the other hand, `pmessage` is a pointer, initialized to point to a string constant; the pointer may subsequently be modified to point elsewhere, but the result is undefined if you try to modify the string contents."

So, correct me if I am wrong, but what I am getting from all of this is: once a sequence of characters enclosed in double quotes appears in a program, it stays in memory and isn't destroyed until the end of the program. It is also stored in memory as an array of characters, as in `char pattern[] = { 'o', 'u', 'l', 'd', '\0' }`. And when a string appears on the right side of an initialization, as in `char pattern[] = "ould";`, the string is copied character by character into the array, while the string itself also stays in memory until the end of the program, regardless of whether `pattern[]` is destroyed or not.
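To make the array/pointer distinction from the quoted passage concrete, here is a minimal sketch. The function name `array_vs_pointer` is hypothetical, and unlike the book it declares the pointer as `const char *`, so that an accidental write through it is rejected at compile time instead of being undefined behavior.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if each step behaves as the quoted passage describes. */
int array_vs_pointer(void) {
    char amessage[] = "now is the time";       /* array: its own writable copy */
    const char *pmessage = "now is the time";  /* pointer to a string literal */

    amessage[0] = 'N';                         /* fine: the array's characters may change */
    assert(strcmp(amessage, "Now is the time") == 0);

    /* Writing through pmessage is rejected here because of const;
       with a plain char * it would compile but be undefined behavior. */

    pmessage = "another literal";              /* fine: the pointer may point elsewhere */

    return strcmp(pmessage, "another literal") == 0
        && sizeof amessage == 16;              /* 15 characters plus '\0' */
}
```

The write to `amessage[0]` touches only the array; the literal used to initialize it is never modified.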

I am sorry if this is a bit long, but it's been confusing me. I've read other answers on this topic here too, but they don't seem to connect well.

User626468
  • `char pattern[] = "ould";` is strictly equivalent to `char pattern[] = { 'o', 'u', 'l', 'd', '\0' };`. There is __no difference__ whatsoever. – Jabberwocky Dec 18 '20 at 15:15
  • _Once a sequence of characters enclosed in double quotes appear in a program, they stay in memory and aren't destroyed till the end of the program_. Yes. Unless you have `char pattern[] = { 'o', 'u', 'l', 'd', '\0' }`. Then you can modify the `pattern` array, e.g. `pattern[0] = 'A';` – Jabberwocky Dec 18 '20 at 15:16
  • You might be interested in one of my early questions too: https://stackoverflow.com/questions/30533439/string-literals-vs-array-of-char-when-initializing-a-pointer – Eugene Sh. Dec 18 '20 at 15:21
  • Strictly speaking, the string literal should also be stored, but since there is no way to access the string literal after initialization of the array, the implementation can behave "as if" the literal was stored by not bothering to store it at all. – Ian Abbott Dec 18 '20 at 15:23
  • In the case of `char pattern[] = "ould";` that is an initialiser, not a string literal because you don't have access to it: the initialising data is copied into the array just like it is with `char pattern[] = { 'o', 'u', 'l', 'd', '\0' };` and because it is not a `const` array it can be modified. OTOH `char *pattern = "ould";` defines a pointer variable that points to a string literal, and the string data can't be modified. – Weather Vane Dec 18 '20 at 15:23
  • `"when a string appears on the right side of an assignment, like in char pattern[] = "ould";,"` Nope. That is initialization, not assignment. – stark Dec 18 '20 at 15:33
  • @WeatherVane true, and especially for short strings the initialisation is often done by a couple of MOV instructions rather than a full-fledged memcpy/strcpy; in that case the initialisation string exists nowhere in memory. – Jabberwocky Dec 18 '20 at 15:40
  • @Jabberwocky The initializations are equivalent because `'o'`, `'u'`, `'l'` and `'d'` are part of the basic character set and therefore single-byte, but if the string contained any non-single-byte characters, the initializations would not be equivalent. The array initialized by the string literal would be longer. (Hypothetical example: `char pattern[] = "❄️";` versus `char pattern[] = { '', '', '❄️', '\0' };`.) – Ian Abbott Dec 18 '20 at 15:43
  • @IanAbbott Cool `char`s you have there. I still haven't figured out how you guys are doing this :) – Eugene Sh. Dec 18 '20 at 15:53
  • @EugeneSh. I cut-n-pasted them from Emojipedia! – Ian Abbott Dec 18 '20 at 15:55
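
The point about multi-byte characters raised in the comments can be sketched without depending on the source character set. The example below assumes a two-byte UTF-8 sequence (the bytes C3 A9) written with hex escapes; the function name `multibyte_array_size` is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size of an array initialized from a literal
   that holds one character encoded as two bytes. */
size_t multibyte_array_size(void) {
    /* The hex escapes spell the two encoded bytes explicitly, so the result
       does not depend on the source or execution character set. */
    char pattern[] = "\xC3\xA9";

    /* The terminating '\0' follows the two encoded bytes. */
    assert(pattern[2] == '\0');

    return sizeof pattern;
}
```

Here the array has three elements for what a reader sees as a single character, so a brace-enclosed list with one character constant and a `'\0'` would not produce the same array.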

1 Answer


_Once a sequence of characters enclosed in double quotes appear in a program, they stay in memory and aren't destroyed till the end of the program._

More or less. What you may not realize is that `char message[] = "foo";` is not an assignment. It is an initialization.

So when you have:

int func(void) {
    char message[] = "foo";  /* automatic array holding 'f', 'o', 'o', '\0' */
    ...
}

`message` is a character array with automatic storage duration that is initialized with 4 characters. In fact, it is syntactic sugar for `char message[] = {'f', 'o', 'o', '\0'};`.
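
One way to see that the array is a fresh automatic object is to modify it and call the function again. This is only a sketch; `starts_as_foo` is a made-up name, not something from the question.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: overwrites its local array, then reports
   whether the array held "foo" at the start of this call. */
int starts_as_foo(void) {
    char message[] = "foo";                   /* initialized anew on every call */
    int fresh = strcmp(message, "foo") == 0;

    message[0] = 'b';                         /* changes only this call's array */
    assert(strcmp(message, "boo") == 0);

    return fresh;
}
```

Every call should return 1: the change made in one call does not survive into the next, because each call gets a newly initialized array.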

A static copy of `"foo"` may exist, but it is private to the compiler: whether it is stored at all, and where, is up to the implementation. As a programmer you have no access to it, and you do not need any.

If you want to use the static array itself, you must store its address in a pointer:

const char *pt = "foo";

The pointer itself may be automatic, but the array it points to is static, so the pointer value may safely be returned from a function.
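
A short sketch of that last point, with hypothetical function names `greeting` and `local_copy_length`: returning the address of the literal is fine, while an automatic array must only be used while its function is still running.

```c
#include <assert.h>
#include <string.h>

/* Safe: the literal has static storage duration, so it outlives the call. */
const char *greeting(void) {
    const char *pt = "foo";  /* pt is automatic; the array it points to is static */
    return pt;
}

/* The array below is automatic and ceases to exist when the function returns,
   so returning its address would be a bug. Here it is only used while alive. */
size_t local_copy_length(void) {
    char message[] = "foo";
    assert(sizeof message == 4);  /* three letters plus '\0' */
    return strlen(message);
}
```

A caller can keep using the pointer returned by `greeting()` for the rest of the program, but must not write through it.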

Hope I have not added to the confusion...

Serge Ballesta