Consider this code:

#include <stdio.h>

char *test() {
    return "HELLO";
}

int main() {
    char *p = test();
    printf("%s\n", p);
}

This compiles without warnings, I guess because "HELLO" is not stored on the stack. However, this gives me a warning:

#include <stdio.h>

char *test() {
    char arr[] = "HELLO";
    return arr;
}

int main() {
    char *p = test();
    printf("%s\n", p);
}

My questions are:

  1. Is it true that string literals are stored in an area called the string literal pool?

  2. If so, can the data stored in the string literal pool be considered global?

  3. Is it always safe to return a string literal from a function (since it is kind of global)?

chqrlie
alessio solari
  • Returning the address of a string literal is fine. Returning the address of a local array is not. – Tom Karzes Aug 02 '23 at 09:28
  • You need to first allocate memory inside your function, then return the address of this memory block. When you are done with it, you should free the allocated memory. – binaryescape Aug 02 '23 at 09:29
  • Your `test` function does not return a string literal. It returns the address of the local array `arr`. If you had `char *arr = "HELLO";`, then `arr` would hold the address of the string literal and returning it would be safe. – Jabberwocky Aug 02 '23 at 09:33
  • Does this answer your question? ["Life-time" of a string literal in C](https://stackoverflow.com/questions/9970295/life-time-of-a-string-literal-in-c) – nielsen Aug 02 '23 at 12:18

2 Answers

String literals have static storage duration. That is, they stay alive for the entire execution of the program.

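You can think of the first function

char* test(){
    return "HELLO";
}

as behaving roughly like this (with the difference that modifying the array behind a string literal is undefined behavior):

char* test(){
    static char arr[] = { 'H', 'E', 'L', 'L', 'O', '\0' };
    return arr;
}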

As for the second program

char* test(){
    char arr[] = "HELLO";
    return arr;
}

here the function returns a pointer to the local array arr, which has automatic storage duration and ceases to exist when the function returns. The returned pointer is therefore invalid, and dereferencing it results in undefined behavior.

From the C Standard (6.2.4 Storage durations of objects)

3 An object whose identifier is declared without the storage-class specifier _Thread_local, and either with external or internal linkage or with the storage-class specifier static, has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.

and (6.4.5 String literals)

6 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence.

Vlad from Moscow
It is safe to return a string literal because string literals have static storage duration. However, string literals may be stored in read-only memory, and if the caller tries to modify the returned string, the behavior is undefined. You should therefore make it explicit that the returned string must not be modified by using the const qualifier in the return type:

const char* test(void) {  
    return "HELLO";
}
dimich
  • Maybe you want to add that this applies to the first version of `test()` only. – Ingo Leonhardt Aug 02 '23 at 09:36
  • _"string literals are constant.... You should specify correct return type."_ -- It's a good idea to return `const char*`, but while the storage associated with string literals can't be modified, it isn't `const`. – ad absurdum Aug 02 '23 at 09:56
  • @adabsurdum Could you please clarify your note? The function returns pointer to array of characters which can't be modified. Do you mean the case with `char s[]="string literal"`? – dimich Aug 02 '23 at 10:05
  • My comment was about the _"correct return type"_ for the code in your answer. A return type of `char *` is a correct and perfectly legal return type since string literals are associated with `char []` storage (not `const char []` storage as learners often seem to think). But `const char *` is still a good idea since that storage can't be modified, as you noted. – ad absurdum Aug 02 '23 at 10:12
  • @adabsurdum Updated. Thanks for your notice. – dimich Aug 02 '23 at 10:23
  • @adabsurdum: the C Standard does not mandate a type `const char[]` for string literals for compatibility reasons only, because it would have broken too many existing programs where string literals are passed to functions taking `char *` or whose address is stored, as in the posted question, into plain `char *` pointers. Yet for new code, it is strongly recommended to use `const char *` for pointers to string literals and use a compiler flag to report risky usage such as `-Wwrite-strings` for **gcc** and **clang**. – chqrlie Aug 02 '23 at 10:35
  • @chqrlie -- agreed. I was only commenting to help avoid a potential misunderstanding in the original wording that could lead learners to believe that string literals are backed by `const char` arrays. – ad absurdum Aug 02 '23 at 10:39
  • @adabsurdum: Since modifying them has undefined behavior and they are backed by read-only `char` storage on most modern systems, it does not hurt if learners believe that. Give them some time to apprehend the gory historical details. This misunderstanding is quite benign compared to the ubiquitous confusion between arrays and pointers. – chqrlie Aug 02 '23 at 11:03