Yet another "return-local-addr" - "function returns address of local variable"

Question

I know this is a very, very common question. I have read e.g. this, that, and this too. I used to think that returning the address of a local variable is a very bad idea, and that it is better to:

allocate memory via e.g. malloc
make the local variable static or
pass the pointer as an argument

but then I tried this:

#include <stdio.h>
char *  foo_1();
char ** foo_2();

int main() {

    char * p_1 = foo_1();
    char ** p_2 = foo_2();

    printf("\n [%s] \n", p_1);
    printf("\n [%s] \n", *p_2);

    return 0;
}

char * foo_1() {
    char * p = "bar";
    return p;
}

char ** foo_2() {
    char * p = "bar";
    return &p;
}

I compile with -pedantic -pedantic-errors and get the expected warning: function returns address of local variable [-Wreturn-local-addr]

but only for foo_2()! foo_1() works fine. Does anyone know why, and whether this is undefined behavior?

score 4 · Accepted Answer · answered Mar 04 '15 at 21:08


char * foo_1() {
    char * p = "bar";
    return p;
}

Here you are not returning the address of a local object; you are returning the value of a pointer that points to a string literal. String literals have static storage duration, so it is fine to return a pointer to one. When foo_1 returns, the object p is destroyed (automatic storage duration), but "bar" is not (static storage duration).
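To make the lifetime point concrete, here is a minimal sketch. The names get_literal, clobber_stack and demo_literal are made up for illustration and are not from the question; the idea is only that the pointer stays usable after the function has returned and other calls have reused the stack.

```c
#include <assert.h>
#include <string.h>

/* Same shape as foo_1 from the question: the local pointer p is automatic,
   but the array behind "bar" has static storage duration. */
char *get_literal(void) {
    char *p = "bar";
    return p;            /* returns a copy of the pointer value, not &p */
}

/* Hypothetical helper that reuses stack space between the call and the use. */
size_t clobber_stack(void) {
    char scratch[64];
    memset(scratch, 'x', sizeof scratch - 1);
    scratch[sizeof scratch - 1] = '\0';
    return strlen(scratch);
}

/* Usage sketch: the pointer is still valid after other calls have run. */
void demo_literal(void) {
    char *s = get_literal();
    size_t n = clobber_stack();
    assert(n == 63);
    assert(strcmp(s, "bar") == 0);
}
```

Calling demo_literal() from main should pass both assertions, since nothing that happens on the stack can affect the literal's storage.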

ouah

I don't think that is standard behavior though. clang does it, but IIRC at least some versions of gcc instantiate local strings on each function call. – technosaurus Mar 04 '15 at 21:15

@technosaurus it is standard behavior. (C11, 6.4.5p6) "[...] The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence." – ouah Mar 04 '15 at 21:17

score 3 · Answer 2 · answered Mar 04 '15 at 21:10


In foo_2() you are returning the address of the local variable p, which is why you get the warning.

In foo_1() you are returning the address of the string literal, which is fine since it has static storage duration.

Tafuri

score 2 · Answer 3 · answered Mar 04 '15 at 21:18

In foo_2(), you're returning the address of a (non-static) local variable. That address becomes indeterminate when the function returns, because the pointed-to object no longer exists -- thus the warning.

In foo_1(), you're returning the value of a local variable. There's no problem at all doing that; it's no worse than:

int foo_3(void) {
    int local = 42;
    return local;
}

which returns the value of local.

In foo_1(), since the variable whose value you're returning happens to be a pointer, you could still invoke undefined behavior if that value were questionable. For example:

char *foo_1a(void) {
    char arr[] = "bar";
    char *p = arr; // or equivalently, &arr[0]
    return p;
}

Here you're still returning the value of a local variable (which is fine), but that value happens to be the address of another local variable, so the returned pointer value becomes invalid as soon as the function returns.

A compiler is less likely to warn about foo_1a than about your foo_2, because it's less likely to be able to determine that the value of p when the return statement is executed is problematic. In fact the language does not require a diagnostic for this kind of thing. Compilers can do a reasonably good job of detecting and warning about some but not all instances of undefined behavior.

Bottom line: Your foo_1() function is well behaved. The pointer value it returns is the address of a string literal, which has static storage duration (i.e., it exists for the entire lifetime of the program).

However, since modifying that static array has undefined behavior, it would be wise to return the address as a const char* rather than as a char*, so the caller is less likely to attempt to modify the string literal. The const also serves as documentation for any human readers that the pointed-to value is not to be modified.
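A minimal sketch of that suggestion follows. The names foo_1_const and demo_const are hypothetical, chosen only for this example; the point is that a caller who wants to change the text copies it into storage of its own first.

```c
#include <assert.h>
#include <string.h>

/* const-qualified variant of foo_1 (the name foo_1_const is hypothetical). */
const char *foo_1_const(void) {
    return "bar";
}

/* If the caller needs a modifiable string, it copies into its own storage. */
void demo_const(void) {
    const char *s = foo_1_const();
    char copy[4];                 /* 3 characters plus the terminating '\0' */
    strcpy(copy, s);
    copy[0] = 'c';                /* fine: modifies the copy, not the literal */
    assert(strcmp(copy, "car") == 0);
    assert(strcmp(s, "bar") == 0);
    /* s[0] = 'c'; would draw a compiler diagnostic: s points to const char */
}
```

With the const in place, an accidental write through the returned pointer is caught at compile time instead of becoming undefined behavior at run time.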

score 0 · Answer 4 · answered Mar 04 '15 at 21:11

char * p_1 = foo_1();

is OK. Dereferencing p_1 is not a problem as long as you don't modify anything that p_1 points to, since foo_1 returns a pointer to a string literal, which is typically stored in a read-only section of the program.

char ** p_2 = foo_2();

is not OK. Dereferencing p_2 causes undefined behavior, since p_2 points to an object whose lifetime ended when foo_2 returned.
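For comparison, the three alternatives listed in the question (malloc, a static local, or a pointer passed in by the caller) could look roughly like this. It is only a sketch: the function names make_bar_heap, get_bar_static and fill_bar are invented for illustration, and the error-handling conventions are assumptions rather than anything from the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Option 1: heap allocation; the caller must free() the result. */
char *make_bar_heap(void) {
    char *p = malloc(sizeof "bar");   /* 4 bytes, including '\0' */
    if (p != NULL)
        strcpy(p, "bar");
    return p;
}

/* Option 2: static local; one shared object that lives for the whole program. */
char **get_bar_static(void) {
    static char *p = "bar";
    return &p;   /* address of a static object: still valid after return */
}

/* Option 3: caller-provided storage. Returns 0 on success, -1 if too small. */
int fill_bar(char *buf, size_t size) {
    assert(buf != NULL);
    if (size < sizeof "bar")
        return -1;
    strcpy(buf, "bar");
    return 0;
}
```

get_bar_static has the same signature as foo_2 but returns the address of an object that outlives the call, so dereferencing the result is fine. The trade-off is that every caller shares the same object.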
