
These two forms of the same variable, when defined inside a function block, should, I would think, have identical scope, i.e. the function block {...} in which they are defined:

char str1[] = "int_1 < int_2";

char *str1 = "int_1 < int_2";  

But my observation is that the char * lives beyond function scope, while the char [] ceases to exist. The symbol name str1 in both cases points to the location in memory where the variable is created, so why does one seem to live beyond the function, while the other does not? The following code can be used to test this behavior: (Changing #define from 0 to 1 selects one form over the other for illustration.)

Note also that although the static modifier could be used to change the storage duration, it is purposely not used here, so the behavior can be observed without it.

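#include <stdio.h>   // printf, getchar
#include <stdlib.h>  // srand, rand
#include <time.h>    // clock, CLOCKS_PER_SEC

#define DO (1)  //define as either 1 or 0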

char * compare_int(int x1, int x2);

int main(void)
{
    int a = 0;
    int b = 0;
    int c = '\n';

    srand(clock()/CLOCKS_PER_SEC);

    while(c != 'q')
    {
        a = rand()%3;
        b = rand()%3;
        printf("%s\n( enter 'q' to exit. )\n\n", compare_int(a, b));
        c = getchar();
    }
    return 0;
}

char * compare_int(int x, int y) 
{
    printf("%d    %d\n", x, y);

#if(DO)
    char str1[] = "int_1 < int_2";
    char str2[] = "int_1 == int_2";    
    char str3[] = "int_1 > int_2";
#else
    char *str1 = "int_1 < int_2";
    char *str2 = "int_1 == int_2";    
    char *str3 = "int_1 > int_2";
#endif  

    return x < y ? (str1) : x == y ? (str2) : (str3);

}

I have read this, and it does answer some key parts to this question, but comments on any UB in my code, and/or references to C99 or newer standard pointing to paragraph(s) that make the distinctions between these two forms would also be appreciated.

ryyker
  • How is the provided code an example that pointer and array would reside in different storage/ belong to different scopes? You are an experienced C user. How do these concerns come up? – RobertS supports Monica Cellio Jun 17 '20 at 16:09
  • @RobertSsupportsMonicaCellio - for your 2nd comment, by observation. The pointer lives consistently beyond the life of the function when called in the `printf()` statement in `main()`, the `char []` does not. It is the reason why this is true that I was after, and I was interested in whether UB was involved in my observations. Regarding "_You are an experienced C user. How do these concerns come up_", LOL, I have some experience, but have a long way to go before arriving anywhere near being a perfect C programmer. – ryyker Jun 17 '20 at 16:12
  • "*The pointer lives consistently beyond the life of the function...*" - No, it doesn't. What is returned is just the address of the first element of the string literal, which itself exists until program termination. It doesn't mean the object of `strN` itself is still alive. – RobertS supports Monica Cellio Jun 17 '20 at 16:21
  • @RobertSsupportsMonicaCellio - well stated clear distinction. Thanks. (some of your observations would add to the answer content here) – ryyker Jun 17 '20 at 16:24
  • I didn't quite understand what exactly you were asking because I was confused about which observations you meant. That is why I didn't write an answer. I thought you might have found some wicked memory hack. – RobertS supports Monica Cellio Jun 17 '20 at 16:38
  • Related: [What is the difference between `char s[]` and `char *s`?](https://stackoverflow.com/questions/1704407/what-is-the-difference-between-char-s-and-char-s) – RobertS supports Monica Cellio Jun 17 '20 at 16:39
  • Perfection is subjective. ;-) – RobertS supports Monica Cellio Jun 17 '20 at 17:49

5 Answers

6

This:

char str1[] = "int_1 < int_2";

Defines an array initialized with the given string literal. If you return str1, because the array name decays to a pointer to its first element, you're returning a pointer to a local variable. That variable's lifetime ends when the function returns, and attempting to subsequently use that address invokes undefined behavior.

This is documented in section 6.2.4p2 of the C standard:

The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

In contrast, this:

char *str1 = "int_1 < int_2";  

Defines a pointer which is initialized with the address of a string literal. String literals have static storage duration, i.e. they live for the entire program, so reading through a pointer to one is safe. When you return str1 in this case, you're returning the value of str1 (not its address), which is the address of the string literal.

The lifetime of string literals is specified in section 6.4.5p6 of the C standard:

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence.

And static storage duration is defined in section 6.2.4p3:

An object whose identifier is declared without the storage-class specifier _Thread_local, and either with external or internal linkage or with the storage-class specifier static, has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.
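
As a small illustration of the pointer case, here is a minimal sketch. The function name describe_order and its local names are hypothetical and not part of the question's code; the sketch only assumes that each pointer is initialized from a string literal, as in the #else branch of compare_int().

```c
/* Hypothetical helper mirroring the #else branch of compare_int().
   Each local pointer holds the address of a string literal, and the
   array behind a string literal has static storage duration. */
const char *describe_order(int x, int y)
{
    const char *less    = "int_1 < int_2";
    const char *equal   = "int_1 == int_2";
    const char *greater = "int_1 > int_2";

    /* The three pointer objects stop existing when the function returns.
       What is returned is a copy of one pointer's value, i.e. the address
       of a literal that stays valid until the program ends. */
    return x < y ? less : x == y ? equal : greater;
}
```

Calling describe_order(1, 2) and passing the result to printf("%s\n", ...) is well defined, because the returned address never refers to the function's own locals. Declaring the pointers as const char * is a design choice of this sketch: it lets the compiler reject accidental writes to the literal.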

dbush
2

In these declarations with automatic storage duration within a function

char str1[] = "int_1 < int_2";

char *str1 = "int_1 < int_2";

both identifiers have the same function scope and are not alive outside the function.

That is, the memory occupied by the array and by the pointer themselves will not be valid after exiting the function. For example, it can be overwritten.

The difference is that the pointer str1 points to a string literal that has static storage duration. So you may return the pointer from the function, because the string literal will still be alive and the returned pointer will point to it.

As for the array str1, it is initialized by the string literal (by copying the elements of the string literal into its own elements), but the array itself has automatic storage duration. So you may not use the array designator as a return expression, because the returned pointer will be invalid: the array will not be alive after exiting the function.
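
If an array-backed result is wanted without static, one common alternative is to let the caller supply the storage. The following is only a sketch under that assumption; compare_into and its parameters are hypothetical names, not something taken from the question.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical variant of compare_int(): the caller owns the buffer,
   so the text outlives this function's own automatic variables. */
char *compare_into(char *buf, size_t size, int x, int y)
{
    const char *msg = x < y  ? "int_1 < int_2"
                    : x == y ? "int_1 == int_2"
                             : "int_1 > int_2";

    /* snprintf writes at most size bytes including the terminator,
       truncating the message if the buffer is too small. */
    snprintf(buf, size, "%s", msg);
    return buf;
}
```

A caller would declare something like char out[32]; and pass out and sizeof out. The returned pointer is then valid for as long as out itself is alive.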

Vlad from Moscow
  • These two statements seem to disagree with each other: _both identifiers have the same function scope and are not alive outside the function_., and _the pointer str1 points to a string literal that have static storage duration_. Does `static` not mean life of program? – ryyker Jun 17 '20 at 16:05
  • @ryyker The identifier that denotes the pointer has automatic storage duration. The function returns a copy of its value. But the string literal has static storage duration. So the returned value is valid because it is the address of an object that is still alive. – Vlad from Moscow Jun 17 '20 at 16:07
  • Then you did not really mean to say _both identifiers have the same function scope and are not alive outside the function_? – ryyker Jun 17 '20 at 16:08
  • @ryyker You are wrong. The two objects, the pointer and the array, are local variables that have block scope within the function and automatic storage duration. – Vlad from Moscow Jun 17 '20 at 16:10
  • It was a question - not a statement. I just do not understand what seem to be contradictions in your statements. – ryyker Jun 17 '20 at 16:28
  • @ryyker All local variables with automatic storage duration have a lifetime that ends when their scope is exited. The pointer and the array in your example have the same scope. So after exiting the function they are not alive. So there is no difference between them in this aspect. – Vlad from Moscow Jun 17 '20 at 16:33
1

If you return a pointer from a called function, you don't return a reference to the pointer itself.

Instead, the value of the pointer (the address of the first element of the string literal assigned to it, here for example "int_1 < int_2") is returned by value; the pointer itself is not returned by reference.

The string literal itself exists until the program terminates (and typically resides in read-only memory).


In fact, both the pointer to char (char *) and the array of char (char[]) have the same storage class, auto, and are visible only inside the function compare_int (they have function-local scope).

After the function has been executed once, they both no longer exist (in memory) and thus also aren't visible anymore.

The value used in the printf() call is actually the address of the first element of the string literal passed by value. It has nothing to do with the pointer in the called function, here strN.

The string literal is not bound to a specific pointer.

Had they been declared with the storage-class specifier static, their objects would keep existing in memory until program termination, would retain their values across function calls, and would be accessible anywhere a pointer to them has been passed to the caller(s).

But even then, the returned pointer is not a reference to the pointer itself; its value, the address of the first element of the string literal, is returned by value.
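
To make the static case concrete, here is a minimal sketch. The name compare_static and the buffer size are assumptions chosen for illustration; the question deliberately avoids static, so this is not the asker's code.

```c
#include <stdio.h>

/* Hypothetical variant of compare_int() using a static array. The array
   has static storage duration, so a pointer to it stays valid after the
   function returns. */
char *compare_static(int x, int y)
{
    static char result[32];  /* one object for the whole program run */

    snprintf(result, sizeof result, "int_1 %s int_2",
             x < y ? "<" : x == y ? "==" : ">");
    return result;
}
```

The trade-off is that every call returns the address of the same array and overwrites its previous contents, so the result of an earlier call changes under the caller's feet, and the function is not safe to use from several threads at once.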


You can picture this more easily if you think of the pointer in the called function as a "holder" or, even better, a "delivery person", like the friendly one who delivers your goods from Amazon. S/he holds the address only for a certain amount of time, and then hands the value to another person.

The same thing happens when the address value is returned from compare_int. The pointer strN in the called function hands the address value to the caller, where it is used as an argument to printf().

0

With char[] your local variable, allocated on the stack, is an array of characters. What you return from the function is a pointer to this local variable. The local variable, however, is gone as soon as you return from the function, and the pointer you return keeps pointing to a place on the stack which sooner or later is overwritten by some other stuff.

With char * your local variable is just a pointer, pointing to a constant which lives forever. You return a copy of this pointer, which is perfectly ok.

Returning a pointer to a local variable on the stack is the kind of bug that will drive you crazy, because at first the returned string may appear to be ok, but some random time later it will be overwritten. You'll keep debugging the code that you suspect of destroying the string, unaware that the bug was introduced much earlier, while creating the string.

C.B.
0

So, I executed your code in CodeBlocks IDE and here's what I found:-

CASE 1

In compare_int(int x, int y), when x = 2 and y = 0, before the declarations of the str1, str2, and str3 character arrays (char []) are executed, the arrays contain garbage values. They are then created locally in the function at some memory addresses, and values are assigned to them.

Then, according to the condition, the address of str3 is returned to the calling function as a char *. But when printf("%s\n( enter 'q' to exit. )\n\n", compare_int(a, b)); tries to read that address via the char * returned by compare_int(), it finds garbage there. That means the str1[], str2[], and str3[] arrays are local to that function and don't persist outside the compare_int() function.

CASE 0

When str1, str2, and str3 are declared as char * pointing to string literals, their value is returned by the function and successfully read by printf(), which means the string literals persist between different function calls.

Conclusion:

According to this behaviour, I can say that in case 0 the string literals are stored in a permanent storage area, and a local char *, which is stored on the stack, points to the string literal. When we return this char *, the function returns its value, which is the address of that string literal; that's why it is printed successfully. On the other hand, in case 1 the char [] arrays are stored on the stack and are local to their function, and their value doesn't persist between different function calls. That's why they aren't accessible from main().
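
The distinction between the two forms can also be checked from inside a single function, without relying on how a debugger displays stale stack memory. This is a sketch only; array_is_private_copy is a hypothetical name introduced for illustration.

```c
#include <string.h>

/* Hypothetical check: the char[] form is a separate, writable copy of
   the literal's characters, while the char * form only refers to the
   literal. Returns 1 if all three observations hold. */
int array_is_private_copy(void)
{
    char arr[] = "int_1 < int_2";        /* 13 characters plus '\0' */
    const char *lit = "int_1 < int_2";   /* address of a string literal */

    arr[0] = 'I';  /* allowed: changes only the local array */

    return sizeof arr == 14
        && strcmp(arr, "Int_1 < int_2") == 0
        && strcmp(lit, "int_1 < int_2") == 0;
}
```

Writing through lit instead would be an attempt to modify a string literal, which is undefined behavior; that is why the sketch declares it const.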

Shubham