1

I was going through this Q&A, which says that a char array initialized with a string literal causes two memory allocations: one for the variable and one for the string literal.

I wrote the program below to see how the memory is allocated.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char a[] = "123454321";

    /* %p expects a void * argument */
    printf("a =%p and &a = %p\n", (void *)a, (void *)&a);

    for (size_t i = 0; i < strlen(a); i++)
        printf("&a[%zu] =%p and a[%zu] = %c\n", i, (void *)&a[i], i, a[i]);
    
    return 0;
}

and the output is:

a =0x7ffdae87858e and &a = 0x7ffdae87858e
&a[0] =0x7ffdae87858e and a[0] = 1
&a[1] =0x7ffdae87858f and a[1] = 2
&a[2] =0x7ffdae878590 and a[2] = 3
&a[3] =0x7ffdae878591 and a[3] = 4
&a[4] =0x7ffdae878592 and a[4] = 5
&a[5] =0x7ffdae878593 and a[5] = 4
&a[6] =0x7ffdae878594 and a[6] = 3
&a[7] =0x7ffdae878595 and a[7] = 2
&a[8] =0x7ffdae878596 and a[8] = 1

From the output, it does not look like there are two separate memory locations for the array and the string literal.

If the array and the string literal do have separate memory, is there any way to prove that they are stored separately in this scenario?

link to clone: https://onlinegdb.com/HkJhdSHyd

IrAM
  • What do you mean by "2 separate storages" exactly? All we have is one array whose contents are the string "123454321". – David Schwartz Jan 20 '21 at 06:09
  • @DavidSchwartz, please refer the answer in the link provided, I was confused with answer, so asked a separate question – IrAM Jan 20 '21 at 06:12
  • Define "allocations". There is one variable `a[]` initialized with a constant string. – dxiv Jan 20 '21 at 06:13
  • All of the answers below are correct in their own way, so shall I just conclude that **it's up to the compiler how it stores the string literal (it can make a copy, emit its own code, or use the same memory as the array)**? – IrAM Jan 23 '21 at 10:46
  • @IrAM First two would be the common implementations, the last one "*or it may use same memory as that of array*" is very unlikely. The array `char a[] = "...";` is an automatic variable, and it would usually be allocated on the stack, which is eminently dynamic and can not be statically initialized. In this case however the compiler *could* maybe determine that `main` is called only once, and the array is accessed only once, so it could technically generate code equivalent to `static char a[] = "...";` but, again, that would be least likely, and not possible at all except in trivial cases. – dxiv Jan 23 '21 at 18:38

4 Answers

4

char a[] = "123454321";

Technically, the string literal "123454321" is not required to be stored anywhere as such. All that's required is that a[] be initialized with the right values when main is entered. Whether that's done by copying the string from some static read-only memory location, or running code that fills it in some other way is not mandated by the standard.

As far as the standard goes, it would be perfectly acceptable for the compiler to emit code equivalent to the following in order to initialize a[]:

char a[10];
for(int n = 0; n <= 4; n++)
    a[n] = a[8-n] = '1' + n;
a[9] = '\0';

In fact, at least one compiler (gcc) initializes a[] via custom code, rather than storing and copying the literal string.

mov     DWORD PTR [ebp-22], 875770417    ; =  0x34333231  =  '1', '2', '3', '4'
mov     DWORD PTR [ebp-18], 842216501    ; =  0x32333435  =  '5', '4', '3', '2'
mov     WORD  PTR [ebp-14], 49           ; =  0x31        =  '1', '\0'
dxiv
  • So we cannot really say that there is separate memory for the variable and the literal: the variable has enough memory to hold the literal, and whether the literal is also stored somewhere else is not defined. Is my understanding correct? – IrAM Jan 20 '21 at 06:40
  • The statement `char a[] = "123454321";` defines and initializes *one* variable. How the array gets initialized is not specified or mandated by the standard. Compare this to `const char *p = "123454321"` which defines and initializes a *pointer* variable, while at the same time guarantees that a `const char[10]` exists somewhere which holds the literal `"123454321"` string. – dxiv Jan 20 '21 at 06:44
  • 1
    @IrAM See my answer. You can definitely say there has to be separate memory for the variable and the literal because they can have different contexts. The literal has to be stored somewhere else, though due to C's "as-if" rule, that could be in code. – David Schwartz Jan 20 '21 at 07:31
  • @dxiv, I agree with you , so my today's comment under my question is valid? – IrAM Jan 23 '21 at 10:55
1

You've completely misunderstood the question and answer. The question was about whether the initializer string consumes memory in addition to the actual array. Now the thing is, you cannot observe the initializer string.

It is like there are two sheets of paper. One in the closet with 123454321 written with ballpoint pen. One on the desk - initially empty. Then someone else comes, takes the sheet from the closet, reads the text on it, and writes it on the sheet on the desk using a pencil. Then puts the paper back into closet.

Now you're looking at that sheet on desk saying: "clearly the text 123454321 has not been written twice onto this sheet, hence what do they say about there being two copies?"

  • Your statement: _The question was about whether **the initializer string consumes memory in addition to the actual array**._ My statement: _char array when initialized with string literal will cause two memory allocations **one for variable** and **other for string literal**._ – IrAM Jan 23 '21 at 10:50
  • I am sorry, I don't see any difference or my English is bad? – IrAM Jan 23 '21 at 10:51
1

You can prove it by modifying the code as follows:

#include <stdio.h>

int main(void)
{
    for (int i = 0; i < 2; ++i)
    {
        char a[] = "123454321";

        printf("a = %s\n", a);
        a[3] = 'x';
        a[5] = 'y';
        printf("a = %s\n", a);
    }
}

Output:

a = 123454321
a = 123x5y321
a = 123454321
a = 123x5y321

We got the original string back after modifying it, so the original string must have been stored somewhere other than the place we modified.

David Schwartz
  • Yes. If I use `printf("a = %s and &a = %p\n", a, &a);`, I see that the address is always the same even though the string changes. Is that undefined behavior or correct behavior? – IrAM Jan 20 '21 at 06:25
  • @IrAM The implementation can do it either way. Each iteration of the loop creates a new instance of `a`. The old instance's lifetime is over, so the implementation is free to re-use its storage but it is not required to. – David Schwartz Jan 20 '21 at 06:31
  • This answer is not correct. The compiler does not need to store "123454321". It can just allocate memory initialized with '1','2','3','4'..... for a – Serve Laurijssen Jan 20 '21 at 06:34
  • @ServeLaurijssen How can it allocate memory initialized with particular contents if it does not store that contents somewhere? – David Schwartz Jan 20 '21 at 06:57
  • See dxiv's answer. It can be custom code: the compiler could generate a series of `mov` instructions. There's no guarantee that an array is stored somewhere and its contents copied into the other array. – Serve Laurijssen Jan 20 '21 at 07:04
  • @ServeLaurijssen That custom code would be a form of storing the contents. Code that generates particular values is one way of storing those values. I never said the contents were copied, just that they had to be stored somewhere else in some form. (And if you read the linked question, it is clear that it's not talking about any particular place in any particular format, just some form of storage somewhere. C has the as-if rule, after all.) – David Schwartz Jan 20 '21 at 07:29
  • @DavidSchwartz, there is some confusion about the statement _We got the original string back after modifying it_. When `i = 1`, it does not matter whether array `a` gets the same storage (or address) or a different one, since the previous `a` already went out of scope when `i` changed from `0` to `1`. – IrAM Jan 23 '21 at 10:44
  • @IrAM Either way, the literal must have been stored somewhere else or there would have been no way to recover it after it was modified. – David Schwartz Jan 24 '21 at 23:25
0

You can't prove there are two separate storage areas, because you have only one.

The compiler sees that you want a char array initialized with some characters and a '\0', so it does that. It does not need to store the string literal anywhere else.

For that reason, the following would not compile (an array cannot be initialized from a pointer):

#include <stdio.h>
#include <string.h>

char *p = "123454321";

int main()
{
    char a[] = p;    /* does not compile: an array cannot be initialized from a pointer */
    
    printf("a =%p and &a = %p\n", a, &a);

    for(int i = 0; i< strlen(a); i++)
        printf("&a[%d] =%p and a[%d] = %c\n",i,&a[i],i,a[i]);
    
    return 0;
}
Serve Laurijssen
  • Then below answer conflicts with this answer – IrAM Jan 20 '21 at 06:27
  • This cannot be correct. The string literal must be stored somewhere in some form or it would be impossible to initialize the array to it. And it can't be stored in only one place because if it was, modifying the array would cause the code to fail if you executed it again. – David Schwartz Jan 20 '21 at 07:33