What happens behind the scenes when I declare the following:
char a[]="June 14";
and
char *a="June 14";
I mean, how is memory allocated for these two declarations (stack, data segment, and so on)?
The following is for typical C implementations. At file scope (outside any function):
char a[] = "June 14";
defines an array that initially contains “June 14” (including a terminating null). These characters are placed into a data section of the executable file that is loaded into memory when the program begins execution. (Note: The loading may be virtual, in the sense that it is effectively a part of the program’s virtual address space even though it is not physically read into memory immediately.)
char *a = "June 14";
defines two things. The first is a string containing “June 14”. These characters are placed into a data section of the executable file. It may be a read-only (also called constant) data section. The second is a pointer. The pointer is also likely placed in a data section (not read-only). The pointer is initialized with the address of the string. (Depending upon how programs are linked on the target system, the executable file might contain instructions to the system on how to fill in the address at load time rather than containing the actual address itself.)
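As a rough illustration of the file-scope case, the sketch below contrasts the two objects. The names date_array, date_pointer and the helper functions are hypothetical, chosen only for this example; the sizes assume nothing beyond what the C language guarantees.

```c
#include <stddef.h>

/* Hypothetical names, chosen only for this illustration. */
char date_array[] = "June 14";         /* array object holding 8 chars */
const char *date_pointer = "June 14";  /* pointer object; the 8 chars live elsewhere */

/* Size of the array object: 7 characters plus the terminating null. */
size_t array_object_size(void)
{
    return sizeof date_array;
}

/* Size of the pointer object: independent of the string's length. */
size_t pointer_object_size(void)
{
    return sizeof date_pointer;
}

/* The array's bytes are writable; this changes "June 14" to "June 15". */
void bump_array_day(void)
{
    date_array[6]++;
}
```

Calling bump_array_day() changes only the array's own copy of the characters; the string that date_pointer refers to is a separate object and is left alone.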
Inside a function:
char a[] = "June 14";
defines an array that initially contains “June 14”. However, this array must be created and initialized each time execution reaches the declaration. (In the file scope declaration, the array only needs to be initialized when the program starts.) In order to accomplish this initialization, the compiler puts “June 14” in a data section of the executable (again possibly a read-only data section). Whenever execution reaches the declaration (or is about to), the compiled code allocates some space on the stack and copies the characters from the read-only copy into the new space on the stack.
char *a = "June 14";
is similar to the file-scope declaration, in that the string is placed into a data section (possibly read-only) and a pointer is created. However, the pointer is nominally on the stack instead of in a data section. (I say “nominally” because it is quite likely that optimization results in a pointer such as this being mostly in a processor register rather than actually on the stack, or even optimized away altogether.) The pointer is often initialized via an instruction that calculates the address of the string from other information. (E.g., the loader may set a register to the address where it loaded the start of the data section, so the calculation of the initial value of the pointer would take that address and add to it the offset of the string within the data section.)
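A small sketch of the in-function case, under the assumption of an ordinary hosted C implementation. The function names are hypothetical. The first function shows that the local array is re-initialized on every call; the second shows that the string a local pointer refers to outlives the call, because the literal has static storage duration even though the pointer itself does not.

```c
#include <string.h>

/* Returns 1 if the local array held "June 14" when this call began. */
int local_array_starts_fresh(void)
{
    char a[] = "June 14";   /* re-created and re-initialized on every call */
    int fresh = (strcmp(a, "June 14") == 0);
    a[6]++;                 /* changes only this call's copy */
    return fresh;
}

/* Returns a pointer to the literal. This is valid after the function
   returns because the literal has static storage duration; only the
   local pointer p disappears. */
const char *literal_address(void)
{
    const char *p = "June 14";
    return p;
}
```

Returning a itself from the first function would not be valid, since the array's storage ends with the call.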
Consider the code:
#include <stdio.h>

char a1[] = "June 14";
const char *a2 = "June 14";

void function(void)
{
    char b1[] = "June 14";
    const char *b2 = "June 14";

    printf("%-7s : %-7s : %-7s : %-7s\n", a1, a2, b1, b2);
    a1[6]++;
    b1[6]++;
    a2++;
    b2++;
    printf("%-7s : %-7s : %-7s : %-7s\n", a1, a2, b1, b2);
}

int main(void)
{
    function();
    function();
    return(0);
}
The array a1 is allocated space and initialized with the string as the program is loaded. The pointer a2 is allocated space for the pointer and that pointer is initialized to point to a copy of the string. I used const because you can't modify string literals; the value might be stored in the text segment of the program (which is non-writable).
The array b1 is allocated space on the stack and when the function is called, it is initialized with the string. This means, as WhozCraig noted in a comment, that there must be a copy of the string somewhere that is used to initialize the array each time the function is called. The pointer b2 is allocated space for the pointer and that pointer is initialized to point to a copy of the string. Indeed, the compiler might make a2 and b2 point to the same string.
The function prints the values of the four strings; it then modifies the last digit of the number in the two arrays, and increments the two pointers to point to the next character in the string constants, and prints the four strings again. The second call to the function resets b1 to the original string ("June 14"), but a1 remains changed to "June 15" at the start of the second call. Similarly, a2 remains pointing at the u for the first print but at the n for the second, while b2 points to J first and u second. Thus the output is:
June 14 : June 14 : June 14 : June 14
June 15 : une 14  : June 15 : une 14
June 15 : une 14  : June 14 : June 14
June 16 : ne 14   : June 15 : une 14
If there were a static array or a static pointer inside the function, they would behave analogously to the external array a1 and pointer a2.
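The static case can be sketched as follows. The function names are hypothetical and exist only for this illustration; the point is that changes to a static array or a static pointer inside a function persist from one call to the next.

```c
#include <string.h>

/* Returns 1 only while the static array still holds its initial value. */
int static_array_starts_fresh(void)
{
    static char s[] = "June 14";   /* initialized once, not on each call */
    int fresh = (strcmp(s, "June 14") == 0);
    s[6]++;                        /* the change persists across calls */
    return fresh;
}

/* Returns the character the static pointer refers to, then advances it. */
char static_pointer_next(void)
{
    static const char *p = "June 14";  /* pointer initialized once */
    return *p++;                       /* the advance persists across calls */
}
```

The first call to static_array_starts_fresh() sees the original string and the second does not; successive calls to static_pointer_next() walk through the literal one character at a time.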
There are two things that space is being allocated for here.
char a[]="June 14";
char *a="June 14";
Space is allocated for the object "a" (an array in the first declaration, a pointer in the second) and for the constant data "June 14".
Space for the object is allocated on the stack if "a" is defined inside a function, or in the data segment if "a" is defined in a static context.
Space for the constant data is typically allocated in the .rodata (read-only data) section.
For
char a[]="June 14";
gcc on Linux allocates memory in the stack area. The following steps happen:
- memory for a is allocated on the stack, with a size of strlen("June 14") + 1 bytes
- "June 14" is then copied into that memory area when the function is called
For
char *b="June 14";
gcc on Linux stores the "June 14" string in the .rodata section. The variable b is then allocated, and it holds the address of the "June 14" string in the .rodata section.
Consider this sample C program:
#include <stdio.h>
int main(void)
{
    char a[] = "Test Message";      // memory is allocated on the stack and "Test Message" is stored there
    char *b = "Test Const String";  // the string is stored in the .rodata section and b points to its address
}
I compiled the above code with gcc using the -c option and analyzed the object file with the objdump command. The output follows:
constPointer.o: file format elf32-i386
Contents of section .text:
0000 8d4c2404 83e4f0ff 71fc5589 e55183ec .L$.....q.U..Q..
**0010 24c745eb 54657374 c745ef20 4d6573c7 $.E.Test.E. Mes.
0020 45f37361 6765c645 f700c745 f8000000 E.sage.E...E....**
0030 0083c424 595d8d61 fcc3 ...$Y].a..
Contents of section .rodata:
**0000 54657374 20436f6e 73742053 7472696e Test Const Strin
0010 6700 g.**
Contents of section .comment:
0000 00474343 3a202847 4e552920 342e332e .GCC: (GNU) 4.3.
0010 30203230 30383034 32382028 52656420 0 20080428 (Red
0020 48617420 342e332e 302d3829 00 Hat 4.3.0-8).
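I have highlighted (using **) where each string ends up. "Test Const String" is placed in the .rodata section, which is not modifiable, while the characters of "Test Message" appear in .text as immediate operands of the instructions that store them into the array on the stack.
So a[3]='t'; will compile without any error or warning and run fine, but b[2]='t'; will typically result in a run-time error (formally, modifying a string literal is undefined behavior).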
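If the goal is to modify the text that b refers to, one common approach is to copy the literal into writable storage first and edit the copy. The sketch below assumes a caller-supplied buffer; the function name edit_copy and its parameters are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Copies the literal into the caller's writable buffer, then edits the copy. */
void edit_copy(char *dst, size_t dst_size)
{
    const char *b = "Test Const String";  /* the literal may live in read-only memory */

    if (dst_size == 0)
        return;                           /* nothing can be stored */

    snprintf(dst, dst_size, "%s", b);     /* copy, truncating if dst is too small */

    if (strlen(dst) > 2)
        dst[2] = 't';                     /* safe: dst is writable storage */
}
```

With a sufficiently large buffer, for example char buf[32], the buffer ends up holding the edited copy while the literal itself is never written to.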
Hope this helps.