81
#include<stdio.h>
#include<string.h>

int main()
{
    char * p = "abc";
    char * p1 = "abc";
    printf("%d %d", p, p1);
}

When I print the values of the two pointers, it is printing the same address. Why?

jww
seereddi sekhar
  • Why do you think it shouldn't? Both pointers point to the exact same thing. What you're seeing is probably the effect of an optimization technique called string pooling. – Daniel Kamil Kozar Sep 30 '13 at 06:58
  • Even though the data is the same, the variables are different. – seereddi sekhar Sep 30 '13 at 07:00
  • The variables are, of course, different. If you had taken the addresses of `p` and `p1`, you would have noticed that these two pointers are stored at two distinct addresses. The fact that their values are the same is, in this case, irrelevant. – Daniel Kamil Kozar Sep 30 '13 at 07:02
  • Yes, if I change the values then the addresses are different. – seereddi sekhar Sep 30 '13 at 07:05
  • I was talking about the **value of the pointer**, i.e. the address that it points to. – Daniel Kamil Kozar Sep 30 '13 at 07:08
  • To clarify: that's `&p` and `&p1` we're talking about. They differ. – MSalters Sep 30 '13 at 07:13
  • @MSalters: No, `p` and `p1`. – Jan Hudec Sep 30 '13 at 07:17
  • @JanHudec: Read the question again. In this case (due to compiler optimization) `p == p1` (they don't differ) but `&p != &p1` (they do differ). – MSalters Sep 30 '13 at 07:20
  • @MSalters: Well, "address of the two pointers" would certainly mean `&p` and `&p1`, but it appears _not_ to be what was meant. – Jan Hudec Sep 30 '13 at 07:27
  • Thanks all for the clarification. – seereddi sekhar Sep 30 '13 at 07:33
  • Read: [C Strings Comparison with Equal Sign](http://stackoverflow.com/questions/17781855/c-strings-comparison-with-equal-sign/17781902#17781902) – Grijesh Chauhan Sep 30 '13 at 08:10
  • What happens when you – Hans Z Oct 01 '13 at 00:19
  • Your program is simple: it assigns an identical string to two char pointers. If you alter your program to later change only the `*p1` value but not `*p`, I think they will no longer be optimized to point to the same location... (in your example the compiler notices that, as they are never changed, they can also point to the same location in memory without any ill effect). In other words, they should be different only if they can behave differently (in your example both will always point to something with the value "abc" and never be altered, so it might as well be the same "abc" string in memory). – Olivier Dulac Oct 01 '13 at 09:55
  • @seereddi The correct title for this question should be "values of two pointers are the same" or "addresses pointed to by two pointers are the same". Once you realize that, you have the answer to your question! – Amarghosh Oct 02 '13 at 01:39
  • It's not directly related to your question, but you should use %p to print a pointer. – klearn Oct 03 '13 at 23:00
  • This looks very obvious to me. I don't know why there is so much hype about this. – Ganesh Jadhav Oct 07 '13 at 12:38

10 Answers

89

Whether two string literals with the same content are placed in the same memory location or in different memory locations is up to the implementation (the standard leaves it unspecified).

You should always treat p and p1 as two different pointers (even though they have the same content) as they may or may not point to the same address. You shouldn't rely on compiler optimizations.

C11 Standard, 6.4.5, String literals, semantics

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.


The format for printing must be %p:

  printf("%p %p", (void*)p, (void*)p1);

See this answer for why.

P.P
  • I used volatile so that there would be no memory optimizations, yet they still get the same address. One question: if I modify one of the pointers, will the data pointed to by the other also be modified? – Megharaj Sep 30 '13 at 07:51
  • @Megharaj `i modify one of the pointer, will the data in the other pointed also be modified` You can modify the *pointer* but not the string literal. E.g. `char *p="abc"; p="xyz";` is perfectly fine whereas `char *p="abc"; p[0]='x';` invokes *undefined behaviour*. This has nothing to do with `volatile`. Whether you use `volatile` or not shouldn't change any behaviour we are interested in here. `volatile` basically forces the data to be read from memory every time. – P.P Sep 30 '13 at 09:03
  • @BlueMoon Why is p[0]='x' undefined behavior? – M Sharath Hegde Sep 30 '13 at 12:19
  • @MSharathHegde Yes. Because `p` points to the string literal `"abc"` and `p[0]='x'` attempts to modify the first char of a string literal. Attempting to modify a string literal is undefined behaviour in C. – P.P Sep 30 '13 at 12:22
  • @BlueMoon No, I was just asking what the rationale is behind making such behavior undefined. It seems to be OK to change p[0] to 'x'. – M Sharath Hegde Sep 30 '13 at 12:27
  • The compiler is allowed to put string literal constants in read-only memory. Also, making it undefined allows this optimization. Changing p[0] might or might not change p1[0] as a side effect. – armb Sep 30 '13 at 12:45
  • @MSharathHegde Because the C standard states that. The reason is mainly historical, as the pre-standard C language allowed modifying string literals. Later, the C standard (C89) made it *undefined* so that new code doesn't do that and old (pre-standard) code keeps working as it did. Basically it's a compromise to not break existing (pre-standard) code, I believe. Another reason is that the type of a string literal is `char []` in C. So making it read-only (`const char*` as is the case in C++) would require changing the *type* as well. [contd.] – P.P Sep 30 '13 at 12:46
  • There's a line in K&R 2nd edition in Appendix C: `"Strings are no longer modifiable, and so may be placed in read-only memory"`, a historical proof that string literals *used* to be modifiable ;-) – P.P Sep 30 '13 at 12:46
  • @l3x: Isn't the behaviour of the program posted in the OP's question undefined according to the C standard, because %d expects arguments of type int, not char*? Shouldn't he use the %p format specifier? – Destructor Feb 23 '16 at 09:51
  • @PravasiMeet Technically, yes, it's undefined. You are right: `%p` should be used *and* the argument must be cast to `void*`. But the core of the question remains the same: even with the correct format specifier the OP would still see the same behaviour and the answer would still be the same. Anyhow, I updated the post to make this point. Thanks. – P.P Feb 23 '16 at 10:35
28

Your compiler seems to be quite clever, detecting that both the literals are the same. And as literals are constant the compiler decided to not store them twice.

It seems worth mentioning that this need not be the case. Please see Blue Moon's answer on this.


Btw: The printf() statement should look like this

printf("%p %p", (void *) p, (void *) p1);

as "%p" shall be used to print pointer values, and it is defined for pointers of type void * only.*1


Also, the code lacks a return statement. This is allowed since C99: if main() ends without an explicit return, 0 is returned implicitly.


*1: Casting to void * here is not necessary for char * pointers, but it is for pointers to all other types.

alk
  • Thanks. So the conclusion is that it's a compiler optimization, right? In C, the main function returns 0 by default. – seereddi sekhar Sep 30 '13 at 07:03
  • @seereddisekhar: Yes, it's a kind of optimisation. – alk Sep 30 '13 at 07:05
  • @seereddisekhar But be careful: it doesn't mean that you should compare two strings (even via pointers) using `==`; you should use the `strcmpy()` function. Another compiler may not use this optimization (it is up to the compiler, i.e. implementation-dependent), as Alk answered. PS: Blue Moon just added an answer about it. – Grijesh Chauhan Sep 30 '13 at 07:07
  • What the hell is `strcmpy`? – Daniel Kamil Kozar Sep 30 '13 at 07:14
  • Yes, since C99 main() doesn't have to have an explicit return statement. If the return statement is missing, 0 is implicitly returned. – P.P Sep 30 '13 at 07:15
  • @DanielKamilKozar: Copy on compare? Compare on copy? ;-) – alk Sep 30 '13 at 07:26
  • @DanielKamilKozar Sorry, I misspelled it. I meant the string comparison function from string.h; the correct name is `int strcmp(const char*, const char*)`. – Grijesh Chauhan Sep 30 '13 at 07:49
  • @alk I used volatile so that there would be no memory optimizations, yet they still get the same address. One question: if I modify one of the pointers, will the data pointed to by the other also be modified? – Megharaj Sep 30 '13 at 07:52
  • @Megharaj: Please be more specific about how you use `volatile`. And as this seems to be a different question, please raise a separate one on this topic. – alk Sep 30 '13 at 08:27
  • @alk Yes, I got it. I was wrong. But I do have a question; I am not able to put my code here. Suppose I change the value through one of the pointers, say *p = 'b'; will it also be changed in p1, since both variables hold the same address? I tried this and it gives a segmentation fault. – Megharaj Sep 30 '13 at 08:35
  • @alk I am adding the code here: `int main() { char * p = "abc"; char * p1 = "abc"; printf("%d\n %d\n", (void *)p, (void *)p1); printf("%s\n %s\n", p, p1); *p = 'b'; printf("%d\n %d\n", p, p1); printf("%s\n %s\n", p, p1); }` It gives a segmentation fault. – Megharaj Sep 30 '13 at 08:37
  • Dear @Megharaj: May I kindly ask you to raise a separate question on this? You can post a link to the new question here as a comment. – alk Sep 30 '13 at 09:07
  • @Megharaj: You cannot change the value of a string literal. As I mentioned in my answer, it is constant. – alk Sep 30 '13 at 09:09
18

Your compiler has done something called "string pooling". You specified that you wanted two pointers, both pointing to the same string literal - so it only made one copy of the literal.

Technically: It should have complained at you for not making the pointers "const"

const char* p = "abc";

This is probably because you are using Visual Studio or you are using GCC without -Wall.

If you expressly want them to be stored twice in memory, try:

char s1[] = "abc";
char s2[] = "abc";

Here you explicitly state that you want two c-string character arrays rather than two pointers to characters.

Caveat: String pooling is a compiler/optimizer feature and not a facet of the language. As such, different compilers under different environments may produce different behavior depending on things like optimization level, compiler flags and whether the strings are in different compilation units.

kfsone
  • `gcc (Debian 4.4.5-8) 4.4.5` doesn't complain (warn), even with `-Wall -Wextra -pedantic`. – alk Sep 30 '13 at 07:37
  • Yes, as of V4.8.1 gcc by default does not warn about not using `const` for string literals. The warning is enabled by option `-Wwrite-strings`. It is apparently not enabled by any other option (such as `-Wall`, `-Wextra` or `-pedantic`). – sleske Sep 30 '13 at 10:35
  • Both GCC 4.4.7 and 4.7.2 give me the warning with or without -Wall. http://pastebin.com/1DtYEzUN – kfsone Sep 30 '13 at 20:15
15

As others have said, the compiler is noticing that they have the same value, and so is deciding to have them share data in the final executable. But it gets fancier: when I compile the following with gcc -O

#include<stdio.h>
#include<string.h>

int main()
{
  char * p = "abcdef";
  char * p1 = "def";
  printf("%d %d", p, p1);
}

it prints 4195780 4195783 for me. That is, p1 starts 3 bytes after p, so GCC has seen the common suffix of def (including the \0 terminator) and done a similar optimisation to the one you have shown.

(This is an answer because it's too long to be a comment.)

huon
3

String literals are typically stored in a read-only data segment of the executable. In C, a string literal like "abc" has type 'char[4]' (in C++ it is 'const char[4]'), but it should be treated as a 'const char*'; with the right warning enabled (for example GCC's -Wwrite-strings) the compiler tells you when you assign it to a plain 'char*'. You are not allowed to alter those strings, for the very reason you have pointed out in this question: the same storage may be shared.

Salgar
2

When you create a string literal ("abc"), it is saved in a region of memory that holds string literals, and it may then be reused if you refer to the same string literal again. Thus both pointers point to the same location, where the "abc" string literal is stored.

I've learned this some time ago so I might not have explained it really clearly, sorry.

Lord Zsolt
2

This actually depends on which compiler you are using.

On my system with TC++ 3.5 it prints two different values for the two pointers, i.e. two different addresses.

Your compiler is designed such that it checks whether the same value already exists in memory; if it does, the compiler reuses the reference to the previously stored value instead of storing it again.

So don't think about it too much as it depends on the way the compiler parses the code.

THAT'S ALL...

Rajesh Paul
1

Because the string "abc" itself has an address in memory. When you write "abc" again, the same address is stored.

SANDEEP
1

It is a compiler optimization, but do not rely on that optimization if you want portable code. Sometimes compiled code is more readable than the actual source code.

Amir Saniyan
0

You are using string literals.

When the compiler finds two identical string literals,

it gives them the same memory location, and therefore the pointers show the same location.

Dev