const char ca[]="test";
const char cb[]="test";

const char *pca="test"; //(*)
const char *pcb="test";

cout<<(ca==cb)<<endl; //0
cout<<(pca==pcb)<<endl; //1

I'm quite confused about this result.

ca and cb are different objects. When evaluating ca==cb, each array decays to a pointer to its first element, so the comparison is between &ca[0] and &cb[0], which are not equal.

But I don't understand the behaviour of pca and pcb.

Is (*) equivalent to

const char temp[]="test";
const char *pca = temp;

?

Is "test" an object, and do both pca and pcb point to the first element of that object?

Lookout
Compilers are allowed, but not required, to combine storage for equal string literals. See [Why do (only) some compilers use the same address for identical string literals?](https://stackoverflow.com/q/52814457/3309790) – songyuanyao Jan 20 '21 at 04:06

1 Answer

const char ca[]="test";
const char cb[]="test";

This creates two separate array objects called ca and cb, each big enough to hold the characters and terminator from the string literal given in the initialisation (not assignment), then copies the contents of the literal into each. Because they are distinct objects, their addresses differ, which is why ca==cb gives 0.


const char *pca="test"; //(*)
const char *pcb="test";

This simply sets the character pointers to point to the given string literals. An implementation is free to use the same memory for those literals, effectively aliasing the memory for multiple purposes. It can do this with impunity since it's undefined behaviour to attempt to modify string literals anyway.


Regarding your final point about the pointers being equivalent to:

const char temp[]="test";
const char *pca = temp;

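That is not the case. temp is a named array object with its own storage, initialised with a copy of the literal's characters, so pca would point at that copy rather than at the string literal itself. Two such arrays are always distinct objects with different addresses, whereas two identical string literals may share storage. (Because temp is declared const here, you cannot modify it through either temp or pca; only a non-const array would give you memory that you are allowed to modify.)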


Also keep in mind this string literal aliasing can be more subtle than first thought. The code segment:

cout << a << "is " << "invalid\n";
cout << b << "is " << "valid\n";

may end up with only two strings actually stored in memory, even though the source contains four literals:

Addr:
  0   1     2      3   4   5   6   7   8   9  10   11   12
+---+---+-------+----+---+---+---+---+---+---+---+----+----+
| i | s | <spc> | \0 | i | n | v | a | l | i | d | \n | \0 |
+---+---+-------+----+---+---+---+---+---+---+---+----+----+

with:

  • is<spc> being converted to use address 0;
  • invalid\n being converted to use address 4; and
  • valid\n being converted to use address 6.

And, if some other piece of code ever needed id\n, it could just use address 9.

paxdiablo