std string should crash but doesn't

Question

I have a class:

#include <cstdio>
#include <string>

class A {
  public:
  std::string B;
};

and then this code:

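int main() {
  A a1;
  a1.B = "abc";

  // %p expects a void pointer, so cast the result of c_str()
  printf("%p.\n", static_cast<const void*>(a1.B.c_str()));

  A a2(a1);

  printf("%p.\n", static_cast<const void*>(a2.B.c_str()));
}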

The c_str() results of both instances refer to the same place (this I understand: the copy constructor copied A bit-by-bit, string internally stores its data in a char*, and the pointer got copied).

But the question is: why doesn't this code crash? a1 and a2 are stack variables; when they are destroyed, their string members B are destroyed as well. Won't the internal char* of those strings (which point to the same memory location) get deleted twice? Isn't that a double delete, which should cause a crash? By the way, I disabled GCC optimizations, and valgrind doesn't report anything either.

"Should crash" in C++ is almost always the wrong way to think about it. In this case, however, the problem you think the code has is taken care of by the string class. — chris, May 20 '16 at 14:40

Mark Ransom · Accepted Answer · 2016-05-20T14:48:33.440

score 13

No, the pointer did not get copied. The copy constructor of std::string creates a new buffer and copies the data from the buffer of the other string.

Edit: the C++ standard used to allow copy-on-write semantics, which would share the pointer (and would require reference counting to go along with it), but this was disallowed starting with C++11. Apparently there were versions of GCC which did this.

edited May 20 '16 at 14:48

answered May 20 '16 at 14:40

Mark Ransom


first of all, does copy constructor of B get called in this case? and second, why do c_str()'s show the same address then? – Zhani Baramidze May 20 '16 at 14:42
@JaniBaramidze 1) Yes 2) A gcc optimization that is illegal in C++11 (namely *"Copy on write"*) and was removed in gcc5. – Baum mit Augen May 20 '16 at 14:45
It's wrong, the buffer is the same as long as no modification is done on the string. – Caduchon May 20 '16 at 14:46
@Mark so you are saying that specifying -std=c++11 should make c_str()'s same? it didn't. – Zhani Baramidze May 20 '16 at 14:52
@JaniBaramidze Mark's saying that for C++11 they must be different. But for C++03 they *might* be the same. Either way it doesn't matter. – Roddy May 20 '16 at 14:57
damn, wanted to say vice versa, adding -std=c++11 should make c_str()'s DIFFERENT? since copy-on-write is disabled there. and, specifying -std=c++11 they are still the same – Zhani Baramidze May 20 '16 at 15:00
@JaniBaramidze GCC started to adhere to standard requirements for strings in C++11 starting from GCC 5. Before that it still used illegal optimisation in C++11 mode. – Revolver_Ocelot May 20 '16 at 15:22

score 3 · Answer 2 · answered May 20 '16 at 14:44

For GCC 4.*

There is an internal counter in the string class that tracks the number of instances pointing to the buffer. The instance that brings the counter down to 0 is responsible for freeing the memory. It's the same behaviour as a shared pointer (Boost or C++11).

Moreover, when a string is modified, a new buffer is allocated first, so that the modification does not affect the other instances sharing the old buffer.

score 2 · Answer 3 · edited May 23 '17 at 12:13

should crash but doesn't

This statement should be taken with a grain of salt. C++ has no concept of "must crash". It has a concept of undefined behaviour, which may or may not result in crashes. Even so, your code has no undefined behaviour.

The c_str() results of both instances refer to the same place (this I understand: the copy constructor copied A bit-by-bit, string internally stores its data in a char*, and the pointer got copied).

You are talking about the implementation of std::string. You must instead look at its interface in order to decide which operations are safe and which aren't.

Other than that, the implementation you are talking about, called copy-on-write or "COW", has not been permitted for std::string since C++11. Recent GCC versions have abandoned it.

See GCC 5 Changes, New Features, and Fixes:

A new implementation of std::string is enabled by default, using the small string optimization instead of copy-on-write reference counting.

Small-string optimisation is the technique also used, for example, by the Visual C++ implementation of std::string. It works in a completely different way, so your understanding of how std::string works on the inside is no longer correct if you use a sufficiently new GCC version, and it has never been correct if you use Visual C++.

but the question is, why doesn't this code crash?

Because it uses std::string operations correctly according to the documentation of its interface and because your compiler is not completely broken.

You are basically asking why your compiler produces a working binary for correct code.

a1 and a2 are stack variables,

Yes (the correct term would be that the objects have "automatic storage duration").

when they are destroyed, their string members B are destroyed as well. Won't the internal char* of those strings (which point to the same memory location) get deleted twice?

Your compiler's std::string implementation makes sure that this does not happen. Either it doesn't use COW at all, or it counts how many strings share the buffer and its destructor frees the buffer only when the last of them is destroyed.

If you are using an older GCC version, then you can just look at the source code of your std::string implementation to find out how exactly it's done. It's open source, after all -- but beware, for it might look a bit scary. For example, here's the destructor code for an older GCC version:

~basic_string()
{ _M_rep()->_M_dispose(this->get_allocator()); }

Then look at _M_dispose (in the same file) and you'll see that it's a very complicated implementation with various checks and synchronisations.

Also consider this:

If the sheer act of copying a std::string would result in crashes, then the whole class would be completely pointless, wouldn't it?

score -2 · Answer 4 · answered May 20 '16 at 14:42

It doesn't crash because string copy actually duplicates the string, so both strings will point to different memory locations with same data.

Vikash Kesarwani

