
I've seen various conflicting references to the copy constructor behaviour of STL strings in C++ and I was hoping someone could clarify this for me, given the following code segment:

string str() { return string("this is a string"); }
//meanwhile, in some other function...
string s = str();

Does the object 's' constitute a deep copy of the string object created in the function 'str()'? Or does 's' simply point at the same chunk of memory allocated by the string constructor call in 'str()'?

BenMorel
Gearoid Murphy
  • It doesn't matter what the exact copying semantics are, because any sane compiler will optimize all copies away in the above example. And if for some reason that isn't possible, move semantics are required to kick in. – fredoverflow Dec 11 '11 at 20:09

2 Answers


Strings are deep-copied; they do not share the same buffer.

That said, when returning them from a function, most good compilers can apply return value optimisation (a form of copy elision), so that manoeuvre isn't all that expensive (and may even be free).

If you are using C++11, move semantics are specified by the standard, so for things like returning a string, rest assured that the worst case (even without optimisations) is fairly cheap.

EDIT: to summarise, you are guaranteed that the string you "own" will have a unique chunk of memory which will persist for at least the lifetime of the local string. However, it is more than likely that the compiler won't copy it from the string in the function, but will rather just swap its pointers or even elide the copy altogether (meaning the string in the function would actually be the string you assign to).
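One way to observe whether a copy, a move, or neither happens is to return an instrumented type instead of a plain string. The following is only a sketch: `Tracer`, `make_tracer`, `Counts`, and `count_constructions` are hypothetical names invented for illustration, and the counts tell you what your compiler did for this type, not what it does inside `std::string` itself.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical instrumented type: wraps a std::string and counts
// how many times it is copy- or move-constructed.
struct Tracer {
    static int copies;
    static int moves;
    std::string text;

    explicit Tracer(const char* s) : text(s) {}
    Tracer(const Tracer& other) : text(other.text) { ++copies; }
    Tracer(Tracer&& other) noexcept : text(std::move(other.text)) { ++moves; }
};

int Tracer::copies = 0;
int Tracer::moves = 0;

// Mirrors str() from the question, but returns the instrumented type.
Tracer make_tracer() { return Tracer("this is a string"); }

struct Counts {
    int copies;
    int moves;
};

// Initialises one Tracer from the function's return value and
// reports how many copy/move constructions that took.
Counts count_constructions() {
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer t = make_tracer();
    assert(t.text == "this is a string");  // contents arrive intact either way
    return Counts{Tracer::copies, Tracer::moves};
}
```

Calling `count_constructions()` from `main` and printing the result should show zero copies. With elision in effect the move count is zero as well; building as C++11 or C++14 with g++'s `-fno-elide-constructors` flag should make the moves visible instead.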

111111
  • Wikipedia :), I should have known, how can I tell if this mechanism is being used in g++? – Gearoid Murphy Dec 11 '11 at 20:12
  • Without looking at the ASM I don't think that you can. But like I said, move semantics are part of the standard, and the move constructor/assignment operator will/should be defined by your std-lib; these will be used if you are using C++11. – 111111 Dec 11 '11 at 20:15
  • Where is this guarantee that each string owns a unique buffer? Are copy-on-write implementations not allowed? – visitor Dec 12 '11 at 08:36
  • COW is not detailed by the standard, so whether or not your implementation uses it is unknown. clang, for example, does have some COW for containers and strings; whether this interferes with copy elision I do not know. Perhaps saying that each string has a unique buffer is a bit definite, but I think it is safe to think about it that way. – 111111 Dec 13 '11 at 14:28
  • @111111 COW was explicitly designed for in the 1997 C++ standard; `std::string` has funny (= broken) invalidation semantics because of it. COW is not allowed for "true" STL containers. COW string is a broken idea, as everyone became aware only too late. – curiousguy Dec 15 '11 at 07:01
  • The permission to invalidate iterators to string during COW does not exist in N3242: "References, pointers, and iterators referring to the elements of a basic_string sequence may be invalidated: () Calling non-const member functions, **except operator[], at, front, back, begin, rbegin, end, and rend**." The functions listed here are all `noexcept`. The COW craziness was fixed. – curiousguy Dec 15 '11 at 07:05

Yes, it performs a logical deep copy.

From N3126, 21.4.2, Table 61:

data() - points at the first element of an allocated copy of the array whose first element is pointed at by str.data()
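A small check can make the "logical deep copy" concrete. This is a minimal sketch, and `copy_is_independent` is a hypothetical helper name. It deliberately compares the `data()` pointers only after writing through the copy, since a copy-on-write implementation could legitimately share storage until that first write.

```cpp
#include <cassert>
#include <string>

// Returns true if a copied string stays independent of the original
// once the copy has been written to.
bool copy_is_independent() {
    std::string original = "this is a string";
    std::string copy = original;  // copy constructor

    copy[0] = 'T';  // write through the copy only

    // The original must be unaffected, whatever the implementation
    // does internally (eager copy, small-string buffer, or COW).
    assert(original == "this is a string");

    // After the write, the two objects cannot be sharing storage.
    return copy == "This is a string" && original.data() != copy.data();
}
```

Calling `copy_is_independent()` from `main` should return `true` on any conforming implementation, because observable behaviour is that of a deep copy even when the actual copying is deferred.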

Oliver Charlesworth