
I've known for a while that GCC uses COW (copy-on-write) for std::string, which makes it impossible to use std::string safely in a multi-threaded program. As far as I know, though, C++11 prohibits an implementation from using COW, because threads are now defined by the standard and move semantics largely remove the need for COW anyway.

Now, GCC 4.6 implements a great deal of the C++11 standard. Yet it seems that the implementation is still using COW semantics. This was brought to my attention by randomly occurring mysterious seg-faults in a multi-threaded application I wrote. I've confirmed this is, in fact, a COW issue via the following test code:

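#include <cassert>
#include <iostream>
#include <string>
#include <thread>  // not used below; included only to show this builds as C++11

int main()
{
    std::string orig = "abc";
    std::string copy = orig;

    // Print the address of each string's character buffer.
    std::cout << static_cast<const void*>(orig.data()) << ", "
              << static_cast<const void*>(copy.data()) << std::endl;

    // Passes only if both strings share one buffer, i.e. the library uses COW.
    assert(orig.data() == copy.data());
}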


Edit: Note the inclusion of the <thread> header here, which shows this is compiled as a C++11 program. Here's a link to ideone confirming what I'm saying (at least for GCC 4.5.1, which ideone uses).

I don't remember why, but I was under the impression that the --std=c++0x flag would eliminate the COW semantics. It doesn't: the assertion in the above code succeeds even with --std=c++0x. So basically, as of GCC 4.6, std::string is still unusable in a multi-threaded application.

Is there any way to disable COW semantics? Or do I need to use std::vector<char> for now until GCC fixes this?

Channel72
  • 6
    Impossible to use in a multi-threaded program? Certainly not. Simply don’t write to a string from multiple threads, that’s a (very) bad idea anyway. (And yes, it’s slightly more complex than that, but not much.) – Konrad Rudolph Sep 14 '12 at 19:13
  • 3
    @KonradRudolph the idea is that it should be possible to write to a **copy** of the string independently. – usr Sep 14 '12 at 19:15
  • @usr As long as it’s a thread-local copy, that should work, no? – Konrad Rudolph Sep 14 '12 at 19:17
  • 4
    @KonradRudolph - if you read from a string that's changing in another thread, you still end up in trouble because the state of a complex object is changing while you watch, leading to consequences dependent on the exact implementation. You'd really need to avoid writing to it at all while more than one thread has it. – Michael Kohne Sep 14 '12 at 19:19
  • 4
    I think the OP's point is that GCC is violating the spec. His code should just work. – usr Sep 14 '12 at 19:33
  • 1
    AFAIK GNU uses atomic operations for reference counting in std::string. This should be enough for thread safety. Can you describe a scenario in which you believe this safety can be broken? – PiotrNycz Sep 14 '12 at 20:25
  • 1
    @KonradRudolph : C++11 adopted the n2534 recommendations *in toto* ( http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2534.html ). Copy-on-write is disallowed in C++11 precisely because of the problems Channel72 has encountered. Unfortunately, it appears that this change will mandate a change to not just `<string>` but also to the ABI. – David Hammen Sep 14 '12 at 20:46

1 Answer


If you're going to pass a string across a thread boundary, do an explicit copy, in order to force it to be an independent string, then pass that across:

std::string a="bob";
std::string b(a.data(), a.length());

It's annoying to have to do this at all spots where things cross threads, but in my opinion it's easier than vector<char>.

Michael Kohne
  • 6
    Yes, this is an obvious workaround. But what about the more fundamental question: Does GCC obey the standard in this case? Shouldn't the OP's code just work? He says COW is not allowed. – usr Sep 14 '12 at 19:28
  • 1
    @usr: The question is what version of GCC and in what mode. COW is perfectly valid in C++03, so if the GCC version is pre-C++11 or if `-std=c++11` is not passed to the compiler then it is perfectly conforming. Note: I don't know if gcc still uses COW in any case or in C++11 mode... would have to check with the code. – David Rodríguez - dribeas Sep 14 '12 at 20:25
  • 2
    The OP says he passed in `--std=c++0x`. – usr Sep 14 '12 at 20:28