
Derived from this question and related to this question:

If I construct an object in one thread and then convey a reference/pointer to it to another thread, is it thread-unsafe for that other thread to access the object without explicit locking or memory barriers?

// thread 1
Obj obj;

anyLegalTransferDevice.Send(&obj);
while(1); // never let obj go out of scope

// thread 2
anyLegalTransferDevice.Get()->SomeFn();

Alternatively: is there any legal way to convey data between threads that doesn't enforce memory ordering with regard to everything else the thread has touched? From a hardware standpoint I don't see any reason it shouldn't be possible.

To clarify: the question is with regard to cache coherency, memory ordering and whatnot. Can Thread 2 get and use the pointer before Thread 2's view of memory includes the writes involved in constructing obj? To misquote Alexandrescu(?): "Could a malicious CPU designer and compiler writer collude to build a standard-conforming system that makes that break?"

BCS
  • 1
    You might just want to read the chapter in the C++ standard on the memory model; 1.7 [intro.memory]. – bames53 Apr 20 '12 at 17:51
  • 2
    I'm not sure I understand the question based on the title. It's impossible to convey a reference/pointer to an object to another thread until the constructor is finished, unless it's done within the constructor itself. – Mark Ransom Apr 20 '12 at 17:51
  • 1
    I didn't get your question very well. Your code won't lead to inconsistency because you are using the object from one thread only (a different thread from it was created, but the address space is shared between threads, so there is no problem here). Thread safety problems arise when you can not guarantee the object will be used by only one thread at a time (what is usually the case with code bigger than 4 lines), thus the need to manually enforce the serialization with mutexes and stuff. – lvella Apr 20 '12 at 17:51
  • @lvella: the object is used in two threads, the first thread "uses" it while constructing it. – BCS Apr 20 '12 at 17:56
  • For those unsure of why BCS is asking this (who may not have fully read his first link), it has to do with [Vlad's answer and patros's comment](http://stackoverflow.com/a/10250366/1287251). – Cornstalks Apr 20 '12 at 17:58
  • Memory read/writes are not reordered across function boundaries. Since a constructor is kind of a function, once new returns the address of the instance all write operations have been executed. – Fozi Apr 20 '12 at 20:33
  • @Fozi: Does that also apply to memory as seen by other threads? -- Can the compiler inline a constructor (note I don't have the code call `new`) and not insert a memory barrier? – BCS Apr 20 '12 at 20:49
  • @BCS Don't confuse reordering by the compiler and CPU cache. The CPU caches are coherent between the CPUs (and therefore threads) so you don't have to worry about this. As for the reordering, I doubt a construction will be in-lined even in the most aggressive optimization settings. And even if it is, the compiler will have to make sure it doesn't reorder across function boundaries, because it's a requirement. – Fozi Apr 23 '12 at 14:28

5 Answers

17

Reasoning about thread safety can be difficult, and I am no expert on the C++11 memory model. Fortunately, however, your example is very simple. I have rewritten the example, because the constructor is irrelevant.

Simplified Example

Question: Is the following code correct? Or can the execution result in undefined behavior?

// Legal transfer of pointer to int without data race.
// The receive function blocks until send is called.
void send(int*);
int* receive();

// --- thread A ---
/* A1 */   int* pointer = receive();
/* A2 */   int answer = *pointer;

// --- thread B ---
           int answer;
/* B1 */   answer = 42;
/* B2 */   send(&answer);
           // wait forever

Answer: There may be a data race on the memory location of answer, and thus the execution results in undefined behavior. See below for details.


Implementation of Data Transfer

Of course, the answer depends on the possible and legal implementations of the functions send and receive. I use the following data-race-free implementation. Note that only a single atomic variable is used, and all memory operations use std::memory_order_relaxed. Basically, this means that these functions do not restrict memory reorderings.

#include <atomic>

std::atomic<int*> transfer{nullptr};

void send(int* pointer) {
    transfer.store(pointer, std::memory_order_relaxed);
}

int* receive() {
    while (transfer.load(std::memory_order_relaxed) == nullptr) { }
    return transfer.load(std::memory_order_relaxed);
}

Order of Memory Operations

On multicore systems, a thread can see memory changes in a different order than other threads see them. In addition, both compilers and CPUs may reorder memory operations within a single thread for efficiency, and they do this all the time. Atomic operations with std::memory_order_relaxed do not participate in any synchronization and do not impose any ordering on other memory operations.

In the above example, the compiler is allowed to reorder the operations of thread B, and execute B2 before B1, because the reordering has no effect on the thread itself.

// --- valid execution of operations in thread B ---
           int answer;
/* B2 */   send(&answer);
/* B1 */   answer = 42;
           // wait forever

Data Race

C++11 defines a data race as follows (N3290 C++11 Draft): "The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior." And the term happens before is defined earlier in the same document.

In the above example, B1 and A2 are conflicting and non-atomic operations, and neither happens before the other. This follows from the previous section: because B1 may be reordered after B2, thread A can receive the pointer and execute A2 while B1 is still pending, so both can happen at the same time.

That's the only thing that matters in C++11. In contrast, the Java Memory Model also tries to define the behavior if there are data races, and it took them almost a decade to come up with a reasonable specification. C++11 didn't make the same mistake.
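For contrast, here is a minimal sketch of how the same transfer could establish a happens-before relationship between B1 and A2. It assumes release/acquire ordering is acceptable for the use case; the names synced_transfer, send_release, receive_acquire and run_transfer_demo are hypothetical and not part of the example above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical counterpart of `transfer`, used with release/acquire ordering.
std::atomic<int*> synced_transfer{nullptr};

void send_release(int* pointer) {
    assert(pointer != nullptr);  // nullptr is reserved to mean "nothing sent yet"
    // Release store: writes sequenced before it become visible to a thread
    // whose acquire load reads the stored pointer.
    synced_transfer.store(pointer, std::memory_order_release);
}

int* receive_acquire() {
    int* pointer = nullptr;
    // Acquire load: once a non-null value is read, the release store
    // synchronizes with this load.
    while ((pointer = synced_transfer.load(std::memory_order_acquire)) == nullptr) { }
    return pointer;
}

// Runs thread B (writer) and thread A (reader) once; returns what A read.
int run_transfer_demo() {
    int answer = 0;     // outlives both threads because they are joined below
    int observed = -1;
    synced_transfer.store(nullptr);  // reset before any thread starts

    std::thread b([&answer] {
        answer = 42;            // B1
        send_release(&answer);  // B2: may not be reordered before B1
    });
    std::thread a([&observed] {
        int* pointer = receive_acquire();  // A1
        observed = *pointer;               // A2: B1 happens before this read
    });
    a.join();
    b.join();
    return observed;
}
```

With this ordering, B1 is sequenced before B2, B2 synchronizes with A1, and A1 is sequenced before A2, so B1 happens before A2 and the data race is gone.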


Further Information

I'm a bit surprised that these basics are not well known. The definitive source of information is the section Multi-threaded executions and data races in the C++11 standard. However, the specification is difficult to understand.

A good starting point are Hans Boehm's talks - e.g. available as online videos:

There are also a lot of other good resources, I have mentioned elsewhere, e.g.:

nosid
  • 1
    The memory stores in thread 2 can happen after the atomic write, after thread 1 sees the update of the atomic variable, and after thread one reads data from the referred object. – nosid Apr 20 '12 at 20:21
  • @nosid: A (data) race just means accessing a variable while another is writing to it. You can't get a data race with an atomic variable. What you've described is just the effects of no ordering. – GManNickG Apr 20 '12 at 21:18
  • @GManNickG: There is no race condition with the atomic variable. There is a race condition with the memory locations, that are not atomic, because the operations performed on these locations can be reordered - e.g. writes moved after the write of the atomic variable. – nosid Apr 20 '12 at 21:35
  • @nosid: What I'm saying is that's not the correct term. There's no "race" there, there's just no ordering. (There is no undefined behavior, just unspecified behavior. Data races cause undefined behavior.) – GManNickG Apr 20 '12 at 21:44
  • 1
    @GManNickG: From n3290 (C++11 draft): "The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior." *Happens before* is defined earlier. And in my last example, neither B1 happens before A2 nor A2 happens before B1. So, there is a *data race*, and this implies *undefined behavior*. – nosid Apr 20 '12 at 21:53
  • 1
    @nosid: Did you miss "*at least one of which is not atomic*"? – GManNickG Apr 21 '12 at 18:55
  • @GManNickG neither A2 nor B1 is atomic, and I think they are unordered w.r.t. each other. So that leaves "are they conflicting actions?". I could see an argument for "no", but as long as constructors don't play into the reorder rules, I think this answers my original question: you can (in theory) get dirty reads of an unfinished object. – BCS Apr 23 '12 at 14:49
  • 1
    @BCS: Firstly, nosid changed his answer quite a bit since our discussion first started, so what I'm saying only applies to the uses of `std::atomic`, which was all the original answer had. And operations on `std::atomic` are *always* atomic, so his quoted sentence simply never applies. – GManNickG Apr 23 '12 at 16:58
3

There is no parallel access to the same data, so there is no problem:

  • Thread 1 starts execution of Obj::Obj().
  • Thread 1 finishes execution of Obj::Obj().
  • Thread 1 passes reference to the memory occupied by obj to thread 2.
  • Thread 1 never does anything else with that memory (soon after, it falls into infinite loop).
  • Thread 2 picks-up the reference to memory occupied by obj.
  • Thread 2 presumably does something with it, undisturbed by thread 1 which is still infinitely looping.

The only potential problem is if Send didn't act as a memory barrier, but then it wouldn't really be a "legal transfer device".
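As an illustration of a transfer device that does act as a memory barrier, here is a minimal sketch built on a mutex and a condition variable. The class PointerChannel, the body of Obj and the function run_channel_demo are assumptions made for the example; the question does not specify any of them.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the Obj type from the question.
struct Obj {
    Obj() : value(7) {}
    int SomeFn() const { return value; }
    int value;
};

// Hypothetical one-slot transfer device. Unlocking the mutex in Send and
// locking it in Get orders the constructor's writes before the reader's accesses.
class PointerChannel {
public:
    void Send(Obj* pointer) {
        assert(pointer != nullptr);  // nullptr means "slot is empty"
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pointer_ = pointer;
        }
        ready_.notify_one();
    }

    // Blocks until Send has stored a pointer.
    Obj* Get() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return pointer_ != nullptr; });
        return pointer_;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    Obj* pointer_ = nullptr;
};

// Thread 2 waits on the channel while thread 1 constructs and sends obj.
int run_channel_demo() {
    PointerChannel channel;
    int result = -1;
    std::thread consumer([&] { result = channel.Get()->SomeFn(); });

    Obj obj;             // constructed while the consumer may already be waiting
    channel.Send(&obj);
    consumer.join();     // obj stays alive until the consumer is done
    return result;
}
```

Because the consumer only reads the pointer after acquiring the same mutex that Send released, it sees the fully constructed obj.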

Branko Dimitrijevic
  • The question is: can the memory writes from the first two steps in thread 1 get ordered (from the perspective of thread 2) after the actions needed for picking up the reference in thread 2? – BCS Apr 20 '12 at 18:01
  • 2
    @BCS `Send` cannot be ordered before `Obj::Obj()` finishes. Assuming `Send` is memory barrier, the thread 2 will pick up the "complete picture" of the effects of these two calls. – Branko Dimitrijevic Apr 20 '12 at 18:07
  • 1
    @BCS `Obj` is not a global object, so the answer is generally no. By the time that thread 2 gets the pointer to `Obj` in thread 1, `Obj` has been fully constructed in memory, specifically the stack of thread 1. If both thread 1 and thread 2 could see `Obj` at the same time because it was in some globally accessible memory location, or for some reason the stack of thread 1 was cached among multiple processors, then there could be a potential ordering issue, but I highly doubt that thread 1's stack would be sitting in cache. – Jason Apr 20 '12 at 18:07
  • @BrankoDimitrijevic: then the question becomes: Can `Send` (and `Get`) not include a memory barrier? Or not include a memory barrier that involves obj? – BCS Apr 20 '12 at 18:11
  • 1
    @BCS Generally no. But whenever you lock (which you probably already do in the `Send`'s implementation), this is also a memory barrier. In fact, you'd have to work very hard to devise a lock-free algorithm for `Send`, which could then potentially present a non-consistent view into `obj`'s memory to thread 2. In other words, it's very hard to make `Send` work without also making `obj` "work". – Branko Dimitrijevic Apr 20 '12 at 18:15
  • As I have shown in my answer, it is actually quite easy to transfer the pointer without a memory barrier. Your assumptions and conclusions are wrong. – nosid Apr 21 '12 at 23:20
  • @BrankoDimitrijevic: I am referring to "it's very hard to make `Send` work without also making `obj` work.". It isn't at all. – nosid Apr 22 '12 at 18:19
  • 1
    @nosid Ah that. First of all, this is not the main thrust of my answer at all. Second, your implementation may be a memory barrier after all (depending on `std::atomic::is_lock_free` and behavior of locks on your platform). Third, we could argue what it means "hard" - as you have shown, it is not hard to make a toy implementation with busy waiting and capacity of 1, but it is another matter entirely to make a mechanism that would be useful in realistic programs. And finally, even if it were easy, this is by no means obvious or expected to the client and would have to be carefully documented. – Branko Dimitrijevic Apr 23 '12 at 09:04
2

As others have alluded to, the only way in which a constructor is not thread-safe is if something somehow gets a pointer or reference to it before the constructor is finished, and the only way that would occur is if the constructor itself has code that registers the this pointer to some type of container which is shared across threads.

Now in your specific example, Branko Dimitrijevic gave a good complete explanation how your case is fine. But in the general case, I'd say to not use something until the constructor is finished, though I don't think there's anything "special" that doesn't happen until the constructor is finished. By the time it enters the (last) constructor in an inheritance chain, the object is pretty much fully "good to go" with all of its member variables being initialized, etc. So no worse than any other critical section work, but another thread would need to know about it first, and the only way that happens is if you're sharing this in the constructor itself somehow. So only do that as the "last thing" if you are.
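A minimal sketch of the "share this as the last thing" advice, assuming the shared container is protected by a mutex. Registry, Widget and run_registry_demo are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

class Widget;

// Hypothetical container shared across threads; every access takes the mutex.
class Registry {
public:
    void add(const Widget* widget) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(widget);
    }

    // Calls visit(widget) for each registered widget while holding the lock.
    template <class Visitor>
    void for_each(Visitor visit) const {
        std::lock_guard<std::mutex> lock(mutex_);
        for (const Widget* widget : items_) visit(*widget);
    }

private:
    mutable std::mutex mutex_;
    std::vector<const Widget*> items_;
};

class Widget {
public:
    Widget(int id, Registry& registry) : id_(id) {
        // Every member is initialized at this point; publish `this` last.
        registry.add(this);
    }
    int id() const { return id_; }

private:
    int id_;
};

// Constructs four widgets on four threads and sums the ids seen via the registry.
int run_registry_demo() {
    Registry registry;
    std::vector<std::unique_ptr<Widget>> widgets(4);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < widgets.size(); ++i) {
        threads.emplace_back([&widgets, &registry, i] {
            widgets[i].reset(new Widget(static_cast<int>(i) + 1, registry));
        });
    }
    for (auto& thread : threads) thread.join();
    for (const auto& widget : widgets) assert(widget != nullptr);

    int sum = 0;
    registry.for_each([&sum](const Widget& widget) { sum += widget.id(); });
    return sum;
}
```

Note that this only works for the most derived class: if Widget were a base class, a reader could still see the object before the derived constructor has run.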

Kevin Anderson
  • I'm trying hard to remember when I last signaled a 'this' instance to another thread inside a ctor... Now I'm trying to think of some circumstance where I might want to do it.... Now I'm trying to think of some way to do it that would avoid a synchro primitive enforcing a memory barrier.... Nope, I've got nothing! – Martin James Apr 20 '12 at 21:04
  • The only thing I can think of is some type of global collection of an object type that every instance of an object registers itself with upon construction (rather than whatever that creates it adding itself to). And then another thread is enumerating that while the first is constructing it, and thus "catches" it as part of the collection after it adds itself, but before the constructor is technically done. So possible I guess. – Kevin Anderson Apr 20 '12 at 21:40
  • Well, OK, +1, you are in the running for the 2012 'Most Unreasonable and Unlikely Design Ever Conjured Up in an Attempt to Demonstrate a Multithreaded Design Verging-On-A-Non-Issue' award. You could win the original source code to Windows ME and a Toyota Yaris Seriously though, how do such issues ever come up? Should I too be thinking of grossly-unlikely scenarios to further my career? – Martin James Apr 20 '12 at 22:40
  • How about finding a codebase using a #define statement to override a keyword? I saw that last week. Bad design happens. Often you are called in to deal with it. Eventually you can imagine how it even came about. I dunno if it's a blessing or a curse that I can imagine such. – Kevin Anderson Apr 21 '12 at 06:32
  • 1
    'using a #define statement to override a keyword' - what is his/her name, and where do they live? – Martin James Apr 21 '12 at 10:03
  • The compiler or hardware might reorder the memory operations. In this case the share-this operation might happen at any point during the constructor. The memory model does not prohibit this, since fields and this are different memory locations. – qznc Apr 22 '16 at 09:24
1

It is only safe (sort of) if you wrote both threads, and know the first thread is not accessing the object while the second thread is. For example, if the thread constructing it never accesses it after passing the reference/pointer, you would be OK. Otherwise it is thread-unsafe. You could change that by making all methods that access data members (read or write) acquire a lock, such as a mutex.
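A minimal sketch of that last suggestion, assuming a plain std::mutex is an acceptable lock. The Counter class and run_counter_demo are hypothetical and stand in for whatever object is actually shared.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared object: every method that touches value_ takes the lock.
class Counter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++value_;
    }

    int value() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_;  // mutable so that const readers can lock it
    int value_ = 0;
};

// Four threads increment the same counter 1000 times each.
int run_counter_demo() {
    Counter counter;
    assert(counter.value() == 0);  // starts at zero before any thread runs

    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t) {
        threads.emplace_back([&counter] {
            for (int i = 0; i < 1000; ++i) counter.increment();
        });
    }
    for (auto& thread : threads) thread.join();
    return counter.value();
}
```

Locking in every accessor makes individual calls safe; sequences of calls that must be atomic as a group still need a lock held by the caller.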

DRVic
  • In *Java* as long as the `this` is "not leaked" the constructor is thread safe (then again, it has its own memory model). In C++, is there any such memory barrier guarantee? Even if not, is it guaranteed that (without leaking a `this`, and assuming no use of shared state) the new object can be safely "created" from multiple threads? –  Apr 20 '12 at 17:50
  • I'm not sure I understand the question. If you're asking whether new is thread safe, yes, it has to be or all hell would break loose. If you're asking whether objects created on the stack are accessible to other threads before the constructor is finished, no, they are not. – DRVic Apr 20 '12 at 17:57
  • BTW, I'm specifically NOT asking about `new` thus using a stack allocated object. – BCS Apr 20 '12 at 18:02
  • actually this answer is not entirely true, only in a naive way. The constructor of the object might be finished, but if there is no memory barrier the second thread (running on a different core) might never see the actual effect of the constructor (its changes might still be in its core's local cache and never make it to the other thread). The real problem here is that the data race might not be obvious. While the constructor is done when the reference might be passed, the accessing thread might not have gotten to see its effect. On a single core, everything looks fine. – PeterSom Apr 29 '12 at 15:02
0

I only came across this question now, but I will still post my comments:

Static Local Variable

There is a reliable way to construct objects in a multi-threaded environment: use a static local variable (static local variable-CppCoreGuidelines).

From the above reference: "This is one of the most effective solutions to problems related to initialization order. In a multi-threaded environment the initialization of the static object does not introduce a race condition (unless you carelessly access a shared object from within its constructor)."

Also note from the reference, if the destruction of X involves an operation that needs to be synchronized you can create the object on the heap and synchronize when to call the destructor.

Below is an example I wrote to show the Construct On First Use Idiom, which is basically what the reference talks about.

#include <chrono>
#include <iostream>
#include <thread>
#include <utility>
#include <vector>

class ThreadConstruct
{
public:
    ThreadConstruct(int a, float b) : _a{a}, _b{b}
    {
        std::cout << "ThreadConstruct construct start" << std::endl;
        std::this_thread::sleep_for(std::chrono::seconds(2));
        std::cout << "ThreadConstruct construct end" << std::endl;
    }

    void get()
    {
        std::cout << _a << " " << _b << std::endl;
    }

private:
    int _a;
    float _b;
};


struct Factory
{
    template<class T, typename ...ARGS>
    static T& get(ARGS&&... args)
    {
        //thread safe object instantiation
        static T instance(std::forward<ARGS>(args)...);
        return instance;
    }
};

//thread pool
class Threads
{
public:
    Threads() 
    {
        for (size_t num_threads = 0; num_threads < 5; ++num_threads) {
            thread_pool.emplace_back(&Threads::run, this);
        }
    }

    void run()
    {
        //thread safe constructor call
        ThreadConstruct& thread_construct = Factory::get<ThreadConstruct>(5, 10.1);
        thread_construct.get();
    }

    ~Threads() 
    {
        for(auto& x : thread_pool) {
            if(x.joinable()) {
                x.join();
            }
        }
    }

private:
    std::vector<std::thread> thread_pool;
};


int main()
{
    Threads thread;

    return 0;
}

Output:

ThreadConstruct construct start
ThreadConstruct construct end
5 10.1
5 10.1
5 10.1
5 10.1
5 10.1
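The heap-allocated variant mentioned above can be sketched as follows. This is a simplified version that never destroys the object at all, which avoids destruction-order problems but is an assumption of the sketch rather than something the reference requires; Config, shared_config and run_static_demo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical type that should be constructed exactly once.
struct Config {
    explicit Config(std::string n) : name(std::move(n)) {}
    std::string name;
};

Config& shared_config() {
    // Since C++11 the initialization of a static local runs exactly once, even
    // if several threads reach this line concurrently. The object is never
    // deleted on purpose, so no destructor runs during program shutdown.
    static Config* instance = new Config("default");
    return *instance;
}

// Every thread should observe the same, fully constructed object.
bool run_static_demo() {
    std::vector<const Config*> seen(4, nullptr);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < seen.size(); ++i) {
        threads.emplace_back([&seen, i] { seen[i] = &shared_config(); });
    }
    for (auto& thread : threads) thread.join();

    assert(seen[0] != nullptr);
    for (const Config* config : seen) {
        if (config != seen[0]) return false;
    }
    return seen[0]->name == "default";
}
```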
AdvSphere