31

Let's say I have

std::tuple<T0, T1, T2> my_tuple{x0, x1, x2};

where T0, T1 and T2 are value types (i.e. no aliasing is possible).

Is it safe to access my_tuple's elements and mutate them concurrently from multiple threads using std::get, as long as every thread accesses a different element?

Example:

template <typename T>
void process(T& x) { /* mutate `x` */ }

// ...

std::thread{[&]{ process(std::get<0>(my_tuple)); }}.detach();
std::thread{[&]{ process(std::get<1>(my_tuple)); }}.detach();
std::thread{[&]{ process(std::get<2>(my_tuple)); }}.detach();

Instinctively I would say it is safe, as my_tuple can be thought of as struct { T0 x0; T1 x1; T2 x2; };... but is it guaranteed by the standard?
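
For reference, here is a minimal self-contained sketch of the pattern being asked about. The concrete element types, the `process` overloads, and the `run_example` name are assumptions made for illustration. `join()` is used instead of `detach()` so that the tuple outlives the threads. Whether the standard formally guarantees that this is race-free is exactly what the question asks; the sketch only shows the usage.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <tuple>

// Hypothetical mutators standing in for the question's `process`.
void process(int& x) { x += 1; }
void process(double& x) { x *= 2.0; }
void process(std::string& x) { x += "!"; }

// Hypothetical driver: each thread mutates a different tuple element.
std::tuple<int, double, std::string> run_example() {
    std::tuple<int, double, std::string> my_tuple{1, 2.0, "hi"};

    std::thread t0{[&] { process(std::get<0>(my_tuple)); }};
    std::thread t1{[&] { process(std::get<1>(my_tuple)); }};
    std::thread t2{[&] { process(std::get<2>(my_tuple)); }};

    // Joining keeps `my_tuple` alive until every thread has finished.
    t0.join();
    t1.join();
    t2.join();

    return my_tuple;
}
```

Calling `run_example()` from `main` and inspecting the returned tuple is enough to try it out.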

Vittorio Romeo
  • A quick scan finds no mention of "synchronization" in "20.4 Tuples". Doesn't appear to be explicitly specified. `std::get` is described merely as "Returns: A reference to the Ith element of t, where indexing is zero-based". That's it. I would agree that this is sufficient to specify thread safety, in those terms. – Sam Varshavchik Nov 28 '16 at 13:46
  • 3
    `std::get` is `constexpr`, so I'd say it can be considered equivalent to direct access, which would be thread-safe as long as the threads access different elements. – StoryTeller - Unslander Monica Nov 28 '16 at 13:49
  • 3
    @SamVarshavchik There is a general rule that accessing data in two different threads without synchronization is undefined behavior; what prevents `std::get` from counting as access to the entire `std::tuple` object? I'm not saying it is, I'm just not convinced the standard provides the guarantee. – Yakk - Adam Nevraumont Nov 28 '16 at 14:04
  • 2
    @Yakk reading from multiple threads is safe (afaik); it's UB if at least one of the threads writes. `get` on `my_tuple` doesn't modify `my_tuple`, and an argument could be made that eventual modifications would be made to separate subobjects of it. – krzaq Nov 28 '16 at 14:15
  • 4
    @krzaq Is `std::get` "reading", as it takes its argument as a non-`const` reference? I mean, it *should* be reading, but does the standard agree? We might have to fall back on "the std library doesn't do unnecessary stuff". But for an example, imagine we have a tuple with two empty classes, and the tuple is compressed. Now we have two objects at an identical location being used in a non-`const` manner. Is the standard ok with that? I am not certain. I think it *should* be, I am uncertain if it *is*. – Yakk - Adam Nevraumont Nov 28 '16 at 14:21
  • It is the same as reading different member variables of one struct from different threads. `std::tuple<>` is just templated version of simple struct. – PiotrNycz Nov 28 '16 at 14:21
  • 1
    I can't believe that the only way to prevent from undefined behavior here is to reimplement `tuple` and `get`... :) – W.F. Nov 28 '16 at 14:24
  • 3
    @W.F. I *suspect* the standard is sane here, but I don't know. I'm just trying to point out that there is a real question here. For containers, the standard is explicit in what requires synchronization (const is mutually thread safe), and even lists explicit non-const methods that are "treated as const" as far as synchronization is concerned. Tuple does not; that might mean that the other parts of the standard are sufficient to guarantee sane behavior, or it may indicate a standard defect. *Regardless*, it is going to be safe to use *in practice*; no compiler is stupid enough to break this. – Yakk - Adam Nevraumont Nov 28 '16 at 14:30
  • 1
    @Yakk I'm not certain either, that's why I'm commenting instead of answering ;) I see your point about "reading" (what if you added const to get's my_tuple and then cast the const away in its result?). But as far as compressed tuple goes, I think aliasing rules would come to the rescue... Unless you want to consider the user casting the object to char* and using it as storage of size 1. I guess I'm just further confusing myself. – krzaq Nov 28 '16 at 14:31
  • 2
    I don't know of anything resembling [\[container.requirements.dataraces\]](https://timsong-cpp.github.io/cppwp/container.requirements.dataraces) for `tuple`, so I'm going to guess that this is formally racy, which sounds like LWG issue material... – T.C. Nov 30 '16 at 08:18

2 Answers

13

Since std::get has no explicit statements in the specification about its data race properties, we fall back to the default behavior defined in [res.on.data.races]. Specifically, paragraphs 2 and 3 tell the story:

A C++ standard library function shall not directly or indirectly access objects (1.10) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function’s arguments, including this.

A C++ standard library function shall not directly or indirectly modify objects (1.10) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function’s non-const arguments, including this.

These paragraphs guard against data races only on objects that are not reached through a function's arguments. A template argument is not technically a function argument, so the index given to get does not count here.

Your case involves multiple threads passing the same object to different get calls. Since you are passing a non-const parameter, get will be assumed to be modifying its tuple argument. Therefore, calling get on the same object counts as modifying the object from multiple threads. And therefore, calling it can legally provoke a data race on the tuple.

This is so even though, technically speaking, get just extracts a subobject from the tuple and therefore should not disturb the object itself or its other subobjects. The standard does not know this.

However, if the parameter were const, then get would not be considered to provoke a data race with other const calls to get. These would simply be viewing the same object from multiple threads, which is allowed in the standard library. It would provoke a data race with non-const uses of get or with other non-const uses of the tuple object. But not with const uses of it.
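
As a rough sketch of the const case, the function below reads two elements through a `const` reference from two threads, so only the const overload of `std::get` is ever called on the shared tuple. The name `sum_concurrently` and the `std::tuple<int, int>` element types are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <thread>
#include <tuple>

// Hypothetical helper: reads both elements concurrently via const get.
int sum_concurrently(const std::tuple<int, int>& t) {
    int a = 0;
    int b = 0;

    // Both threads only view `t`; each writes to its own local variable.
    std::thread r0{[&] { a = std::get<0>(t); }};
    std::thread r1{[&] { b = std::get<1>(t); }};
    r0.join();
    r1.join();

    return a + b;
}
```

Nothing else may modify the tuple while the reader threads run; the sketch relies on that.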

So you can "access" them, but not "modify" them.
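
If mutation is needed, one way to sidestep the concern, sketched here under the reading given above, is to make every non-const `get` call on a single thread before any worker starts and then hand each worker only its own element reference. Completion of the `std::thread` constructor synchronizes with the start of the thread function, so the `get` calls happen before any worker runs. The `Tuple` alias, the helper name `mutate_elements`, and the mutations themselves are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <tuple>

using Tuple = std::tuple<int, double, std::string>;

// Hypothetical helper: mutates each element of `t` on its own thread.
void mutate_elements(Tuple& t) {
    // All std::get calls are made here, on one thread, before any worker exists.
    int& a = std::get<0>(t);
    double& b = std::get<1>(t);
    std::string& c = std::get<2>(t);

    // Each worker touches only its own distinct element, never the tuple itself.
    std::thread w0{[&a] { a += 1; }};
    std::thread w1{[&b] { b *= 2.0; }};
    std::thread w2{[&c] { c += "!"; }};
    w0.join();
    w1.join();
    w2.join();
}
```

With this arrangement the workers never call a standard library function on the tuple, so only the ordinary language rules about distinct objects apply to them.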

Nicol Bolas
  • I have serious doubts on this answer. `const` doesn't have anything to do with data races. On the C++11 memory model, a data race is defined as two conflicting accesses on the same object or different bitfields of a same object, where neither of the two accesses `happen before` the other. The "conflicting accesses" are two writes or one write and any read. Not a function call. The different members of the tuples are different objects (two subobjects of the tuple). Moreover, your own quote from the standard confutes your answer: the tuple members are accessed "indirectly". – gigabytes Dec 01 '16 at 16:46
  • @gigabytes: But the second paragraph I quoted *explicitly forbids* standard library functions from modifying an object or subobject through a `const` parameter. Therefore, if all of the `get`s of the same object access a `const` tuple, then none of them are modifying it. Since there are no modifying accesses, there can be no "conflicting accesses". And therefore, there are no data races. – Nicol Bolas Dec 01 '16 at 16:53
  • No, it doesn't. You quoted a paragraph from a section that is stating which guarantees you get from a "standard library function". `std::get` is such a function. That paragraph says that you are guaranteed that such a function won't access any object that is accessible by other threads, *unless*, of course, those objects that come to the function via its arguments or `this`. – gigabytes Dec 01 '16 at 16:58
  • So: 1) it is not a requirement on *your* code, but a guarantee that you get from standard library functions. 2) it mentions `const` only because the first paragraph is about reads ("access") and the second about writes ("modify"), so it has to clarify that only non-`const` arguments can be modified anyway. – gigabytes Dec 01 '16 at 16:59
  • @gigabytes: And that's my point. If you have only non-modifying accesses to the tuple object, then you do not have a data race. Note that the OP asked only about races imposed by `get` on its parameter, not imposed by what you do with the return value of `get`. – Nicol Bolas Dec 01 '16 at 17:00
  • Why only non-modifying accesses? The first paragraph says "shall not" as well. – gigabytes Dec 01 '16 at 17:02
  • Your mistake is treating the tuple as an atomic entity. The objects "accessed" here are the two subobjects. They are different subobjects and the two writes do not interfere with each other. The relevant sections in the standard are other ones. I'll post them tomorrow or in the w.e. – gigabytes Dec 01 '16 at 17:03
  • @gigabytes: I'm not sure I understand your question. My answer is basically, if you thread your `get` calls on a non-`const` tuple object, then a data race can happen. But if all your `get` calls are on `const` tuple objects, then no data race happens, assuming you're not doing something else to the tuple at the same time. – Nicol Bolas Dec 01 '16 at 17:03
  • Yes, a data race *can* happen if you modify the *same* subobject, *and* at least one of these accesses is a write. But the question was if it is safe to access the tuple "as long as every thread accesses a different element". – gigabytes Dec 01 '16 at 17:05
  • 1
    @gigabytes: In order to get a pointer/reference to a subobject of a `tuple`, you must call a standard library function on that tuple object. Therefore, the data race behavior of *getting that subobject* is defined by the standard library's behavior of function calls, as quoted above. If you call the non-`const` `get`, then this call is considered to be *no different* from any other modification of the *entire tuple object*. There is no standard wording to have it treat this call, in terms of data race behavior, as any different from passing a non-`const` tuple to any function. – Nicol Bolas Dec 01 '16 at 17:07
  • 3
    To put it another way, `obj.a` and `obj.b` never provoke a data race. But this is because the language explicitly says so. `func<0>(obj)` and `func<1>(obj)` can provoke a data race, even if `func<0>` just so happens to return `obj.a` and `func<1>` just so happens to return `obj.b`. Now obviously, no implementation would actually fail in this case. But we're talking about what the *standard says*, not how implementations implement it. – Nicol Bolas Dec 01 '16 at 17:11
  • 1
    Still no :) "the data race behavior of getting that subobject is defined by the standard library's behavior of function calls, as quoted above" this is wrong. Again, the quoted parts only say that *you* are guaranteed that a standard library function basically won't access anything other than its arguments. It doesn't define anything as a data race. The definition of data race is in another part of the standard, and you have to look at it (again, I cannot look at it now, I'll post a full answer, I promise). – gigabytes Dec 01 '16 at 17:16
  • 1
    "func<0>(obj) and func<1>(obj) can provoke a data race, even if func<0> just so happens to return obj.a and func<1> just so happens to return obj.b" this also is wrong. These two functions would only make two "read" accesses to the object and two concurrent reads, even if unsequenced, do not cause a data race. (and still, you forgot that 'a' and 'b' are different objects anyway, it doesn't matter they are members of something). And yes, of course, I'm talking the standard, not about implementations. – gigabytes Dec 01 '16 at 17:19
  • @gigabytes: "*These two functions would only make two "read" accesses to the object and two concurrent reads, even if unsequenced, do not cause a data race.*" But the standard does not *guarantee that* they perform "read accesses". Even with the defined behavior of `get`, that does not prevent a valid implementation of `get` from modifying the tuple object it is given. So unless `get` has an explicit statement in the standard exempting it from data race behavior, it *must* be assumed to provoke a race when acting on the same object from multiple threads. – Nicol Bolas Dec 01 '16 at 17:26
  • That doesn't make sense to me. If we follow this reasoning, we cannot do anything with any object. Do you want to read elements in a `std::vector`? No, you can't, why? because the standard does not specify that `operator[]` doesn't write. There must be a loophole, don't you agree? In any case, the two paragraphs you quoted are *not* the answer. – gigabytes Dec 01 '16 at 17:46
  • @gigabytes: "*There must be a loophole, don't you agree?*" No. `vector::operator[]` can provoke a data race, so you cannot call it on the same non-`const` `vector` instance from multiple threads. That's very different from saying that you "cannot do anything with any object". – Nicol Bolas Dec 01 '16 at 18:06
  • I still don't agree but I have to check the actual standard. At least you have to admit that the paragraphs you quoted don't imply your last statement. I think we'd better move everything to the chat and look at the text. – gigabytes Dec 01 '16 at 18:15
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/129565/discussion-between-gigabytes-and-nicol-bolas). – gigabytes Dec 01 '16 at 18:16
  • 3
    @gigabytes Containers like `std::vector` have [special wording w/r/t data races](https://timsong-cpp.github.io/cppwp/container.requirements.dataraces). – T.C. Dec 02 '16 at 03:17
  • @T.C. Ah! That's the caveat I was looking for! – gigabytes Dec 02 '16 at 07:04
  • 2
    @gigabytes tuple is not a container – Eric Lemanissier Dec 02 '16 at 10:03
  • @EricLemanissier yes of course. I mean that's what solves the problem with vector. – gigabytes Dec 02 '16 at 10:07
  • I largely agree with @gigabytes. What was quoted in this answer is only how standard library functions/types behave (and a good guideline for user library design). But it has nothing to do with user types. Even apparently *read only* access by `get` is not safe for arbitrary types. There's nothing preventing a copy constructor from modifying something. It might not modify the object itself but something outside. Of course, it is very bad style and I cannot think of any sane person writing something like that without a very good reason. `get` does not ensure thread-safety without knowing the types. – Yan Zhou Dec 06 '16 at 01:53
  • 1
    @YanZhou: "*What was quoted in this answer are only how standard library function/types behave (and a good guideline for user library design). But they have nothing to do with user types.*" That's true. But I'm pretty sure the OP was talking specifically about the `std::get` from the standard library, which has nothing to do with user types. Calling `get` on a tuple will not invoke user-provided code. Not unless the user has created a specialization of `get`. – Nicol Bolas Dec 06 '16 at 03:05
  • @NicolBolas Calling `get` may or may not trigger a call to a copy/move constructor, depending on what the user function `process` in the OP's question does. – Yan Zhou Dec 06 '16 at 10:01
  • @YanZhou: Taking the return value of `get` and shoving it into `process` might invoke a copy/move operation, but that is *after* `get` has done its job. Thus, it is irrelevant to the question of `get`'s thread safety. The question, after all, is *not* "what is the thread safety of using the return value of `get`?" – Nicol Bolas Dec 06 '16 at 15:14
  • @NicolBolas Well, I think it is up to the OP to clarify what he is asking exactly. If we narrow it to merely "what was done by `get`", then of course there's no race. All that `get` does is to get a reference. But that seems to be pointless question to me. – Yan Zhou Dec 06 '16 at 16:51
  • @YanZhou: I want to mutate the value returned by `std::get` *(i.e. mutate it in-place in the tuple, not make a copy)*. – Vittorio Romeo Dec 07 '16 at 16:34
  • @VittorioRomeo: If the two objects are truly separate and distinct, such that the modifications on these objects do not impose a data race, then the only potential race can come from `get`ing them from the tuple. – Nicol Bolas Dec 07 '16 at 17:31
-2

The short answer is that it depends on the types and on what process does, rather than on get. By itself, get merely retrieves the address of the object and returns it as a reference. Retrieving the address is mostly just reading the contents of integers. It does not raise race conditions. Roughly speaking, the code snippet in your question is thread-safe if and only if the following is thread-safe:

T1 t1;
T2 t2;
T3 t3;

std::thread{[&]{process(t1);}}.detach();
std::thread{[&]{process(t2);}}.detach();
std::thread{[&]{process(t3);}}.detach();
Yan Zhou
  • Can you prove it? The whole point is I don't think there's any guarantee in the standard that mutating different references returned by `std::tuple::get` in different threads is thread-safe. It most likely is for all major implementations of `std::tuple`, but is it formally guaranteed by the standard? – Vittorio Romeo Dec 07 '16 at 16:49
  • @VittorioRomeo The point is, it has nothing to do with `get` or `tuple` for that matter. An object's address does not change once it is created. One might mistakenly write into that address, but one cannot change that address itself. And thus by itself `get` is safe. – Yan Zhou Dec 07 '16 at 16:57
  • `get` on the same tuple is not thread safe. There is nothing in the standard which requires implementations to implement `get` on the same tuple for different objects in a thread-safe way. – Nicol Bolas Dec 07 '16 at 17:32