13

The following block of code:

  1. Is technically invalid, since std::get<>() is not thread safe. Reference: Is using `std::get<I>` on a `std::tuple` guaranteed to be thread-safe for different values of `I`?

  2. As far as I can tell, is effectively safe on all implentations of std::tuple<> in the wild right now, and the foreseeable future.

    #include <tuple>
    #include <atomic>
    #include <thread>

    // Out of my control
    using data_t = std::tuple<int, int, int, int>;
    void foo(data_t); 
    //

    int main() {
        data_t landing;
        std::atomic<int> completed = 0;

        // Whichever thread pings last will be the one performing foo()
        auto ping = [&](){
            if(++completed == 4) {
                foo(landing);
            }
        };

        std::thread a([&](){ std::get<0>(landing) = 1; ping(); });
        std::thread b([&](){ std::get<1>(landing) = 2; ping(); });
        std::thread c([&](){ std::get<2>(landing) = 3; ping(); });
        std::thread d([&](){ std::get<3>(landing) = 4; ping(); });

        a.join();
        b.join();
        c.join();
        d.join();

        return 0;
    }

To make matters even more fun, the actual code in question is chock-full of variadic templates, so writing a one-shot landing pad struct to handle that one scenario is not going to cut it. It has to be a general solution.

My current options are:

  • Effectively re-implement std::tuple<> with a reworded std::get<> documentation, which is a waste of time and a waste of code.
  • Push a proposal for std::get<>(std::tuple) to provide guarantees similar to std::vector<>, and document the fact that the code is only valid as of a yet unreleased verion of the standard.
  • ignore the issue, and rely on the fact that realistically, this will almost certainly always work.

None of those are particularly great in the short run... So my questions are:

  • Did I miss something that invalidates point #2?
  • Is there a better workaround that would allow the implementation to be technically valid while not having to support an excessive amount of extra code.
  • Any other opinions on the subject are welcome.
  • Why all are terrible? In my view, point 2 is a way to go - push the proposal, once accepted, use. On a side note, I do not see what this question adds to already mentioned one. – SergeyA Jun 07 '19 at 16:09
  • 1
    @SergeyA because avoiding UB is generally a good idea? –  Jun 07 '19 at 16:10
  • 3
    Perfect opportunity to write your first proposal to the standards committee! – Jesper Juhl Jun 07 '19 at 16:23
  • @Frank the same notion expressed in an answer just earned 5 upvotes, so you might reconsider it being terrible idea. – SergeyA Jun 07 '19 at 17:27
  • @SergeyA Yes, I misworded that, and corrected already. I meant that it's less than great in the short run because it leaves the code base with UB for a possibly extended period of time. Thanks for calling me out on that. –  Jun 07 '19 at 17:29
  • @Frank I missed that you have edited the question. But even than, as long as you are aware of the UB, and you have reasons to assume your platform behaves correctly even though it is UB, I believe it's ok to proceed. – SergeyA Jun 07 '19 at 17:32
  • 1
    _"and document the fact that the code is only valid as of a yet unreleased verion of the standard"_ — You could also think of raising a [**defect**](https://cplusplus.github.io/LWG/lwg-active.html) against the Standard Library. Some defects have resolutions which are retroactively applied (e.g. IIRC LWG2101). – peppe Jun 08 '19 at 10:14

2 Answers2

10

Push a proposal for std::get<>(std::tuple) to provide guarantees similar to std::vector<>, and document the fact that the code is only valid as of a yet unreleased version of the the standard.

I think this is the way to go, as it provides value for the entire C++ community and should not be a burden on implementers. It is also an excellent opportunity to write your first proposal.

I suggest doing that, and for now assuming that this will work, even though it is UB. If your software is super-critical (e.g. flight control system) and you want to be 100% sure that you are not relying on something that could break in the future... then implement your own tuple.

Vittorio Romeo
  • 90,666
  • 33
  • 258
  • 416
  • 6
    Note that this should be extended to other uses of `get` defined by the standard library. `array`, and `pair` should have the same protections. Indeed, it might be good to say that, if a type is decomposable (usable in structured bindings), then the user code invoked must not provoke a data race when accessing distinct indices. – Nicol Bolas Jun 07 '19 at 16:31
  • @NicolBolas I'm not so sure about that last point. I could see scenarios where you would want different indices to alias the same data. A lazy-evaluated vector swizzle operation for example. –  Jun 07 '19 at 17:25
  • @Frank that doesn't really matter. The get operation itself should be thread safe, as well as accessing the **elements** of the array created from tuple. The elements themselves might not be necessarily thread-safe - like pointers could point to the same object, and thus writing through the pointer is not thread safe. – SergeyA Jun 07 '19 at 17:34
  • @SergeyA I'm saying that I can conceive of a type `T` where `float& std::get<0>(T&)` and `float& std::get<1>(T&)` return two references to the same float object. If you think that this should be fine, then the "when accessing distinct indices" part of NicolBolas' suggestion might as well be dropped. –  Jun 07 '19 at 17:38
  • @Frank: "*A lazy-evaluated vector swizzle operation for example.*" GLSL's swizzle mask explicitly forbids the use of an lvalue swizzle that accesses the same component twice as an lvalue (`vec.xyyz = ...` is a compile error). So a similar swizzle operation would return a `T`, not a `T&`. That would make such things read accesses through a `const&` parameter, and therefore not provoke a data race. – Nicol Bolas Jun 07 '19 at 17:42
0
template<class...Ts>
std::tuple< std::remove_reference_t<T>* >
to_pointers( std::tuple<Ts...>& );

template<class T0, class...Ts>
std::array<T0, 1+sizeof...(Ts)>
to_array( std::tuple<T0, Ts...> const& );

write these.

int main() {
    data_t landing;
    std::atomic<int> completed = 0;

    // Whichever thread pings last will be the one performing foo()
    auto ping = [&](){
        if(++completed == 4) {
            foo(landing);
        }
    };

    auto arr = to_array(to_pointers(landing));

    std::thread a([&](){ *arr[0] = 1; ping(); });
    std::thread b([&](){ *arr[1] = 2; ping(); });
    std::thread c([&](){ *arr[2] = 3; ping(); });
    std::thread d([&](){ *arr[3] = 4; ping(); });

    a.join();
    b.join();
    c.join();
    d.join();

    return 0;
}

we access the tuple elements via pointers to them instead of via std::get. The problem is in the specification of std::get; once you have the independent pointers to the independent objects guaranteed to exist within the tuple, you are race condition free.

So we convert the tuple to an array of pointers in one thread (which is basically free), then use it in the threads safely.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
  • But now you have introduced indirection, and you are at the mercy of the compiler to optimize it away. – SergeyA Jun 07 '19 at 17:29
  • It's also a valid solution for this MCVE, but if the async operation's scope lives beyond the original calling function, those pointers will have to be heap-allocated alonside the landing. –  Jun 07 '19 at 17:32
  • @Frank Sure, or you could have the threads capture their pointers by value. `[ptr = arr[0]]{ *ptr = 1; ping(); }` . The point of this is twofold; (1) a tuple of identical types is a `std::array` really, and (2) storing references or pointers into the tuple is legal even if `std::get` isn't race-free. – Yakk - Adam Nevraumont Jun 07 '19 at 17:34