Sink arguments and move semantics for functions that can fail (strong exception safety)

Question

I have a function that operates on a big chunk of data passed in as a sink argument. My BigData type is already C++11-aware and comes with fully functional move constructor and move assignment implementations, so I can get away without having to copy the damn thing:

Result processBigData(BigData);

[...]

BigData b = retrieveData();
Result r = processBigData(std::move(b));

This all works perfectly fine. However, my processing function may fail occasionally at runtime resulting in an exception. This is not really a problem, since I can just fix stuff and retry:

BigData b = retrieveData();
Result r;
try {
    r = processBigData(std::move(b));
} catch(std::runtime_error&) {
    r = fixEnvironmnentAndTryAgain(b);
    // wait, something isn't right here...
}

Of course, this won't work.

Since I moved my data into the processing function, by the time I arrive in the exception handler, b will not be usable anymore.
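To make the failure mode concrete, here is a minimal sketch. It assumes a hypothetical stand-in `BigData` that simply wraps a `std::vector<int>`, a stand-in `processBigData` returning `int`, and a made-up failure condition; none of these are my real types.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for BigData (assumption: the real type owns a heap buffer).
struct BigData {
    std::vector<int> payload;
};

// By-value sink: the parameter is move-constructed from the caller's object
// before the function body starts executing.
int processBigData(BigData data) {
    if (data.payload.size() > 2) {  // made-up failure condition
        throw std::runtime_error("processing failed");
    }
    return static_cast<int>(data.payload.size());
}

// Returns how many elements the caller still owns after a failed call.
std::size_t elementsLeftAfterFailure() {
    BigData b{{1, 2, 3, 4}};
    assert(b.payload.size() == 4);
    try {
        processBigData(std::move(b));
    } catch (const std::runtime_error&) {
        // The payload already lives in the parameter, which was destroyed
        // during stack unwinding. Nothing is left to retry with.
    }
    return b.payload.size();
}
```

With this stand-in, `elementsLeftAfterFailure()` comes back as 0: the move into the parameter happened before the body had any chance to throw.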

This threatens to drastically reduce my enthusiasm for passing sink arguments by-value.

So here is the question: How to deal with a situation like this in modern C++ code? How to retrieve access to data that was previously moved into a function that failed to execute?

You may change the implementation and interfaces for both BigData and processBigData as you please. The final solution however should try to minimize drawbacks over the original code regarding efficiency and usability.

Important question, does Result contain the moved resources of b or is just based on it? — IdeaHat, Sep 04 '14 at 14:05
@MadScienceDreams The `Result` is just calculated from `b`, it does _not_ contain a reference to, or copy of the original `b`. — ComicSansMS, Sep 04 '14 at 14:06
@ComicSansMS But does it contain the moved (as opposed to copied) contents? — Potatoswatter, Sep 04 '14 at 14:09
Then there is no reason to pass it by rvalue reference. Whenever you call std::move, you are agreeing that the value is gone after the function call. While there may be tricks to get around this (it is not guaranteed to be gone, you have just agreed that it could be gone), the correct way to pass a value that you don't want to have side effects on (even in modern C++) is const reference. — IdeaHat, Sep 04 '14 at 14:10
@Potatoswatter If that helps you to solve the problem, feel free to assume that it does. In my code as it stands now however, `b` is discarded by the processing function as soon as it returns. — ComicSansMS, Sep 04 '14 at 14:11
@MadScienceDreams In my particular situation, passing by `const&` would require me to copy the whole data (I actually ran into this problem in an asynchronous processing function, so ownership of the data really needs to be moved to the function). The best solution I could think of to avoid a copy is to use a shared_ptr and go through the heap, but I'd like to avoid that if possible. — ComicSansMS, Sep 04 '14 at 14:18
@ComicSansMS So "result" doesn't consume BigData's resources, but a side-effect of the function does? I don't think you can get around using a shared-pointer for such ambiguous ownership...(Since BigData has resources that can be sped up by the move, I assume that it is already using heap resources). — IdeaHat, Sep 04 '14 at 14:24
@MadScienceDreams Pretty much. Although I don't see how the fact who eventually consumes the resource changes the outcome. Assuming that it would be consumed by the result, would that allow for a nicer solution? — ComicSansMS, Sep 04 '14 at 14:27
@ComicSansMS Nope, it just makes it clearer what is going on. One possible (but insane) solution would be to have an output variable that receives the (potentially) filled-in object. `Result processBigData(BigData&& in_ref, BigData* out_ref = nullptr) { BigData whereitmoves(std::move(in_ref)); try { /* old method */ } catch (...) { if (out_ref) { *out_ref = std::move(whereitmoves); } throw; } }` — IdeaHat, Sep 04 '14 at 16:56

score 3 · Answer 1 · answered Sep 04 '14 at 14:05

I'm similarly nonplussed by this issue.

As far as I can tell, the best current idiom is to divide the pass-by-value into a pair of pass-by-references.

template< typename t >
std::decay_t< t >
val( t && o ) // Given an object, return a new object "val"ue by move or copy
    { return std::forward< t >( o ); }

Result processBigData(BigData && in_rref) {
    // implementation
}

Result processBigData(BigData const & in_cref ) {
    return processBigData( val( in_cref ) );
}

Of course, bits and pieces of the argument might have been moved before the exception. The problem propagates out to whatever processBigData calls.
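A rough sketch of that caveat, assuming a hypothetical two-member `BigData`, an `int` result and a made-up failure condition (none of which come from the question):

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical two-part stand-in for BigData.
struct BigData {
    std::vector<int> header;
    std::vector<int> payload;
};

int processBigData(BigData&& in) {
    // The first piece is consumed right away...
    std::vector<int> header(std::move(in.header));
    // ...and only afterwards does something fail (made-up condition).
    if (in.payload.size() > 3) {
        throw std::runtime_error("payload too large");
    }
    std::vector<int> payload(std::move(in.payload));
    return static_cast<int>(header.size() + payload.size());
}

void demoPartialMove() {
    BigData b{{1, 2}, {1, 2, 3, 4}};
    bool threw = false;
    try {
        processBigData(std::move(b));
    } catch (const std::runtime_error&) {
        threw = true;
    }
    assert(threw);
    assert(b.header.empty());       // already moved out before the throw
    assert(b.payload.size() == 4);  // never touched
}
```

The caller gets its object back half-gutted, so the rvalue-reference overload only helps if nothing is moved out before the last operation that can throw.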

I've had an inspiration to develop an object that moves itself back to its source upon certain exceptions, but that's a solution to a particular problem on the horizon in one of my projects. It might end up too specialized, or it might not be feasible at all.

Potatoswatter

I'm still not sure why you want to pass an rvalue reference to the function if it doesn't actually consume BigData... I guess to allow it to consume it in the future? Also, what do you do in the case of a non-copyable type (like `std::unique_ptr`)? – IdeaHat Sep 04 '14 at 14:19
Yeah, adding an overload for rvalue refs would indeed help here. You could defer moving from `BigData` to a point at which you can guarantee that no exception will occur. Of course this suffers from the usual drawbacks of having to introduce rvalue-ref overloads on function interfaces (hence I'm not 100% convinced it satisfies my usability constraint), but it might actually work out fine in certain situations. +1 either way. – ComicSansMS Sep 04 '14 at 14:29
@MadScienceDreams 1. Yes, given the clarification comments under the question, I'm not sure why it's not a `const &` in the first place, but I just answered the question conceptually. 2. Non-copyable types wouldn't need the `const &` overload, that's all. – Potatoswatter Sep 04 '14 at 15:05

ComicSansMS · Accepted Answer · 2020-10-21T15:52:37.913

Apparently this issue was the subject of lively discussion at the recent CppCon 2014. Herb Sutter summarized the latest state of things in his closing talk, Back to the Basics! Essentials of Modern C++ Style (slides).

His conclusion is quite simply: Don't use pass-by-value for sink arguments.

The arguments for using this technique in the first place (as popularized by Eric Niebler's Meeting C++ 2013 keynote C++11 Library design (slides)) seem to be outweighed by the disadvantages. The initial motivation for passing sink arguments by-value was to get rid of the combinatorial explosion for function overloads that results from using const&/&&.

Unfortunately, it seems that this brings a number of unintended consequences. One of these is a potential efficiency drawback (mainly due to unnecessary buffer allocations). The other is the problem with exception safety from this question. Both are discussed in Herb's talk.

Herb's conclusion is to not use pass-by-value for sink arguments, but instead rely on separate const&/&& (with const& being the default and && reserved for those few cases where optimization is required).

This also matches with what @Potatoswatter's answer suggested. By passing the sink argument via && we might be able to defer the actual moving of the data from the argument to a point where we can give a noexcept guarantee.
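A minimal sketch of that deferral, under some assumptions: `BigData` is a hypothetical stand-in wrapping a `std::vector<int>`, the result is a plain `int`, and the validation step stands in for whatever part of the real processing can throw.

```cpp
#include <cassert>
#include <numeric>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for BigData.
struct BigData {
    std::vector<int> payload;
};

// Optimized overload: binding to && does not move anything by itself.
int processBigData(BigData&& in) {
    // Everything that can throw runs first and leaves the argument untouched.
    if (in.payload.empty() || in.payload.front() < 0) {
        throw std::invalid_argument("rejected input");
    }
    // Commit point: moving the vector does not throw, so from here on the
    // function really owns the data.
    BigData owned(std::move(in));
    return std::accumulate(owned.payload.begin(), owned.payload.end(), 0);
}

// Default overload: copies, then forwards the temporary to the && overload.
int processBigData(const BigData& in) {
    return processBigData(BigData(in));
}

void demoFailureLeavesCallerIntact() {
    BigData b{{-1, 2, 3}};
    bool threw = false;
    try {
        processBigData(std::move(b));
    } catch (const std::invalid_argument&) {
        threw = true;
    }
    assert(threw);
    assert(b.payload.size() == 3);  // nothing was moved, a retry is possible
}
```

This only holds as long as the function keeps its side of the bargain and touches nothing before the commit point. Note also that the caller is still relying on knowledge about the callee's internals here, not on anything the signature promises.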

I kind of liked the idea of passing sink arguments by-value, but it seems that it does not hold up as well in practice as everyone hoped.

Update after thinking about this for 5 years:

I am now convinced that my motivating example is a misuse of move semantics. After the invocation of processBigData(std::move(b));, I should never be allowed to assume what the state of b is, even if the function exits with an exception. Doing so leads to code that is hard to follow and to maintain.

Instead, if the contents of b should be recoverable in the error case, this needs to be made explicit in the code. For example:

class BigDataException : public std::runtime_error {
private:
    BigData b;
public:
    BigData retrieveDataAfterError() &&;

    // [...]
};


BigData b = retrieveData();
Result r;
try {
    r = processBigData(std::move(b));
} catch(BigDataException& e) {
    b = std::move(e).retrieveDataAfterError();
    r = fixEnvironmnentAndTryAgain(std::move(b));
}

If I want to recover the contents of b, I need to explicitly pass them out along the error path (in this case wrapped inside the BigDataException). This approach requires a bit of additional boilerplate, but it is more idiomatic in that it does not require making assumptions about the state of a moved-from object.
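For completeness, here is one way the elided parts might be filled in. This is only a sketch: `BigData` is again a hypothetical stand-in wrapping a `std::vector<int>`, the result is a plain `int`, and the `environmentOk` flag and `processWithRetry` helper are invented to simulate the failure and the retry.

```cpp
#include <cassert>
#include <numeric>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for BigData.
struct BigData {
    std::vector<int> payload;
};

class BigDataException : public std::runtime_error {
private:
    BigData b;
public:
    BigDataException(const std::string& what, BigData data)
        : std::runtime_error(what), b(std::move(data)) {}

    // Rvalue-qualified: callers must write std::move(e) to take the data out.
    BigData retrieveDataAfterError() && { return std::move(b); }
};

// environmentOk is an invented flag that simulates the runtime failure.
int processBigData(BigData data, bool environmentOk) {
    if (!environmentOk) {
        // Hand the data back along the error path instead of dropping it.
        throw BigDataException("environment not ready", std::move(data));
    }
    return std::accumulate(data.payload.begin(), data.payload.end(), 0);
}

int processWithRetry(BigData b) {
    try {
        return processBigData(std::move(b), false);
    } catch (BigDataException& e) {  // non-const, so the data can be moved out
        b = std::move(e).retrieveDataAfterError();
        return processBigData(std::move(b), true);
    }
}
```

One caveat with this sketch: the exception type now carries a potentially large member, so copying the exception object would copy the data as well. Catching by reference, as above, avoids that.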
