How can I disable c++ return value optimization for one type only?

Question

I have come across the situation where I really do need to execute non-trivial code in a copy-constructor/assignment-operator. The correctness of the algorithm depends on it.

While I could disable return value optimisation with a compiler switch, it seems a waste because it's only the one type I need it disabled for, so why should the performance of the whole application suffer? (Not to mention that my company would not allow me to add the switch, anyway).

struct A {
    explicit A(double val) : m_val(val) {}

    A(const A& other) : m_val(other.m_val) {
        // Do something really important here
    }
    A& operator=(const A& other) {
        if (&other != this) {
            m_val = other.m_val;
            // Do something really important here 
        }
        return *this;
    }
    double m_val;
};

A operator+(const A& a1, const A& a2) {
    A retVal(a1.m_val + a2.m_val);
    // Do something else important
    return retVal;
}
// Implement other operators like *,+,-,/ etc.

This class would be used as such:

A a1(3), a2(4), a3(5);
A a4 = (a1 + a2) * a3 / a1;

Return value optimisation means that a4 will not be created with the copy constructor, and the "really important thing to do" does not happen!
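
To see the effect in isolation, here is a minimal sketch that counts copy-constructor calls. The type `Counted`, its static `copies` counter, and the function `count_copies()` are hypothetical names introduced only for this illustration. The result depends on the compiler, the language standard, and flags such as GCC's `-fno-elide-constructors`, so treat the number as something to observe rather than rely on.

```cpp
#include <cassert>

// Hypothetical stand-in for A that records how often it is copied.
struct Counted {
    explicit Counted(double val) : m_val(val) {}
    Counted(const Counted& other) : m_val(other.m_val) { ++copies; }
    double m_val;
    static int copies;
};
int Counted::copies = 0;

Counted operator+(const Counted& a1, const Counted& a2) {
    Counted retVal(a1.m_val + a2.m_val);
    return retVal;  // named local: the compiler may elide this copy (NRVO)
}

// Returns the number of copy-constructor calls needed to build a3.
int count_copies() {
    Counted::copies = 0;
    Counted a1(3), a2(4);
    Counted a3 = a1 + a2;  // the copy from the returned temporary may be elided too
    assert(a3.m_val == 7.0);
    return Counted::copies;
}
```

With elision enabled, mainstream compilers typically report 0 here. With elision disabled under pre-C++17 rules the count can reach 2, one copy per elidable step.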

I know I could hack in a solution where operator+ returns a different type (B, say) and have an A constructor that takes a B as input. But then the number of operators needed to be implemented explodes:

B operator+(const A& a1, const A& a2);
B operator+(const B& a1, const A& a2);
B operator+(const A& a1, const B& a2);
B operator+(const B& a1, const B& a2);

There must be a better solution. How can I hack it so that RVO does not happen for my type? I can only change the A class code and the operators. I can't change the calling site code; i.e. I can't do this:

A a1(3), a2(4), a3(5);
A a4;
a4 = (a1 + a2) * a3 / a1;

One thing I've considered is experimenting with C++11 move constructors, but I'm not sure this would work, and I don't like that it would not be valid C++03.

Any ideas?

EDIT: Please just accept that this is the only way I can do what I need to do. I cannot just 'change the design'. The calling code is fixed, and I must implement my strategy inside the mathematical operators and copy constructor & assignment operator. The idea is that the intermediate values calculated inside the "a4 = (a1+a2)*a3/a1" equation cannot be referenced anywhere else in the program - but a4 can. I know this is vague but you'll just have to live with it.

Can you explain why you need it? I think we'd do a better job of convincing you that you don't. — Joseph Mansfield, Apr 26 '13 at 10:41
The best idea seems to be to change the algorithm so that it does not depend on how many times objects are copied. — Tadeusz Kopec for Ukraine, Apr 26 '13 at 10:42
Move constructors wouldn't come into it. If a copy is elided, then there is no scope for moving. I would say you're better off redesigning the code. — juanchopanza, Apr 26 '13 at 10:43
What about making `A` to have a static data-member and use it as a switch in copy-constructor return statement? `return ( m_staticCheck ? *this : A() );`... Ensuring `m_staticCheck` is always `1`, of course. I think, this will invalidate RVO... Probably, making it `volatile` too. — lapk, Apr 26 '13 at 10:43
You should delete the copy constructor, and create a custom method to do the really important thing (e.g. `a = b.copy();` instead of `a = b`.) — kennytm, Apr 26 '13 at 10:48
This seems like [an XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem), where you ask about your attempted solution rather than the real problem. Why is it "really important" to record a copy operation that was actually never performed? — Bo Persson, Apr 26 '13 at 10:54
Everyone, please see my edit. @KennyTM I can't change the calling code. — user2020792, Apr 26 '13 at 11:22
@PetrBudnik: Copy elision is allowed whether or not the copy-constructor does any funny business. — Mike Seymour, Apr 26 '13 at 11:25
Most things about programming are fundamentally simple, at least for programmers. This means you should be able to explain **why** you need the non-trivial copy constructor operation. If you can't explain it in a simple way, it's a hint there might be a problem with it. — Angew is no longer proud of SO, Apr 26 '13 at 11:34
@Angew Sorry, it's too complicated to explain. It really is. Not all programming is easy. There are heavy mathematical algorithms happening in the "important stuff". — user2020792, Apr 26 '13 at 11:57
@user2020792 The algorithms aren't the important part. The important part is what happens with their results, and why (you think) it needs to happen in the copy ctor. Still, we can't make you tell us; I still firmly believe this is an X-Y problem, but if you don't tell us the X, we can't help. The standard simply prevents Y. — Angew is no longer proud of SO, Apr 26 '13 at 12:02
@Angew The "very important thing" is used to mark a4 as a variable that can be referenced elsewhere in the program (i.e. can be used in another equation). On the flip-side, the intermediate variables in the equation (like a1+a2) *cannot* be referenced elsewhere (so they don't need to be marked). Does that help at all? — user2020792, Apr 26 '13 at 14:01

score 2 · Answer 1 · answered Apr 26 '13 at 13:44

Answering my own question here: I'm going to bite the bullet and use an intermediate type:

#include <iostream>

struct B;

struct A
{
    A(int i) : m_i(i) {}
    A(const B& a);
    A(const A& a) : m_i(a.m_i)
    {
        std::cout << "A(const A&)" << std::endl;
    }
    int m_i;
};
struct B
{
    B(int i) : m_i(i) {}
    int m_i;
};

A::A(const B& a) : m_i(a.m_i)
{
    std::cout << "A(const B&)" << std::endl;
}

B operator+(const A& a0, const A& a1)
{
    B b(a0.m_i + a1.m_i);
    std::cout << "A+A" << std::endl;
    return b;
}
B operator+(const B& a0, const A& a1)
{
    B b(a0.m_i + a1.m_i);
    std::cout << "B+A" << std::endl;
    return b;
}
B operator+(const A& a0, const B& a1)
{
    B b(a0.m_i + a1.m_i);
    std::cout << "A+B" << std::endl;
    return b;
}
B operator+(const B& a0, const B& a1)
{
    B b(a0.m_i + a1.m_i);
    std::cout << "B+B" << std::endl;
    return b;
}

int main()
{
    A a(1);
    A b(2);
    A c(3);
    A d = (a+b) + (a + b + c);
}

Output on GCC 4.2.1:

A+A
B+A
A+A
B+B
A(const B&)

And I can do the "very important thing" in the A(const B&) constructor.
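
If the four overloads per operator are the main pain point, one possible way to cut them down is to let `B` be implicitly constructible from `A` and write each operator once in terms of `B`. This is only a sketch under that assumption, not something tested against the real code base. The counter `from_b_count` and the function `single_overload_demo()` are hypothetical names used to show that `A(const B&)` runs exactly once for the whole expression.

```cpp
#include <cassert>

struct A;

// Intermediate type: every operator result is a B, never an A.
struct B {
    explicit B(double val) : m_val(val) {}
    B(const A& a);  // deliberately implicit, so A operands convert to B
    double m_val;
};

struct A {
    explicit A(double val) : m_val(val) {}
    // The "very important thing" would go here.
    A(const B& b) : m_val(b.m_val) { ++from_b_count; }
    A(const A& other) : m_val(other.m_val) {}
    double m_val;
    static int from_b_count;  // hypothetical counter, for illustration only
};
int A::from_b_count = 0;

B::B(const A& a) : m_val(a.m_val) {}

// One overload per operator; A arguments are converted to temporary Bs.
B operator+(const B& l, const B& r) { return B(l.m_val + r.m_val); }
B operator*(const B& l, const B& r) { return B(l.m_val * r.m_val); }
B operator/(const B& l, const B& r) { return B(l.m_val / r.m_val); }

double single_overload_demo() {
    A::from_b_count = 0;
    A a1(3), a2(4), a3(5);
    A a4 = (a1 + a2) * a3 / a1;    // unchanged calling code
    assert(A::from_b_count == 1);  // only the final B -> A conversion counts
    return a4.m_val;
}
```

The trade-off is an extra temporary `B` for each `A` operand, plus implicit conversions in both directions, which can make overload resolution harder to reason about as the class grows.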

Maybe you need the [Expression Template](http://en.wikipedia.org/wiki/Expression_templates) pattern. Its purpose is to eliminate the creation of large temporary objects. For instance, if `u` and `v` are long vectors of the same size, and `a` is a double, you can create a vector with the value `a*(u-v)` but without creating a temporary vector for `u-v`. — Derek Ledbetter, Apr 26 '13 at 18:23

dyp · Answer 2 · 2013-04-26T13:49:32.280

As Angew pointed out, you can use an intermediate type. Here's an example with some optimizations using the move ctor.

#include <utility>
#include <iostream>

struct B;

struct A {
    explicit A(double val) : m_val(val)
    {
        std::cout << "A(double)" << std::endl;
    }
    A(A&& p) : m_val(p.m_val)
    { /* no output */ }

    A(const A& other) : m_val(other.m_val) {
        // Do something really important here
        std::cout << "A(A const&)" << std::endl;
    }
    A& operator=(const A& other) {
        if (&other != this) {
            m_val = other.m_val;
            // Do something really important here
            std::cout << "A::operator=(A const&)" << std::endl;
        }
        return *this;
    }
    double m_val;

    A(B&&);
};

struct B
{
    operator A const&() const
    {
        std::cout << "B::operator A const&()" << std::endl;
        return a;
    }

private:
    friend struct A;
    A a;

    // better: befriend a factory function
    friend B operator+(const A&, const A&);
    friend B operator*(const A&, const A&);
    friend B operator/(const A&, const A&);
    B(A&& p) : a( std::move(p) )
    { /* no output */ }
};

A::A(B&& p) : A( std::move(p.a) )
{
    std::cout << "A(B&&)" << std::endl;
}

B operator+(const A& a1, const A& a2) {
    std::cout << "A const& + A const&" << std::endl;
    A retVal(a1.m_val + a2.m_val);
    // Do something else important
    return std::move(retVal);
}

B operator*(const A& a1, const A& a2) {
    std::cout << "A const& * A const&" << std::endl;
    A retVal(a1.m_val * a2.m_val);
    // Do something else important
    return std::move(retVal);
}

B operator/(const A& a1, const A& a2) {
    std::cout << "A const& / A const&" << std::endl;
    A retVal(a1.m_val / a2.m_val);
    // Do something else important
    return std::move(retVal);
}

int main()
{
    A a1(3), a2(4), a3(5);
    A a4 = (a1 + a2) * a3 / a1;
}

IIRC, the temporary returned by, say a1 + a2 lasts for the whole copy-initialization (more precisely: for the whole full-expression, and that includes AFAIK the construction of a4). That's the reason why we can return an A const& from within B, even though the B objects are only created as temporaries. (If I'm wrong about that, see my previous edits for some other solutions.. :D )

The essence of this example is the combination of an intermediate type, move ctors and the said return of a reference.
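
The lifetime argument can be checked separately with a small sketch. `Temp`, `Target`, and `lifetime_demo()` are hypothetical names, with `Temp` standing in for the temporary `B` and `Target` for `a4`. A temporary is destroyed at the end of the full-expression that created it, so the log should show the target being constructed before the temporary's destructor runs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the temporary B: records when it is destroyed.
struct Temp {
    explicit Temp(std::vector<std::string>* log) : m_log(log), m_val(42) {}
    Temp(const Temp&) = delete;  // no copies, so exactly one destructor entry
    ~Temp() { m_log->push_back("~Temp"); }
    // Like B::operator A const&(): hands out a reference into the temporary.
    const int& value() const { return m_val; }
private:
    std::vector<std::string>* m_log;
    int m_val;
};

// Stand-in for a4: built from a reference into the temporary.
struct Target {
    Target(const int& v, std::vector<std::string>* log) : m_val(v) {
        log->push_back("Target constructed");
    }
    int m_val;
};

std::vector<std::string> lifetime_demo() {
    std::vector<std::string> log;
    {
        // The Temp lives until the end of this whole initialization.
        Target t = Target(Temp(&log).value(), &log);
        assert(t.m_val == 42);
    }
    return log;  // "Target constructed" first, then "~Temp"
}
```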

Output of the intermediate-type program with g++ 4.6.3 and clang++ 3.2:

A(double)             <---- A a1(3);
A(double)             <---- A a2(4);
A(double)             <---- A a3(5);
A const& + A const&   <---- a1 + a2;
A(double)               <-- A retVal(a1.m_val + a2.m_val);
B::operator A const&()<---- __temp__ conversion B --> const A&
A const& * A const&   <---- __temp__ * a3;
A(double)               <-- A retVal(a1.m_val * a2.m_val);
B::operator A const&()<---- __temp__ conversion B --> const A&
A const& / A const&   <---- __temp__ / a1;
A(double)               <-- A retVal(a1.m_val / a2.m_val);
A(B&&)                <---- A a4 = __temp__;

Now that the copy and move operations (which are not shown) are split up, I think you can implement your "something important" more precisely where it belongs:

A(double) -- creation of a new A object from numerical values
A(A const&) -- actual copy of an A object; doesn't happen here
A(B&&) -- construction of an A object from an operator result
B(A&&) -- invoked for the return value of an operator
B::operator A const&() const -- invoked to use the return value of an operator

@user2020792 Then please tag your question with "C++03" next time. C++03 is a deprecated standard which has been replaced by C++11, though I admit that most compilers don't fully support the latter yet. — dyp, Apr 26 '13 at 13:51
I did explicitly state c++03 in my question, just not in the title. — user2020792, Apr 26 '13 at 21:55
@user2020792 I meant the tags, not the title. Unfortunately, your comment about move ctors and C++03 is not very clear to me as to if you want to see if it works with move ctors or C++03 only. Well now you have two answers :D but it would have been nice if it was clear you'd only want C++03 compatible answers. — dyp, Apr 26 '13 at 22:22

score 0 · Answer 3 · answered Apr 26 '13 at 11:45

RVO is allowed by the standard, in the following cases ([class.copy]§31, listing only applicable parts):

in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value

when a temporary class object that has not been bound to a reference (12.2) would be copied/moved to a class object with the same cv-unqualified type, the copy/move operation can be omitted by constructing the temporary object directly into the target of the omitted copy/move

In your code:

A operator+(const A& a1, const A& a2) {
    A retVal(a1.m_val + a2.m_val);
    // Do something else important
    return retVal;
}


A a4 = (a1 + a2) * a3 / a1;

there are two elidable copies involved: copying retVal into the temporary object that stores the return value of operator+, and copying this temporary object into a4.

I can't see a way to prevent elision of the second copy (the one from return value to a4), but the "non-volatile" part of the standard makes me believe this should prevent elision of the first copy:

A operator+(const A& a1, const A& a2) {
    A retVal(a1.m_val + a2.m_val);
    // Do something else important
    volatile A volRetVal(retVal);
    return volRetVal;
}

Of course this means you'll have to define an additional copy constructor for A taking const volatile A&.
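
A minimal sketch of what that could look like, with hypothetical counters `plain_copies` and `volatile_copies` and a hypothetical `volatile_demo()` added to make the calls visible. It only addresses the first copy (out of the function's local); the copy from the returned temporary into the destination can still be elided.

```cpp
#include <cassert>

struct A {
    explicit A(double val) : m_val(val) {}
    A(const A& other) : m_val(other.m_val) { ++plain_copies; }
    // Extra overload: needed to copy out of a volatile A.
    A(const volatile A& other) : m_val(other.m_val) { ++volatile_copies; }
    double m_val;
    static int plain_copies;     // hypothetical counters, illustration only
    static int volatile_copies;
};
int A::plain_copies = 0;
int A::volatile_copies = 0;

A operator+(const A& a1, const A& a2) {
    A retVal(a1.m_val + a2.m_val);
    volatile A volRetVal(retVal);  // calls A(const A&)
    return volRetVal;              // volatile local: not eligible for NRVO,
                                   // so A(const volatile A&) is called
}

double volatile_demo() {
    A::plain_copies = 0;
    A::volatile_copies = 0;
    A a1(3), a2(4);
    A a3 = a1 + a2;  // this second copy may still be elided
    return a3.m_val;
}
```

After `volatile_demo()` runs, `A::volatile_copies` should be 1, while `A::plain_copies` depends on whether the compiler elides the remaining copy.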

It's the prevention of the second elision that I need. What if the return type was changed to volatile? 'volatile A operator+(const A&, const A&)' — user2020792, Apr 26 '13 at 12:00
@user2020792 That wouldn't help. The second point (the one about copying from temporary objects) doesn't make exceptions for `volatile`. Your only hope there could be an intermediate type for the return value. — Angew is no longer proud of SO, Apr 26 '13 at 12:03
