15

Consider the case when "whole" objects with move semantics enabled are returned from functions, as with std::basic_string<>:

std::wstring build_report()
{
    std::wstring report;
    ...

    return report;
}

Can I then realistically be expected to make the "best" choice whether to use the returned string with move semantics, as in

const std::wstring report(std::move(build_report()));

or if I should rely on (N)RVO to take place with

const std::wstring report(build_report());

or even bind a const reference to the temporary with

const std::wstring& report(build_report());

What scheme is there to make a deterministic choice of these options, if any?

EDIT 1: Note that the usage of std::wstring above is just an example of a move semantics enabled type. It could just as well be swapped for your arbitrary_large_structure. :-)

EDIT 2: I checked the generated assembly when running a speed-optimized release build in VS 2010 of the following:

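#include <string>    // std::wstring
#include <utility>   // std::move
#include <tchar.h>   // _tmain, _TCHAR (MSVC-specific)

std::wstring build_report(const std::wstring& title, const std::wstring& content)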
{
    std::wstring report;
    report.append(title);
    report.append(content);

    return report;
}

const std::wstring title1(L"title1");
const std::wstring content1(L"content1");

const std::wstring title2(L"title2");
const std::wstring content2(L"content2");

const std::wstring title3(L"title3");
const std::wstring content3(L"content3");

int _tmain(int argc, _TCHAR* argv[])
{
    const std::wstring  report1(std::move(build_report(title1, content1)));
    const std::wstring  report2(build_report(title2, content2));
    const std::wstring& report3(build_report(title3, content3));

    ...

    return 0;
}

The 2 most interesting outcomes:

  • Explicitly calling std::move for report1 to use the move constructor triples the instruction count.
  • As noted by James McNellis in his answer below, report2 and report3 do indeed generate identical assembly, with 3 times fewer instructions than explicitly calling std::move.
Johann Gerell
  • I think it is _really_ weird that the call to `move` is not inlined and eliminated. – James McNellis Jun 30 '11 at 11:58
  • @James: Yes, especially in the light of Kerrek SB's comment that STL said RVO happens *before* anything else. (http://stackoverflow.com/questions/6531700/when-will-a-c0x-compiler-make-rvo-and-nrvo-outperform-move-semantics-and-const/6533181#6533181) You're not by any chance located close to STL so that you can walk over and ask him? ;-) – Johann Gerell Jun 30 '11 at 12:43
  • possible duplicate of [C++0x rvalues and move semantics confusion](http://stackoverflow.com/questions/4986673/c0x-rvalues-and-move-semantics-confusion) – Howard Hinnant Jun 30 '11 at 13:35
  • @Howard: While your answer to that question is very good, I think this question addresses a deeper issue. – Johann Gerell Jun 30 '11 at 13:55
  • I do spend a fair amount of time in the building where he works, yes :-) I haven't actually met Mr. STL himself yet, though. I will follow up with them with regard to why `std::move` is not inlined here, though; I really do find that puzzling. I am on vacation through the end of next week, so it will be a couple weeks before I can get back to you. – James McNellis Jul 01 '11 at 01:27

3 Answers

13

std::move(build_report()) is wholly unnecessary: build_report() is already an rvalue expression (it is a call of a function that returns an object by value), so the std::wstring move constructor will be used if it has one (it does).

Also, when you return a local variable by value, it is moved (or the move is elided entirely by NRVO) if its type has a move constructor, so no copy is made.

There shouldn't be any functional difference between declaring report as an object or as a const-reference; in both cases you end up with an object (either the named report object or an unnamed object to which the report reference can be bound).
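
To make the three forms from the question comparable without reading assembly, here is a minimal sketch that counts constructor calls. `Tracked`, `build_tracked` and the three helper functions are hypothetical names invented for this illustration; `Tracked` only stands in for a move-enabled type like std::wstring. The exact counts depend on the compiler and language mode, so treat the numbers as something to observe, not a guarantee.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a move-enabled type such as std::wstring.
// It counts how often its copy and move constructors run.
struct Tracked {
    static int copies;
    static int moves;

    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) { ++moves; }
};

int Tracked::copies = 0;
int Tracked::moves = 0;

// Mirrors build_report(): returns a named local by value.
Tracked build_tracked()
{
    Tracked result;
    return result;  // eligible for NRVO; otherwise moved, not copied
}

// Move constructions for: const Tracked t(std::move(build_tracked()));
int moves_with_explicit_move()
{
    Tracked::moves = 0;
    // std::move turns the prvalue into an xvalue, so the move constructor
    // must run here (compilers may warn about a pessimizing move).
    const Tracked t(std::move(build_tracked()));
    (void)t;
    assert(Tracked::copies == 0);
    return Tracked::moves;
}

// Move constructions for: const Tracked t(build_tracked());
int moves_with_direct_init()
{
    Tracked::moves = 0;
    const Tracked t(build_tracked());  // the temporary may be elided
    (void)t;
    assert(Tracked::copies == 0);
    return Tracked::moves;
}

// Move constructions for: const Tracked& t = build_tracked();
int moves_with_const_ref()
{
    Tracked::moves = 0;
    const Tracked& t = build_tracked();  // binds to the returned temporary
    (void)t;
    assert(Tracked::copies == 0);
    return Tracked::moves;
}
```

Calling the three helpers from your own `main` and printing the results shows the pattern: the `std::move` form always performs at least one move construction, while the other two perform no more than that and, on a compiler that applies (N)RVO, typically none at all.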

James McNellis
  • *The other two should be functionally identical in the presence of (N)RVO* - interesting. I haven't thought about that. I'll take a look at the generated assembly. I'll be back! – Johann Gerell Jun 30 '11 at 08:09
  • I edited that a bit: there shouldn't ever be a difference between those two: practically speaking, in both cases you have to have some object in the scope of `report`; the only "difference" is whether `report` is the name of the object or is a reference bound to that object... from a performance standpoint there should be no difference between the two. – James McNellis Jun 30 '11 at 08:14
  • Please see my **Edit 2** in the question. – Johann Gerell Jun 30 '11 at 09:46
4

I'm not sure if this is standardized (as Nicol says, all optimizations are up to the compiler), but I heard STL talk about this and (at least in MSVC), RVO happens before anything else. So if there's a chance to apply RVO, then that'll happen without any action on your part. Second, when you return a temporary, you don't have to write std::move (I think this is actually in the standard), since the return value will implicitly be treated as an rvalue.

The upshot is: Don't second-guess the compiler, just write the most natural-looking code and it'll give you the best-possible result.
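
The same reasoning applies inside the function, at the return statement itself. The sketch below is only an illustration under my own assumptions: `Counted`, `make_plain` and `make_with_move` are hypothetical names, and the counts you see depend on your compiler. It compares returning a local by name with wrapping it in std::move.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts move constructions.
struct Counted {
    static int moves;

    Counted() {}
    Counted(const Counted&) {}
    Counted(Counted&&) { ++moves; }
};

int Counted::moves = 0;

// Natural form: the operand is the name of a local, so NRVO may apply,
// and if it does not, the local is implicitly treated as an rvalue.
Counted make_plain()
{
    Counted c;
    return c;
}

// "Helpful" form: std::move(c) is no longer just the name of a local,
// so NRVO cannot apply and the move constructor has to run.
// (Compilers may warn about a pessimizing move here.)
Counted make_with_move()
{
    Counted c;
    return std::move(c);
}

// Move constructions needed to initialize an object from make_plain().
int moves_for_plain_return()
{
    Counted::moves = 0;
    const Counted c(make_plain());
    (void)c;
    return Counted::moves;
}

// Move constructions needed to initialize an object from make_with_move().
int moves_for_moved_return()
{
    Counted::moves = 0;
    const Counted c(make_with_move());
    (void)c;
    return Counted::moves;
}
```

`moves_for_moved_return()` reports at least one move, whereas `moves_for_plain_return()` never reports more than that and usually reports zero when NRVO kicks in, which is the practical content of "write the most natural-looking code".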

Kerrek SB
  • In my **Edit 2** one can see that explicitly calling `std::move` will even prevent (N)RVO altogether. – Johann Gerell Jun 30 '11 at 10:29
  • @Johann: Cool. I suppose that shows that RVO is only considered if you return the naked temporary directly. One more reason not to try to outsmart the language! – Kerrek SB Jun 30 '11 at 10:33
3

What scheme is there to make a deterministic choice of these options, if any?

There isn't one, and there never will be.

Compilers are not required to do optimizations of any kind. The only thing you can do for certain is compile some code and see what comes out the other end.

The most you will eventually get is a general heuristic, a community consensus where people say, "for most compilers, X seems to work fastest." But that's about it. And that will take years as compilers get up to speed with C++0x and implementations mature.

Nicol Bolas