
I was answering a question and recommending return by-value for a large type because I was confident the compiler would perform return-value optimization (RVO). But then it was pointed out to me that Visual Studio 2013 was not performing RVO on my code.

I've found a question here regarding Visual Studio failing to perform RVO, but in that case the conclusion seemed to be that if it really matters, Visual Studio will perform RVO. In my case it does matter: it has a significant impact on performance, which I've confirmed with profiling results. Here is the simplified code:

#include <vector>
#include <numeric>
#include <iostream>

struct Foo {
  std::vector<double> v;
  Foo(std::vector<double> _v) : v(std::move(_v)) {}
};

Foo getBigFoo() {
  std::vector<double> v(1000000);
  std::iota(v.begin(), v.end(), 0);  // Fill vector with non-trivial data

  return Foo(std::move(v));  // Expecting RVO to happen here.
}

int main() {
  std::cout << "Press any key to start test...";
  std::cin.ignore();

  for (int i = 0; i != 100; ++i) {  // Repeat test to get meaningful profiler results
    auto foo = getBigFoo();
    std::cout << std::accumulate(foo.v.begin(), foo.v.end(), 0.0) << "\n";
  }
}

I'm expecting the compiler to perform RVO on the value returned from getBigFoo(). But it appears to be copying Foo instead.

I'm aware that the compiler will create a copy-constructor for Foo. I'm also aware that, unlike a compliant C++11 compiler, Visual Studio does not create a move-constructor for Foo. But that should be OK: RVO is a C++98 concept and works without move-semantics.

So, the question is, is there a good reason why Visual Studio 2013 does not perform return value optimization in this case?

I know of a few workarounds. I can define a move-constructor for Foo:

Foo(Foo&& in) : v(std::move(in.v)) {}

which is fine, but there are a lot of legacy types out there that don't have move-constructors and it would be nice to know I can rely on RVO with those types. Also, some types may be inherently copyable but not movable.

If I change from RVO to NRVO (named return value optimization) then Visual Studio does appear to perform the optimization:

  Foo foo(std::move(v));
  return foo;

which is curious because I thought NRVO was less reliable than RVO.

Even more curious is if I change the constructor of Foo so it creates and fills the vector:

  Foo(size_t num) : v(num) {
    std::iota(v.begin(), v.end(), 0);  // Fill vector with non-trivial data
  }

instead of moving it in, then RVO works:

Foo getBigFoo() {
  return Foo(1000000);
}

I'm happy to go with one of these workarounds but I'd like to be able to predict when RVO might fail like this in the future, thanks.

Edit: More concise live demo from @dyp
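
The live demo itself isn't reproduced here. As a rough sketch of the same idea (not dyp's actual code; `CopyCounter` and `makeCounter` are names made up for illustration), a type with a user-written copy constructor can count how often it is copied, so a failed RVO shows up as a non-zero count:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical instrumented type: like Foo, but it counts copies.
// Declaring the copy constructor suppresses the implicit move constructor,
// so a failed elision shows up as a copy, as it does for Foo on VS2013.
struct CopyCounter {
  static int copies;
  std::vector<double> v;
  CopyCounter(std::vector<double> _v) : v(std::move(_v)) {}
  CopyCounter(const CopyCounter& other) : v(other.v) { ++copies; }
};
int CopyCounter::copies = 0;

// Same shape as getBigFoo(): return an unnamed temporary by value.
CopyCounter makeCounter(std::size_t n) {
  std::vector<double> v(n);
  return CopyCounter(std::move(v));  // candidate for RVO
}
```

Writing `CopyCounter c = makeCounter(1000);` in main and then printing `CopyCounter::copies` should give 0 whenever the copy is elided (a C++17 compiler is required to elide it in this form); any other value means the temporary was copied.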

Edit2: Why don't I just write return v;?

For a start, it doesn't help. Profiler results show that Visual Studio 2013 still copies the vector if I just write return v;. And even if it did work, it would only be a workaround. I'm not trying to fix this particular piece of code; I'm trying to understand why RVO fails so I can predict when it might fail in the future. It is true that it is a more concise way of writing this particular example, but there are plenty of cases where I couldn't just write return v;, for example if Foo had additional constructor parameters.

Chris Drew
  • A total stab in the dark: do you observe different behavior if you used `return Foo(std::move(v));` (parentheses) instead of `return Foo{std::move(v)};` (curly braces)? I don't think it would make a difference but I'm not willing to bet on that. – In silico Sep 21 '14 at 20:53
  • @Insilico, No, same behaviour, I was only using curly braces (uniform initialization) there because it was getting a little bit too close to the most vexing parse for comfort. I'll change it unless it throws anyone... – Chris Drew Sep 21 '14 at 20:57
  • 3
    Well you could, of course, use `return {std::move(v)};` since that constructor is not explicit. This does not require any (N)RVO, it is specified not to create a temporary. – dyp Sep 21 '14 at 21:00
  • 2
    [Live example: RVO is not applied](http://rextester.com/KNLR27698) -- [Live example: with `return {move(v)};`](http://rextester.com/BEPGC63516) – dyp Sep 21 '14 at 21:07
  • @dyp. Interesting, another workaround. I think I might be a little surprised if I saw code like that but maybe I should start getting used to it. – Chris Drew Sep 21 '14 at 21:11
  • 7
    Why don't you just write `return v;`? – Marc Glisse Sep 21 '14 at 21:18
  • @MarcGlisse, that seems to suffer from the same problem. – Chris Drew Sep 21 '14 at 21:35
  • 2
    I just tried it on Visual Studio 2014 CTP and it applies RVO for your code. EDIT: @dyp's example I should say. – Jagannath Sep 21 '14 at 23:59
  • are you compiling in Release mode? (VS is known to suppress RVO in Debug mode in some cases) – M.M Sep 22 '14 at 01:28
  • @McNabb Yes, compiling in release mode. – Chris Drew Sep 22 '14 at 01:49
  • @dyp Interestingly, I can get your first example to not fail, when I change the Foo constructor to accept an rvalue ref instead of a moved value, [live example](http://rextester.com/ZCS19716) – Michael Karcher Oct 21 '14 at 06:03
  • RVO is a special optimization as it can change the behavior of your program. You can easily test RVO by returning an instance of your own class and logging out the constructor calls. – pasztorpisti Oct 21 '14 at 08:56
  • @pasztorpisti I know. There is no doubt that RVO is failing. I've added a link to dyp's live demo that does exactly what you suggest. But I wanted to show it made an actual difference to performance, hence why my example is written as it is. – Chris Drew Oct 21 '14 at 09:02
  • @MichaelKarcher Your live example still prints `fail`? `std::move` does not prevent the copy/move elision of copying/moving the `Foo` temporary to the return value of the function. – dyp Oct 21 '14 at 13:13
  • @dyp Sorry, I didn't get the difference between fork mode and edit mode yet. I wanted to make the point, that changing the foo constructor to `Foo(vec_t && _v) : v(std::move(_v)) {}` does not make it fail, but I messed up. Likely, the live example is correct now. – Michael Karcher Oct 21 '14 at 18:26
  • 1
    @MichaelKarcher Ah, now I see. This is strange indeed: RVO is performed in your example, but not in mine. I remember having read something about VS having problems with RVO due to stack unwinding in case of an exception. In my example, a temporary `vector` needs to be created. Maybe that's the underlying issue. (But I can't find any links atm.) – dyp Oct 21 '14 at 18:53
  • @kuroineko: What complete nonsense. – Lightness Races in Orbit Nov 09 '14 at 13:18
  • @ChrisDrew: Lol how do you get the most vexing parse in the operand of a `return` statement? – Lightness Races in Orbit Nov 09 '14 at 13:18
  • Pass a pointer (or a smart pointer) to the function getBigFoo. VS is known for doing weird things. Double check the compiling options, release, optimizations... – Juan Chô Nov 11 '14 at 16:59
  • 1
    Look here https://connect.microsoft.com/VisualStudio/feedback/details/846490/c-copy-elision-failure-no-return-value-optimization-rvo-when-return-value-is-constructed-by-defaulted-default-constructor – Elvis Dukaj Nov 12 '14 at 04:03
  • @elvis.dukaj That certainly looks related, although it is not the same. In this case I am not using a defaulted default-constructor. – Chris Drew Nov 12 '14 at 04:19
  • I know this won't help you understand your problem, but it doesn't seem wise to trust compilers to do things they may do but are not required to, even if you believe you can reliably predict the behavior of one specific compiler from one specific vendor, since your code may some day be ported to other platforms, or the behavior may change with the next compiler version. – lvella Nov 12 '14 at 04:26
  • @Ivella Maybe you are right. Maybe a good rule of thumb is to only use return by value if the type is cheap to move then it is not the end of the world if the compiler fails to elide the move. It seems a shame though. Return by value is so much nicer than the alternatives and it works most of the time. – Chris Drew Nov 12 '14 at 05:22
  • This is likely unrelated, but I recall an earlier version of Visual C++ where RVO was not kicking in unless the return value was const. – zumalifeguard Nov 12 '14 at 17:05
  • Have you tried this code with classes instead of structs? You may be surprised by the results. I will investigate and get back to you on VS2013's handling of structs vs classes. – GMasucci Nov 14 '14 at 09:23
  • 2
    I've posted some details on when RVO is performed and when it fails (based on example from @dyp) here: http://www.rovrov.com/blog/2014/11/21/RVO-and-copy-elision-failing/. This does not explain *why* RVO is failing but some observations might still be interesting. – Roman L Nov 21 '14 at 22:37
  • @LightnessRacesinOrbit someone censored my "complete nonsense", but I can understand a person earning a living explaining what `struct : bar {} foo {};` means to a C++ compiler could be touchy on the subject of inherent obfuscation and instability of her pet language. – kuroi neko Dec 11 '14 at 02:16
  • @kuroineko: What? Doesn't look like _I'm_ the touchy one, mate... What did your comment say? I don't remember. – Lightness Races in Orbit Dec 11 '14 at 10:13
  • 1
    I've submitted [a bug to microsoft](https://connect.microsoft.com/VisualStudio/feedback/details/1036698/visual-c-compiler-failing-to-perform-the-return-value-optimization). Not had any feedback so far. – Chris Drew Dec 11 '14 at 10:49
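
The two workarounds raised in the comments above can be sketched roughly as follows. This is only an illustration: `FooRef`, `getBigFooBraced` and `getBigFooRef` are hypothetical names, and whether Visual Studio 2013 elides the copy in either form is what the commenters reported, not something this sketch proves.

```cpp
#include <numeric>
#include <utility>
#include <vector>

struct Foo {
  std::vector<double> v;
  // Not explicit, so `return {...};` is allowed to use it.
  Foo(std::vector<double> _v) : v(std::move(_v)) {}
};

// dyp's suggestion: the braced list initializes the return value directly,
// so no temporary Foo exists that would need to be elided.
Foo getBigFooBraced() {
  std::vector<double> v(1000000);
  std::iota(v.begin(), v.end(), 0.0);
  return {std::move(v)};
}

// Michael Karcher's variant: the constructor takes an rvalue reference
// instead of a by-value parameter, so no temporary vector is created
// for the constructor argument.
struct FooRef {
  std::vector<double> v;
  FooRef(std::vector<double>&& _v) : v(std::move(_v)) {}
};

FooRef getBigFooRef() {
  std::vector<double> v(1000000);
  std::iota(v.begin(), v.end(), 0.0);
  return FooRef(std::move(v));
}
```

Both functions produce the same data as the original getBigFoo(); only the way the returned object is initialized differs.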

1 Answer


If the code looks like it should be optimized but is not, I would submit a bug at http://connect.microsoft.com/VisualStudio or raise a support case with Microsoft. This article, although written for VC++ 2005 (I couldn't find a current version of the document), does explain some scenarios where the optimization won't be applied: http://msdn.microsoft.com/en-us/library/ms364057(v=vs.80).aspx#nrvo_cpp05_topic3

If we want to be sure the optimization has occurred, one possibility is to check the assembly output. This could be automated as a build task if desired.

This requires generating .asm output using the /FAs option, like so:

cl test.cpp /FAs

This generates test.asm.

Below is a possible PowerShell script for checking that output. It is used like this:

PS C:\test> .\Get-RVO.ps1 C:\test\test.asm test.cpp
NOT RVO test.cpp - ; 13   :   return Foo(std::move(v));// Expecting RVO to happen here.

PS C:\test> .\Get-RVO.ps1 C:\test\test_v2.optimized.asm test.cpp
RVO OK test.cpp - ; 13   :   return {std::move(v)}; // Expecting RVO to happen here.

PS C:\test> 

The script:

# Usage Get-RVO.ps1 <input.asm file> <name of CPP file you want to check>
# Example .\Get-RVO.ps1 C:\test\test.asm test.cpp
[CmdletBinding()]
Param(
[Parameter(Mandatory=$True,Position=1)]
  [string]$assemblyFilename,

  [Parameter(Mandatory=$True,Position=2)]
  [string]$cppFilename
)

$sr=New-Object System.IO.StreamReader($assemblyFilename)
$IsInReturnSection=$false
$optimized=$true
$startLine=""
$inFile=$false

while (!$sr.EndOfStream)
{
    $line=$sr.ReadLine();

    # ignore any files that aren't our specified CPP file
    if ($line.StartsWith("; File"))
    {
        if ($line.EndsWith($cppFilename))
        {
            $inFile=$true
        }
        else
        {
            $inFile=$false
        }
    }

    # check if we are in code section for our CPP file...
    if ($inFile)
    {
        if ($line.StartsWith(";"))
        {
            # mark start of "return" code
            # assume optimized, until proven otherwise
            if ($line.Contains("return"))
            {
                $startLine=$line 
                $IsInReturnSection=$true
                $optimized=$true
            }
        }

        if ($IsInReturnSection)
        {
            # call in return section, not RVO
            if ($line.Contains("call"))
            {
                $optimized=$false
            }

            # check if we reached end of return code section
            if ($line.StartsWith("$") -or $line.StartsWith("?"))
            {
                $IsInReturnSection=$false
                if ($optimized)
                {
                    "RVO OK $cppfileName - $startLine"
                }
                else
                {
                    "NOT RVO $cppfileName - $startLine"
                }
            }
        }
    }

}

# release the file handle once the whole listing has been scanned
$sr.Close()
Malcolm McCaffery
  • I don't really want to have to check the assembly every time I want to use RVO! If this is necessary I think I will just not rely on RVO. I appreciate the instructions on how to generate/check the assembly though. I've confirmed Jagannath's observation that it is fixed in Visual Studio 2014 CTP so there is probably no point submitting a bug. – Chris Drew Nov 20 '14 at 03:00
  • This is the only possible way to be 100% sure, because even if you follow the logic described in the Microsoft article I linked, there may be a compiler bug, or you may hit a scenario that is not obvious. However, the script can be used to automate the check post-build if it's very important. If you really want to check the actual logic, I think you'd have to use an open source compiler such as Clang or GCC. – Malcolm McCaffery Nov 20 '14 at 03:30
  • That being said, if you have really well documented, reproducible scenario, Microsoft will typically respond to the bug report, it may be concluded "by design" but at least you may get a reason. – Malcolm McCaffery Nov 20 '14 at 03:31
  • 1
    @ChrisDrew "If this is necessary I think I will just not rely on RVO." -- Whenever possible, I strongly think that's exactly what you should do. There will always be programs where RVO is permitted by the standard but not performed by the compiler. It's practically impossible to guarantee that RVO is *always* performed whenever permitted, because when the to-be-returned object is constructed, the compiler may not be able to determine which return statement will end up executed. Optimise your move constructors, and RVO should have a significantly smaller impact. –  Mar 29 '15 at 10:18
  • @ChrisDrew And unfortunately, that means spelling out each and every move constructor, if you're dealing with a compiler that won't generate them implicitly even though the standard says it should. –  Mar 29 '15 at 10:20
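
As a sketch of what "spelling out" the move operations might look like for the Foo from the question (an illustration, not code from the thread), assuming a compiler such as Visual Studio 2013 that generates neither of them implicitly:

```cpp
#include <utility>
#include <vector>

struct Foo {
  std::vector<double> v;

  Foo(std::vector<double> _v) : v(std::move(_v)) {}

  // Copy operations, written out because declaring the move operations
  // below would otherwise leave them deleted on a conforming compiler.
  Foo(const Foo& other) : v(other.v) {}
  Foo& operator=(const Foo& other) {
    v = other.v;
    return *this;
  }

  // Hand-written move operations: these steal the vector's buffer, so a
  // missed RVO costs a few pointer assignments instead of a deep copy.
  // `noexcept` is left off on the assumption that the target compiler
  // (VS2013) does not accept the keyword; add it where it is supported.
  Foo(Foo&& other) : v(std::move(other.v)) {}
  Foo& operator=(Foo&& other) {
    v = std::move(other.v);
    return *this;
  }
};
```

With these in place, `return Foo(std::move(v));` falls back to the move constructor whenever the compiler declines to elide, which is cheap for a type that only wraps a std::vector.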