97

I am currently studying how to write efficient C++ code, and on the matter of function calls, a question comes to mind. Comparing this pseudocode function:

not-void function-name () {
    do-something
    return value;
}
int main () {
    ...
    arg = function-name();
    ...
}

with this otherwise-identical pseudocode function:

void function-name (not-void& arg) {
    do-something
    arg = value;
}
int main () {
    ...
    function-name(arg);
    ...
}

Which version is more efficient, and in what respect (time, memory, etc.)? If it depends, when would the first be more efficient, and when the second?

Edit: For context, this question is limited to differences that are independent of the hardware platform and, for the most part, of the software as well. Are there any machine-independent performance differences?

Edit: I don't see how this is a duplicate. The other question is comparing passing by reference (prev. code) to passing by value (below):

not-void function-name (not-void arg)

Which is not the same thing as my question. My focus is not on which is the better way to pass in an argument to a function. My focus is on which is the better way to pass out a result to a variable from the outside scope.

thegreatjedi
  • 12
    Why don't you just try it? Presumably it depends on your platform and compiler. Do it a million times and profile it. Also, in general, write the code how it is most clear and only worry about optimizations if you need to increase performance. – xaxxon Nov 30 '15 at 09:19
  • 1
    Try both of the versions a couple of million times, while timing the calls. Do it both without and with optimizations enabled. Considering return-value optimizations and copy-elision, I doubt you find any big differences either way. – Some programmer dude Nov 30 '15 at 09:19
  • Your example doesn't even do the same thing. It also depends heavily on how big the argument is, though passing by reference is almost always more efficient – Pedro Sassen Veiga Nov 30 '15 at 09:20
  • Depends on the type and function. You can't make statements about low-level efficiency based on pseudocode. By default, I'd return by value – MikeMB Nov 30 '15 at 09:20
  • @xaxxon I don't know how to do that (heck, I just learnt of the existence of profiling today lol) – thegreatjedi Nov 30 '15 at 09:22
  • then the answer to this question isn't meaningful to you at all. Just write the most straightforward code you can. – xaxxon Nov 30 '15 at 09:23
  • 3
    https://en.wikipedia.org/wiki/Program_optimization#When_to_optimize – xaxxon Nov 30 '15 at 09:25
  • 1
    @xaxxon My job involves working with hardware with resource constraints, and any marginal gain in performance is critical in this industry. I need to at least understand how it works even if I don't have the opportunity to try it out for myself. – thegreatjedi Nov 30 '15 at 09:28
  • 3
    @Pedro: Thanks to copy elision and move semantics, there are a lot of cases, where pass / return by value is actually more efficient. – MikeMB Nov 30 '15 at 09:32
  • @thegreatjedi: The problem is, there isn't a simple answer to this. It depends on a lot of things (OS, processor, compiler, compiler settings, calling and called code...). If you are worried about a 5% performance gain, you can make a lot of assumptions, but in the end you have to verify them for the concrete case at hand. – MikeMB Nov 30 '15 at 09:41
  • the question it's marked as duplicate of is about arguments, not return values – sp2danny Nov 30 '15 at 09:46
  • 1
    Out parameters are terrible anyway and it's not related to performance at all. We don't write code only looking at how fast it performs. – Bartek Banachewicz Nov 30 '15 at 09:50
  • 7
    Your job involves writing code and you've just learned of profiling? Go learn how to profile. That will help you a lot more than anything in this question. And if you're on hardware that constrained, then without information specific to that device, nothing here will be known to be true. – xaxxon Nov 30 '15 at 09:52
  • @xaxxon well I won't say the hardware we work with is resource-constrained in the conventional sense. But any performance gain, however minute, gives an edge to mission success. – thegreatjedi Nov 30 '15 at 09:56
  • 1
    @thegreatjedi any performance gain, however minute, can only be measured, not guessed. If you are into low-level optimization, you should look at the assembly generated by the compiler in _each_ case, and choose _each_ case based on that. Even if you run an experiment with a test function, the result cannot be generalized to all the cases you have in your application. Concentrate on having a good design and writing readable code: in your case, should the function instead be a constructor for the object? – pqnet Nov 30 '15 at 13:56
  • @xaxxon I think many questions being asked here can fall into this "Why don't you just try it?" category, but still these questions are worthy to be posted because the answers provide quick references reviewed by many professional programmers. If a test is really needed to answer the question, it will also be substantially beneficial to provide test codes. These test codes, as getting more and more viewed, can often be improved and become very sophisticated. – Anthony May 09 '21 at 22:15

7 Answers

38

First of all, take into account that returning an object is generally more readable than passing it by reference, and performs very similarly, so it may be worth returning the object to gain readability without a significant performance difference. If you want the lowest cost, it depends on what you need to return:

  1. If you need to return a simple or basic object, the performance would be similar in both cases.

  2. If the object is large and complex, returning it may require a copy (when the compiler cannot elide it), which could be slower than passing it as a reference parameter, although I think it would use less memory.

You have to think anyway that compilers do a lot of optimizations which make both performances very similar. See Copy Elision.
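
To make the point about copy elision concrete, here is a minimal sketch. The `Tracked` type, the global counters, and the function names are all hypothetical and exist only to make copies and moves visible. It assumes a compiler that elides the copy when returning a temporary, which is guaranteed from C++17 onwards and is common practice before that.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical counters, used only to make copies and moves visible.
int g_copies = 0;
int g_moves = 0;

// Hypothetical type that records every copy and move made of it.
struct Tracked {
    std::vector<int> data;

    explicit Tracked(std::size_t n) : data(n, 0) {}
    Tracked(const Tracked& other) : data(other.data) { ++g_copies; }
    Tracked(Tracked&& other) noexcept : data(std::move(other.data)) { ++g_moves; }
    Tracked& operator=(const Tracked& other) { data = other.data; ++g_copies; return *this; }
    Tracked& operator=(Tracked&& other) noexcept { data = std::move(other.data); ++g_moves; return *this; }
};

// Version 1: return by value. The returned temporary can be constructed
// directly in the caller's variable (guaranteed since C++17).
Tracked make_by_value(std::size_t n) {
    return Tracked(n);
}

// Version 2: out parameter. The caller must construct `out` beforehand,
// and the result is move-assigned into it.
void make_by_reference(Tracked& out, std::size_t n) {
    out = Tracked(n);
}

// Returns true if neither version needed a copy.
bool no_copies_needed() {
    g_copies = 0;
    g_moves = 0;
    Tracked a = make_by_value(1000);  // no copy; typically no move either
    Tracked b(0);                     // extra construction forced by the out parameter
    make_by_reference(b, 1000);       // one move assignment
    assert(a.data.size() == b.data.size());
    return g_copies == 0;
}
```

Calling `no_copies_needed()` from `main` and printing the counters is one way to see what a particular compiler does with each version.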

arodriguezdonaire
  • 2
    What about copy elision? – juanchopanza Nov 30 '15 at 09:29
  • 8
    Actually, on x86 (ignoring compiler optimizations) both would create the same assembly code, because return values that are bigger than about one or two registers are passed via a memory region that is allocated by the caller and passed to the callee via an implicit pointer parameter. – MikeMB Nov 30 '15 at 10:06
11

Returning the object should be preferred in most cases because of an optimisation called copy elision.

However, depending on how your function is intended to be used, it may be better to pass the object by reference.

Look at std::getline for instance, which takes a std::string by reference. This function is intended to be used as a loop condition and keeps filling a std::string until EOF is reached. Using the same std::string allows the storage space of the std::string to be reused in every loop iteration, drastically reducing the number of memory allocations that need to be performed.
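
The std::getline pattern described above can be sketched as follows. The function name `total_line_length` and the sample input are hypothetical. Whether the string's storage is actually reused between iterations is up to the standard library implementation, although common implementations do keep the capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Sums the lengths of all lines in a stream, reusing one std::string.
std::size_t total_line_length(std::istream& in) {
    std::string line;   // constructed once, outside the loop
    std::size_t total = 0;
    // std::getline fills `line` through a reference, so storage that was
    // allocated for an earlier line can be reused for later ones.
    while (std::getline(in, line)) {
        total += line.size();
    }
    return total;
}

// Small self-check using an in-memory stream instead of a file.
bool line_length_demo() {
    std::istringstream in("alpha\nbeta\ngamma\n");
    return total_line_length(in) == 14;  // 5 + 4 + 5 characters
}
```

A version of std::getline that returned a fresh std::string each time would have to build a new string for every line read.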

Simple
10

Well, one must understand that compilation is not an easy business. The compiler takes many considerations into account when it compiles your code.

One can't simply answer this question because the C++ standard doesn't define a standard ABI (application binary interface), so each compiler is free to compile the code however it likes, and you can get different results from each compiler.

For example, in some projects C++ is compiled to the managed extensions for the Microsoft CLR (C++/CLI). Since managed objects there are already accessed through references to objects on the heap, I guess there is no difference.

The answer is no simpler for unmanaged compilation. Several questions come to mind when I think about "Will XXX run faster than YYY?", for example:

  • Is your object default-constructible?
  • Does your compiler support return value optimization?
  • Does your object support copy-only semantics, or both copy and move?
  • Is the object stored contiguously in place (e.g. std::array), or does it hold a pointer to something on the heap (e.g. std::vector)?

To give a concrete example, my guess is that on MSVC++ and GCC, returning a std::vector by value will cost the same as passing it by reference, because of return value optimization, and will be a bit (a few nanoseconds) faster than returning the vector by move. This may be completely different on Clang, for example.

Ultimately, profiling is the only true answer here.
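
As a starting point for that kind of measurement, here is a rough timing sketch built on std::chrono. The names `fill_by_value`, `fill_by_reference`, `time_microseconds`, and `compare` are hypothetical, and a hand-rolled loop like this is only a crude stand-in for a real profiler or benchmarking library. Results depend heavily on the compiler, its flags, and the machine.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Version 1: build the vector and return it by value.
std::vector<int> fill_by_value(std::size_t n) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);
    return v;
}

// Version 2: fill a vector owned by the caller.
void fill_by_reference(std::vector<int>& v, std::size_t n) {
    v.resize(n);
    std::iota(v.begin(), v.end(), 0);
}

// Runs `work` `repeats` times and returns the elapsed time in microseconds.
template <typename Work>
long long time_microseconds(Work work, int repeats) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

struct Timings {
    long long by_value_us;
    long long by_reference_us;
};

// Times both versions. `n` must be greater than zero because back() is used.
Timings compare(std::size_t n, int repeats) {
    long long sum_value = 0;      // checksums keep the work observable,
    long long sum_reference = 0;  // so the optimizer cannot drop it entirely
    std::vector<int> buffer;

    Timings t{};
    t.by_value_us = time_microseconds(
        [&] { sum_value += fill_by_value(n).back(); }, repeats);
    t.by_reference_us = time_microseconds(
        [&] { fill_by_reference(buffer, n); sum_reference += buffer.back(); }, repeats);
    assert(sum_value == sum_reference);  // both versions did the same work
    return t;
}
```

Calling something like `compare(1000, 1000000)` with and without optimizations enabled, and on each compiler of interest, gives a first impression. The numbers should be treated as indicative only.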

David Haim
5

Some of the answers have touched on this, but I would like to emphasize it in light of the edit:

For context, this question is limited to hardware platform-independent differences, and for the most part software too. Are there any machine-independent performance difference?

If these are the limits of the question, the answer is that there is no answer. The C++ standard does not stipulate how either returning an object or passing by reference is implemented or how it performs, only the semantics of what each does in terms of code.

A compiler is therefore free to optimize one to identical code as the other assuming this does not create a perceptible difference to the programmer.

In light of this, I think it is best to use whichever is the most intuitive for the situation. If the function is indeed "returning" an object as the result of some task or query, return it, while if the function is performing an operation on some object owned by the outside code, pass by reference.
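
A small sketch of that distinction, using hypothetical function names: `even_values` answers a query and returns a new object, while `sort_descending` operates on an object the caller already owns.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// A query that produces a new result: return it.
std::vector<int> even_values(const std::vector<int>& values) {
    std::vector<int> result;
    for (int v : values) {
        if (v % 2 == 0) {
            result.push_back(v);
        }
    }
    return result;
}

// An operation on an object owned by the caller: take it by reference.
void sort_descending(std::vector<int>& values) {
    std::sort(values.begin(), values.end(), std::greater<int>());
}

// Exercises both styles on the same input.
bool intent_demo() {
    std::vector<int> numbers{4, 1, 6, 3};
    const std::vector<int> evens = even_values(numbers);  // new object
    sort_descending(numbers);                             // modified in place
    return evens == std::vector<int>{4, 6} &&
           numbers == std::vector<int>{6, 4, 3, 1};
}
```

Read at the call site, `evens = even_values(numbers)` makes it obvious that a new value is produced, and `sort_descending(numbers)` that an existing one is changed.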

You cannot generalize performance on this. As a start, do whatever is intuitive and see how well your target system and compiler optimize it. If profiling then reveals a problem, change it if you need to.

Alex Yursha
Vality
4

We can't be 100% general because different platforms have different ABIs, but I think we can make some fairly general statements that will apply to most implementations, with the caveat that they mostly apply to functions that are not inlined.

First, let's consider primitive types. At a low level, a parameter passed by reference is implemented using a pointer, whereas primitive return values are typically passed directly in registers. So return values are likely to perform better. On some architectures the same applies to small structures. Copying a value small enough to fit in a register or two is very cheap.

Now let's consider larger but still simple return values (no non-trivial default constructors, copy constructors, etc.). Typically, larger return values are handled by passing the function a pointer to the location where the return value should be put. Copy elision allows the variable returned from the function, the temporary used for the return, and the variable in the caller that receives the result to be merged into one. So the basic mechanics would be much the same for pass by reference and return by value.

Overall for primitive types I would expect return values to be marginally better and for larger but still simple types I would expect them to be the same or better unless your compiler is very bad at copy elision.
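
The mechanics described above can be pictured with the following sketch. The `Sample` type and all function names are hypothetical, and `make_sample_lowered` is only a hand-written approximation of what many ABIs do with a large return value. The actual generated code depends on the platform and compiler and should be checked in the assembly output.

```cpp
#include <cassert>

// Hypothetical plain struct, assumed too large to be returned in registers.
struct Sample {
    double values[16];
};

// What the programmer writes: return by value.
Sample make_sample(double seed) {
    Sample s{};
    for (int i = 0; i < 16; ++i) {
        s.values[i] = seed + i;
    }
    return s;
}

// Roughly what many ABIs turn it into: the caller supplies the address
// where the result should be built, as a hidden pointer argument.
void make_sample_lowered(Sample* result, double seed) {
    for (int i = 0; i < 16; ++i) {
        result->values[i] = seed + i;
    }
}

// Primitive result: typically handed back in a register.
int add_one(int x) {
    return x + 1;
}

// Primitive out parameter: `out` is typically an address, so the result
// is stored to memory through it unless the call is inlined.
void add_one_out(int x, int& out) {
    out = x + 1;
}

// Checks that each pair computes the same result.
bool lowering_demo() {
    const Sample a = make_sample(1.0);
    Sample b;
    make_sample_lowered(&b, 1.0);
    int r = 0;
    add_one_out(41, r);
    return a.values[15] == b.values[15] && add_one(41) == r;
}
```

For the struct, the by-value and by-pointer forms end up doing much the same work, which matches the expectation above that larger simple types cost about the same either way.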

For types that have non-trivial default constructors, copy constructors, etc., things get more complex. If the function is called multiple times, return values force the object to be reconstructed each time, whereas reference parameters may allow the data structure to be reused without being reconstructed. On the other hand, reference parameters force a (possibly unnecessary) construction before the function is called.
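
Here is a minimal sketch of that trade-off with std::vector, using the hypothetical functions `squares` and `squares_into`. It relies on the fact that `clear()` leaves a vector's capacity unchanged, so a buffer that is already large enough is refilled without a new allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return by value: every call constructs (and allocates) a new vector.
std::vector<int> squares(std::size_t n) {
    std::vector<int> result;
    result.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        result.push_back(static_cast<int>(i * i));
    }
    return result;
}

// Out parameter: the caller's vector is refilled in place.
void squares_into(std::vector<int>& out, std::size_t n) {
    out.clear();     // removes the elements but keeps the capacity
    out.reserve(n);  // does nothing if the capacity is already sufficient
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i * i));
    }
}

// Shows that the second call reuses the storage from the first.
bool buffer_is_reused() {
    std::vector<int> buffer;  // construction required before the first call
    squares_into(buffer, 100);
    const int* first_storage = buffer.data();
    squares_into(buffer, 50);  // fits in the existing storage
    return buffer.data() == first_storage &&
           buffer.size() == 50 &&
           buffer == squares(50);
}
```

The price of the second form is the default-constructed `buffer` before the first call, which is the possibly unnecessary construction mentioned above.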

plugwash
2

This pseudocode function:

not-void function-name () {
    do-something
    return value;
}

would be better used when the returned value does not require any further modification. The value is produced only inside function-name, and no other references to it are required.

This otherwise-identical pseudocode function:

void function-name (not-void& arg) {
    do-something
    arg = value;
}

would be useful if we have another function modifying the value of the same variable, like the one below, and we need to keep the changes made to the variable by either call.

void another-function-name (not-void& arg) {
    do-something
    arg = value;
}
Naman
1

Performance-wise, copies are generally more expensive, although the difference might be negligible for small objects. Also, your compiler might optimize a return copy into a move, making it equivalent to passing a reference.

I'd recommend not passing non-const references unless you have a good reason to (e.g. functions of the tryGet() sort). Otherwise, use the return value.
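
For reference, a function of the tryGet() sort might look like the sketch below. It assumes the common convention in which the return value reports success and the result is delivered through a non-const reference. The signature, the use of std::map, and the sample data are hypothetical, since the answer does not spell them out.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical tryGet-style lookup: returns true and writes to `out`
// if `key` is present, otherwise returns false and leaves `out` untouched.
bool tryGet(const std::map<std::string, int>& table,
            const std::string& key,
            int& out) {
    const auto it = table.find(key);
    if (it == table.end()) {
        return false;
    }
    out = it->second;
    return true;
}

// Exercises both the found and the not-found path.
bool try_get_demo() {
    const std::map<std::string, int> ports{{"http", 80}, {"https", 443}};
    int port = 0;
    const bool found = tryGet(ports, "https", port);
    int missing = -1;
    const bool found_missing = tryGet(ports, "ftp", missing);
    return found && port == 443 && !found_missing && missing == -1;
}
```

Here the return value is already used for the success flag, which is one reason such a function might hand its result back through a reference.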

If you want, you can measure the difference yourself, as others have already said. Run the test code a few million times for both versions and compare the results.

user
elnigno
  • I'd like to point out that **non-** `const` references will result in possibly more problems due to the implicit state of a reference. Whenever you change a reference, we must make sure that any other arguments' states are also not changed by the reference. – CinchBlue Nov 30 '15 at 09:34
  • In addition to what Vermillion said, the compiler will not optimize a return copy into a move: Return is defined in terms of move (copy is fallback). It might however completely elide the move. – MikeMB Nov 30 '15 at 20:11