47

Let's consider an object foo (which may be an int, a double, a custom struct, a class, whatever). My understanding is that passing foo by reference to a function (or just passing a pointer to foo) leads to higher performance since we avoid making a local copy (which could be expensive if foo is large).

However, from the answer here it seems that pointers on a 64-bit system can be expected in practice to have a size of 8 bytes, regardless of what's being pointed to. On my system, a float is 4 bytes. Does that mean that if foo is of type float, then it is more efficient to just pass foo by value rather than give a pointer to it (assuming no other constraints that would make using one more efficient than the other inside the function)?
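
To make the size comparison concrete, here is a minimal sketch. The function names are made up for illustration, and the sizes it prints are implementation-defined, so the 4 and 8 bytes mentioned above are typical rather than guaranteed:

```cpp
#include <cstdio>

// Hypothetical functions: the same computation with two parameter styles.
// constexpr is used only so the results can be checked at compile time.
constexpr float half_by_value(float foo) { return foo * 0.5f; }
constexpr float half_by_pointer(const float* foo) { return *foo * 0.5f; }

// Prints the sizes this implementation uses for a float and a pointer to float.
void print_sizes() {
    std::printf("sizeof(float)  = %zu\n", sizeof(float));
    std::printf("sizeof(float*) = %zu\n", sizeof(float*));
}
```

Calling print_sizes() on the target system shows how the two sizes actually compare there.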

space_voyager
  • 9
    You should measure it. The size of the thing being referenced/copied isn't the only thing that comes into play. – juanchopanza Oct 21 '16 at 21:30
  • http://stackoverflow.com/questions/21605579/how-true-is-want-speed-pass-by-value – Humam Helfawi Oct 21 '16 at 21:31
  • 5
    In short: It is almost always more efficient to pass native types (int, float, double) by value than by reference. Not only because a pointer is - in most cases - bigger or as big as the native datatype, but also because it is much harder for the optimizer to optimize reference parameters than value parameters. – MikeMB Oct 21 '16 at 22:07
  • 1
    This is unanswerable. The C++ standard says nothing about this cost. Different compilers have different optimizations. Any of these might be without cost. – Captain Giraffe Oct 21 '16 at 22:11

6 Answers

43

There is one thing nobody mentioned.

There is a certain GCC optimization called IPA SRA, that replaces "pass by reference" with "pass by value" automatically: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html (-fipa-sra)

This is most likely done for scalar types (e.g. int, double, etc.) that do not have non-default copy semantics and can fit into CPU registers.

This makes

void func(const int &f);

probably as fast (and as space-efficient) as

void func(int f);

So with this optimization enabled, using references for small types should be as fast as passing them by value.
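
One way to check this on your own compiler is to put both variants in one translation unit and compare the generated assembly, for example with `g++ -O2 -S`. The function names below are invented for illustration. As far as I understand it, IPA SRA changes a function's signature, so it can only apply when the compiler sees every caller (hence `static` here). Functions this small are also likely to be inlined, which hides any difference unless inlining is disabled, for example with `-fno-inline`.

```cpp
// Hypothetical example functions. `static` gives them internal linkage so the
// compiler can see every call site in this translation unit. constexpr is
// here only so the results can be checked at compile time.
static constexpr int twice_by_reference(const int& f) { return 2 * f; }
static constexpr int twice_by_value(int f) { return 2 * f; }

// Externally visible callers, so the functions above are actually used.
int call_by_reference(int x) { return twice_by_reference(x); }
int call_by_value(int x) { return twice_by_value(x); }
```

Whether the two callers end up with identical code depends on the compiler version and flags, so treat the assembly you get as the evidence rather than this sketch.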

On the other hand, passing (for example) std::string by value cannot be optimized to by-reference speed, because custom copy semantics are involved.

From what I understand, using pass by reference for everything should never be slower than manually picking what to pass by value and what to pass by reference.

This is especially useful for templates:

template<class T>
void f(const T&)
{
    // Something
}

is always optimal
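
For comparison, "manually picking" the parameter type in generic code usually means something like the following sketch. The alias `param_t` and the function `size_in_bytes` are hypothetical names, and the rule used (scalars by value, everything else by const reference) is only one possible assumption, not something the standard library provides:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical helper: T itself for scalar types, const T& otherwise.
template <class T>
using param_t =
    typename std::conditional<std::is_scalar<T>::value, T, const T&>::type;

// T cannot be deduced through param_t, so callers must spell it out:
// size_in_bytes<int>(42) or size_in_bytes<std::string>(s).
template <class T>
constexpr std::size_t size_in_bytes(param_t<T> value) {
    return sizeof(value);  // sizeof of a reference yields the referenced type's size
}
```

The plain `const T&` version avoids this extra machinery and keeps T deducible, which is part of its appeal.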

peku33
  • Does this optimization apply to forwarding references? If so, wouldn't it follow that `template<class T> void f(T&&) { ... }` is more optimal in the general case? – Christopher Mauer Feb 15 '21 at 19:17
  • Since this isn't tagged GCC, are there equivalent optimizations for other compilers (most importantly Clang and Visual C++)? – Spencer Nov 24 '21 at 14:17
  • Hm, I don't see it working https://gcc.godbolt.org/z/rz9TEdbzd And it's a little strange in a sense this optimization can only work correctly if all code using this declaration is compiled with this optimization – Mikhail Sep 05 '22 at 16:06
39

It depends on what you mean by "cost", and properties of the host system (hardware, operating system) with respect to operations.

If your cost measure is memory usage, then the calculation of cost is obvious - add up the sizes of whatever is being copied.

If your measure is execution speed (or "efficiency"), then the game is different. Hardware (and the operating systems and compilers that target it) tends to be optimised for copying and operating on values of particular sizes, by virtue of dedicated circuits (machine registers, and how they are used).

It is common, for example, for a machine to have an architecture (machine registers, memory architecture, etc.) which results in a "sweet spot": copying variables of some size is most "efficient", but copying larger OR SMALLER variables is less so. Larger variables will cost more to copy, because there may be a need to do multiple copies of smaller chunks. Smaller ones may also cost more, because the compiler needs to copy the smaller value into a larger variable (or register), do the operations on it, then copy the value back.
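
C++ mirrors this at the language level: arithmetic on types narrower than int is performed after promoting the operands to int, and the result has to be converted back if a narrow type is wanted. The sketch below only demonstrates the language rule (the function `add_shorts` is a made-up name); it says nothing about how many cycles any particular CPU spends on it:

```cpp
#include <type_traits>

// Adding two shorts yields an int: both operands are promoted first.
static_assert(std::is_same<decltype(short{1} + short{2}), int>::value,
              "short + short has type int");

// Hypothetical helper: the sum is computed as an int and then narrowed back.
constexpr short add_shorts(short a, short b) {
    return static_cast<short>(a + b);
}
```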

Examples with floating point include some Cray supercomputers, which natively support double precision floating point (aka double in C++), and all operations on single precision (aka float in C++) are emulated in software. Some older 32-bit x86 CPUs also worked internally with 32-bit integers, and operations on 16-bit integers required more clock cycles due to translation to/from 32-bit (this is not true with more modern 32-bit or 64-bit x86 processors, as they allow copying 16-bit integers to/from 32-bit registers, and operating on them, with fewer such penalties).

It is a bit of a no-brainer that copying a very large structure by value will be less efficient than creating and copying its address. But, because of factors like the above, the cross-over point between "best to copy something of that size by value" and "best to pass its address" is less clear.

Pointers and references tend to be implemented in a similar manner (e.g. pass by reference can be implemented in the same way as passing a pointer) but that is not guaranteed.

The only way to be sure is to measure it. And realise that the measurements will vary between systems.
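
A minimal timing sketch along these lines is shown below. All names are hypothetical, and micro-benchmarks like this are easy to get wrong: at higher optimization levels the compiler may inline both variants into identical code, so the numbers are only meaningful for the exact build and machine they were taken on.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical functions under test. constexpr is used only so their results
// can also be checked at compile time.
constexpr float scale_by_value(float x) { return x * 1.5f; }
constexpr float scale_by_reference(const float& x) { return x * 1.5f; }

// Calls fn `iterations` times and returns the elapsed time in nanoseconds.
template <class Fn>
long long time_ns(Fn fn, int iterations) {
    volatile float sink = 0.0f;  // keeps the optimizer from discarding the work
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = fn(static_cast<float>(i));
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Prints one timing per variant; run it several times and compare.
void print_comparison() {
    const int n = 10000000;
    std::printf("by value:     %lld ns\n",
                time_ns([](float x) { return scale_by_value(x); }, n));
    std::printf("by reference: %lld ns\n",
                time_ns([](float x) { return scale_by_reference(x); }, n));
}
```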

Peter
  • 4
    Do you know an actual example of an architecture, where passing a smaller type(e.g. a char) is more expensive than passing a bigger type (like an int or pointer)? – MikeMB Oct 21 '16 at 22:10
  • Yeah, okay, a couple of examples added. – Peter Oct 21 '16 at 22:36
  • Thanks, but is any of those examples relevant for the question of pass by pointer/reference vs pass by value? After all, it is not about passing a float vs passing a double. – MikeMB Oct 21 '16 at 22:56
  • 1
    The term "operations" includes, but is not limited to, copying a value. The point is that passing something smaller is not necessarily more "efficient" than passing something larger. Which is generally the type of efficiency argument cited for passing a pointer (or a reference) versus value. – Peter Oct 21 '16 at 23:10
  • The answer is not correct: On a 64 bit system even the smaller types are always aligned to 64 bits in memory. That means that a single char, for instance, is not stored at an *arbitrary* byte address but always at multiples of 8 bytes. Thus dealing with smaller types is **not** more expensive than dealing with 64 bit types. They have the same cost on a 64 bit system. – mschoenebeck Feb 17 '23 at 00:34
  • @mschoenebeck There are more systems in the real world than the 64 bit systems you mention. On other systems, what is said is true. And there is already a note in this answer about systems where copying smaller types is not more expensive - which includes your 64 bit systems. – Peter Feb 17 '23 at 00:41
  • @Peter can you name one such architecture? Because this is certainly not the case for any x86 or arm platform I have ever worked with. So for 99.9% of all machines there is no additional cost to deal with primitives smaller than the bus width. – mschoenebeck Feb 17 '23 at 15:59
  • @mschoenebeck 80386 were among the early 32-bit CPUs I alluded to in my answer. I'm not sure offhand which later generation IA-32 processor was the first to alleviate those penalties. – Peter Feb 18 '23 at 01:57
8

You must test any given scenario where performance is absolutely critical, but be very careful about trying to force the compiler to generate code in a specific way.

The compiler's optimizer is allowed to rewrite your code in any way it chooses as long as the final result is provably the same, which can lead to some very nice optimizations.

Consider that passing a float by value requires making a copy of the float, but under the right conditions, passing a float by reference could allow storing the original float in a CPU floating-point register and treating that register as the "reference" parameter to the function. By contrast, if you pass a copy, the compiler has to find a place to store the copy in order to preserve the contents of the register, or even worse, it may not be able to use a register at all because of the need to preserve the original (this is especially true in recursive functions!).

This difference is also important if you are passing the reference to a function that could be inlined, where the reference may reduce the cost of inlining since the compiler doesn't have to guarantee that a copied parameter cannot modify the original.

The more a language allows you to focus on describing what you want done rather than how you want it done, the more the compiler is able to find creative ways of doing the hard work for you. In C++ especially, it is generally best not to worry about performance, and instead focus on describing what you want as clearly and simply as possible. By trying to describe how you want the work done, you will just as often prevent the compiler from doing its job of optimizing your code for you.

Matt Jordan
  • 2
    Usually it is the other way round: When you pass a parameter by reference/pointer, then - in practice - that parameter always has to be written to memory, whereas passing it by value sometimes allows keeping the data in registers. – MikeMB Oct 21 '16 at 22:16
  • @MikeMB - this is not the case in the scenario I presented above, where the original copy is stored in a register; passing by value requires a different copy in order to preserve the contents of the original, so either an additional register must be used if one is available, or the entire register optimization has to be unrolled into memory due to too few registers. By contrast, passing by reference could allow the compiler to share the same register across both pieces of code (especially if the function is inlined). I don't claim this is a common scenario, but it is certainly possible. – Matt Jordan Nov 07 '16 at 20:37
  • 6
    Assuming no function inlining takes place. Then pass by reference means - on the calling conventions I'm aware of - that a pointer to the original memory location **HAS** to be passed to the function and that requires the value to be actually stored in memory, as a pointer can't point to a register. When passing by value, you might have to copy the value from one register to another (not if the value isn't used after the function call) but you don't have to store it in memory. – MikeMB Nov 07 '16 at 22:26
7

Does that mean that if foo is of type float, then it is more efficient to just pass foo by value?

Passing a float by value could be more efficient. I would expect it to be more efficient, partly because of what you said: a float is smaller than a pointer on a system like the one you describe. But in addition, when you pass the pointer, you still need to dereference it to get the value within the function. The indirection added by the pointer could have a significant effect on the performance.

The efficiency difference could be negligible. In particular, if the function can be inlined and optimization is enabled, there is likely not going to be any difference.

You can find out if there is any performance gain from passing the float by value in your case by measuring. You can measure the efficiency with a profiling tool.

You may substitute pointer with reference and the answer will still apply equally well.

Is there some sort of overhead in using a reference, the way that there is when a pointer must be dereferenced?

Yes. It is likely that a reference has exactly the same performance characteristics as a pointer does. If it is possible to write a semantically equivalent program using either references or pointers, both are probably going to generate identical assembly.
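
As an illustration, here are two semantically equivalent functions, one taking a reference and one taking a pointer (the names are invented for this sketch). Comparing your compiler's assembly listing for `use_ref` and `use_ptr`, for example from `g++ -O2 -S`, is a quick way to check whether they compile to the same instructions on your platform; that outcome is likely but not guaranteed:

```cpp
// Hypothetical pair of equivalent functions. constexpr is used only so the
// results can be checked at compile time.
constexpr int add_one_ref(const int& n) { return n + 1; }
constexpr int add_one_ptr(const int* n) { return *n + 1; }  // n must not be null

// Ordinary externally visible wrappers, so both variants appear in the
// assembly output.
int use_ref(const int& n) { return add_one_ref(n); }
int use_ptr(const int* n) { return add_one_ptr(n); }
```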


If passing a small object by pointer were faster than copying it, then surely that would be true for any object of the same size, wouldn't you agree? How about a pointer to a pointer, that's about the size of a pointer, right? (It's exactly the same size.) Oh, but pointers are objects too. So, if passing an object (such as a pointer) by pointer is faster than copying the object (the pointer), then passing a pointer to a pointer to a pointer to a pointer ... to a pointer would be faster than the program with fewer pointers that's still faster than the one that didn't use pointers... Perhaps we've found an infinite source of efficiency here :)

eerorika
1

Prefer passing by reference over passing pointers if you want optimized execution time and want to avoid random access. As for pass by reference versus pass by value, GCC optimizes your code so that small variables that do not need to be changed are passed by value.

Ilyes
1

Can't believe that no one brought up the correct answer yet.

On a 64 bit system passing 8 bytes or 4 bytes has exactly the same cost. The reason for this is that the data bus is 64 bit wide (which is 8 bytes) and thus even if you pass only 4 bytes - it doesn't make a difference for the machine: The data bus is 8 bytes wide.

The cost only increases if you want to move more than 64 bit. Everything equal or below 64 bits comes at the same number of clock cycles.

mschoenebeck