35

Suppose I have a function that takes an argument of type T. It does not mutate the argument, so I have the choice of passing it by const reference (const T&) or by value (T):

void foo(T t){ ... }
void foo(const T& t){ ... }

Is there a rule of thumb for how big T needs to be before passing by const reference becomes cheaper than passing by value? E.g., suppose I know that sizeof(T) == 24. Should I use const reference or value?

I assume that the copy constructor of T is trivial. Otherwise, the answer to the question depends on the complexity of the copy constructor, of course.
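For concreteness, a type like the one I have in mind might look like this (the name is made up; any trivially copyable 24-byte struct would do):

```cpp
#include <type_traits>

// A hypothetical 24-byte T (name and layout are illustrative only):
// three doubles, trivially copyable, no user-defined copy constructor.
struct Vec3d {
    double x, y, z;   // 3 * 8 bytes = 24 bytes on typical platforms
};

static_assert(sizeof(Vec3d) == 24, "assumes 8-byte double, no padding");
static_assert(std::is_trivially_copyable<Vec3d>::value,
              "the copy constructor is trivial");
```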

I have already looked for similar questions and stumbled upon this one:

template pass by value or const reference or...?

However, the accepted answer (https://stackoverflow.com/a/4876937/1408611) does not go into any detail; it merely states:

If you expect T always to be a numeric type or a type that is very cheap to copy, then you can take the argument by value.

So it does not solve my question but rather rephrases it: How small must a type be to be "very cheap to copy"?

gexicide
  • technically, if it is bigger than `sizeof(T*)`, then it is already inefficient to pass by value (excluding things like creating copies inside the called function) – Creris Oct 15 '14 at 16:34
  • 7
    @TheOne Not at all. You have to consider the cost of loading the parts of the argument that you use through the pointer you get. This gets messy very quickly, so the best way is to try it out. But for example, if your sole argument is a POD pair of two integers, passing by value probably (depending on the ABI) just uses two registers, while passing by pointer wastes an available register and requires additional instructions inside the function. –  Oct 15 '14 at 16:38
  • Also consider passing a member of a structure / class when only a member is necessary. – Thomas Matthews Oct 15 '14 at 16:39
  • 2
    `sizeof(std::vector)` is 12 in some 32bit implementations and 24 in some 64bit implementations, but you don't want to pass it by value, as the *contents* might be much more than that, and even if that's not the case, copy construction requires a memory allocation for any non-empty vector... On another note, this depends a lot on the platform that you are targeting, you should look at the ABI to understand how the function call is implemented. – David Rodríguez - dribeas Oct 15 '14 at 18:01
  • 4
    A related reading: [Want speed? Pass by value.](http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/) And of course a Google search of the title to read all the supporting points and counterpoints to that article. – Angew is no longer proud of SO Oct 15 '14 at 20:45
  • 1
    @Angew A good article, but it specifically discusses the case when you *need* a copy (because you're going to mutate it). –  Oct 16 '14 at 06:21
  • 2
    @DavidRodríguez-dribeas: I specified that I assume a trivial copy constructor. `std::vector` would of course be an example with a very untrivial copy constructor, so I would never even think about copying a `vector` unless I really need a copy. – gexicide Oct 16 '14 at 08:41

7 Answers

28

If you have reason to suspect there is a worthwhile performance gain to be had, cut it out with the rules of thumb and measure. The purpose of the advice you quote is that you don't copy great amounts of data for no reason, but you don't jeopardize optimizations by making everything a reference either. If something is on the edge between "clearly cheap to copy" and "clearly expensive to copy", you can afford either option. If you must have the decision taken away from you, flip a coin.

A type is cheap to copy if it has no funky copy constructor and its sizeof is small. There is no hard number for "small" that's optimal, not even on a per-platform basis since it depends very much on the calling code and the function itself. Just go by your gut feeling. One, two, three words are small. Ten, who knows. A 4x4 matrix is not small.
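As a hedged sketch of that gut feeling (the threshold of three pointers' worth is an arbitrary illustration, not a measured constant), the "no funky copy constructor and small sizeof" check could be spelled:

```cpp
#include <type_traits>

// Illustrative sketch only: "small" here is a gut-feeling threshold
// (three pointers' worth), not a measured constant.
template <typename T>
constexpr bool probably_cheap_to_copy =
    std::is_trivially_copyable<T>::value && sizeof(T) <= 3 * sizeof(void*);

static_assert(probably_cheap_to_copy<int>, "one word: small");
static_assert(!probably_cheap_to_copy<double[16]>, "a 4x4 matrix: not small");
```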

  • Also depends on how often this function is called. Another reason to measure. – Karthik T Oct 16 '14 at 03:24
  • 4
    *A type is cheap to copy if it has no funky copy constructor and its sizeof is small.* - this could be misleading. The caveat here is that a class could have a "funky copy constructor" without the programmer knowing about it, e.g. `struct A { std::list<int> x; };`. Such a class could have a small `sizeof` and no explicit copy constructor, but it could still be very expensive to copy. – quant Dec 03 '14 at 21:35
14

Passing a value instead of a const reference has the advantage that the compiler knows the value isn't going to change. "const int& x" doesn't mean the value cannot change; it only means that your code is not allowed to change it by using the identifier x (without some cast that the compiler would notice). Take this awful but perfectly legal example:

static int someValue;

void g (int i)
{
    --someValue;
}

void f (const int& x)
{
    for (int i = 0; i < x; ++i)
        g (i);
}

int main (void)
{
    someValue = 100;
    f (someValue);
    return 0;
}

Inside function f, x isn't actually constant! It changes every time that g (i) is called, so the loop only runs from 0 to 49! And since the compiler generally doesn't know whether you wrote awful code like this, it must assume that x might change when g is called. As a result, you can expect the code to be slower than if you had used "int x".

The same is obviously true for many objects as well that might be passed by reference. For example, if you pass an object by const&, and the object has a member that is int or unsigned int, then any assignment using a char*, int*, or unsigned int* might change that member, unless the compiler can prove otherwise. Passed by value, the proof is much easier for the compiler.
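A hedged sketch building on the example above (an illustration of the aliasing point, not a universal recommendation): if you are stuck with a const reference signature, copying into a local restores the compiler's guarantee, because the local cannot alias anything:

```cpp
#include <cassert>

static int someValue;
static int calls = 0;

void g(int)                     // mutates someValue behind f's back
{
    --someValue;
    ++calls;
}

void f_byref(const int& x)
{
    const int n = x;            // snapshot: the loop bound can no longer change
    for (int i = 0; i < n; ++i)
        g(i);
}
```

With someValue set to 100, f_byref(someValue) calls g a full 100 times, exactly as a by-value parameter would, instead of stopping at 50.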

gnasher729
  • If I were designing a language/framework, I would define a parameter/argument semantic for cases where neither the caller nor the called method was allowed to change the passed item; pass-by-value and pass-by-reference would be equivalent in that case, so a compiler could automatically pick whichever was more efficient. Alternatively, a function could define multiple entry points based upon whether the passed-in object might be modified on the call side or whether the caller would need the object after the call, and have the called method perform a copy if needed. – supercat Oct 15 '14 at 22:27
  • D has just that (immutable data). So do many other languages. – Demi Oct 16 '14 at 00:02
  • Does it happen only with "awful code" like this, (e.g. using global variables), or does the compiler understand variable scopes and make educated guess about `const`ness in normal cases? – rr- Oct 16 '14 at 08:27
13

The most appropriate rule of thumb in my opinion is pass by reference when :

sizeof(T) >= sizeof(T*)

The idea behind this is that when you take by reference, at worst your compiler might implement this using a pointer.

This of course doesn't take into account the complexity of your copy constructor and move semantics and all the hell that can be created around your object life cycle.

Also, if you don't care about micro-optimisations, you can pass everything by const reference: on most machines pointers are 4 or 8 bytes, very few types are smaller than that, and even in that case you would only lose a copy of a few (fewer than 8) bytes and some indirections, which in the modern world is most likely not gonna be your bottleneck :)
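If one wanted to encode this rule of thumb in code (purely illustrative; the name `param_t` is made up), a conditional alias does it:

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper (the name param_t is made up): pass by value when T
// fits in a pointer and is trivially copyable, otherwise by const reference.
template <typename T>
using param_t = typename std::conditional<
    sizeof(T) <= sizeof(void*) && std::is_trivially_copyable<T>::value,
    T, const T&>::type;

void foo_int(param_t<int> i);         // resolves to: void foo_int(int)
void foo_str(param_t<std::string> s); // resolves to: void foo_str(const std::string&)
```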

Drax
  • I wonder if cache locality could somehow affect performance here? – Basilevs Oct 15 '14 at 16:42
  • 15
    I don't think this rule has any basis in reality. You have to consider the cost of loading the parts of the argument that you use through the pointer you get. Reasoning about this gets messy very quickly, so the best way is to try it out. If, for example, the sole parameter is a pair of two integers, passing by value probably (depending on the ABI) just uses two registers, while passing by pointer wastes an available register and requires additional instructions inside the function. I'd err on the side of caution, i.e. use a small multiple of the pointer size. –  Oct 15 '14 at 16:43
  • @Basilevs probably, but without any measure we can't really discuss that; from instinct (which is officially a bad way of reasoning about performance :)) i'd say it is not "that" big, but again it should be checked – Drax Oct 15 '14 at 16:45
  • @Basilevs If all else (regalloc, inlining, other optimizations) fails, a reference would point at the original data in the stack, and a copy made for pass-by-value would also be on the stack. The stack, by its nature as everyone's scratchpad, is virtually always in cache. Even if it wasn't in cache beforehand, writing the copy there just before the call brought it into cache. So I'd say we can safely neglect caching. –  Oct 15 '14 at 16:48
  • @delnan I agree with you but since any given number would be implementation dependent and the OP asked for a rule of thumb, i think that's the best rule to keep it simple and not biased, also that rule allows me to stress that most of the time, you don't care that much about the details :) – Drax Oct 15 '14 at 16:48
  • @Drax No specific number is necessary. Does it feel big? An int isn't big. Three floats aren't big. A 4x4 matrix is big. Anything in between, ask your gut. (BTW your rule is also slightly implementation dependent, since it depends on the sizes of types and on padding.) –  Oct 15 '14 at 16:51
  • 2
    @delnan Well for some reason i didn't feel like answering a rule of thumb question by "Ask your gut" ^^. Am not sure how padding make this rule implementation dependent, but even if that is the case i still think it is the most appropriate. – Drax Oct 15 '14 at 16:57
  • 1
    @Drax Good point. I refined my argument and posted it as an answer. I hope it's a little more productive than "ask your gut" now. –  Oct 15 '14 at 17:34
4

I believe I would choose to pass by value whenever possible (that is: when the semantics dictate that I do not need the actual object to work on). I would trust the compiler to perform the appropriate moves and copy-elision.

After my code is semantically correct, I would profile it to see if I am making any unnecessary copies; I would modify those accordingly.

I believe that this approach would help me focus on the most important part of my software: correctness. And I would not get in the way of the compiler performing optimizations (I know I cannot beat it).

Having said that, references are nominally implemented as pointers. So in a vacuum, without considering semantics, copy elision, move semantics, and the like, it would be more "efficient" to pass by pointer/reference anything whose size is larger than a pointer.
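A minimal sketch of this "trust the compiler" style (the class and names are made up; assumes C++11): take the argument by value and move it into place, so callers passing an rvalue get a move and the copy may be elided entirely:

```cpp
#include <string>
#include <utility>

// Sketch of a pass-by-value sink parameter: callers passing an rvalue
// get a move; for temporaries, the copy may be elided outright.
class Widget {
    std::string name_;
public:
    explicit Widget(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
};
```

Calling `Widget w("hello");` constructs the temporary directly into the parameter (or moves it), so no extra deep copy of the string is made.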

Escualo
  • 1
    As I explained in multiple comments before, no, being larger than a pointer does not imply pass by reference being more efficient, not even in your vacuum. –  Oct 15 '14 at 17:32
3

For an abstract "T" in an abstract "C++", the rule of thumb would be to use the form that better reflects the intention, which for an argument that isn't modified is almost always "pass by value". Besides, concrete real-world compilers expect exactly such an abstract description and are going to pass your T in the most efficient way, regardless of how you spell it in the source.

Or, to talk about naive compilation and composition, "very cheap to copy" is "anything you can load in a single register". It doesn't get any cheaper than that, really.

3

The guys here are correct that most of the time it doesn't matter, when the sizeof of the struct/class is small and there's no fancy copy constructor. However, that's no excuse for being ignorant. Here's what happens in some modern ABIs, like x64: on that platform, a good threshold for your rule of thumb is to pass by value where the type is a POD type and sizeof() <= 16, since it will get passed in two registers. Rules of thumb are good; they keep you from wasting time and limited brainpower on little decisions like this that don't move the needle.

However, sometimes it will matter. When you've determined with a profiler that you have one of those cases (and NOT before unless it's super obvious!), then you need to understand the low level details - not hear comforting platitudes about how it doesn't matter, which SO is full of. Some things to keep in mind:

  • Passing by value tells a compiler that the object doesn't change. There are evil kinds of code, involving threads or globals where the thing pointed to by the const reference is being changed somewhere else. Although surely you don't write evil code, the compiler may have a tough time proving that, and have to generate very conservative code as a result.
  • If there are too many arguments to pass by register, then it doesn't matter what the sizeof is, the object will get passed on the stack.
  • An object that is small and POD today may grow and gain an expensive copy constructor tomorrow. The person who adds those things probably did not check if it was being passed around by reference or by value, so code that used to perform great may suddenly chug. Passing by const reference is a safer option if you work on a team without comprehensive performance regression tests. You will get bitten by this sooner or later.
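To make the 16-byte threshold concrete (the struct and function here are hypothetical; the two-register behavior is specific to the System V x86-64 ABI, and other ABIs differ):

```cpp
#include <cstdint>

// Illustrative for the System V x86-64 ABI: a 16-byte POD like this is
// passed in two registers, so taking it by value costs no memory traffic.
struct Extent {
    std::int64_t offset;
    std::int64_t length;
};

// Hypothetical signature for comparison; by value, e arrives in registers.
inline bool contains(Extent e, std::int64_t pos)
{
    return pos >= e.offset && pos < e.offset + e.length;
}
```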
Eloff
2

If you're going to use a "rule of thumb" for by-value vs. by-const-reference, then do this:

  • pick ONE approach and use it everywhere
  • agree upon which one with all your coworkers
  • only later, in the "hand-tuning performance" phase, start changing things
  • and then, only change them if you see a measurable improvement
Alex Shroyer
  • 6
    Rules of thumb don't supersede common sense. If it's BLATANTLY OBVIOUS that one approach is better in some situation, there's no need to defer decision making until after a test; and no need to test. Use rules of thumb for minor decisions where the "waste of time" factor is measured in development time rather than CPU time. – Alex Shroyer Oct 15 '14 at 20:24