15

Have a look at this hypothetical header file:

template <class T>
class HungryHippo {
public:
    void ingest(const T& object);
private:
    ...
};

Now, for a HungryHippo<string> it makes sense that you would want to ingest references to the strings -- copying a string might be very expensive! But for a HungryHippo<int> it makes way less sense. Passing an int directly can be really cheap (most compilers will do it in a register), but passing a reference to an int is an extra needless level of indirection. This all applies to returning values as well.

Is there some way to suggest to the compiler "hey, I'm not going to modify the argument, so you decide whether to pass by value or by reference, depending on what you think is better"?

Some things that may be relevant:

  • I can fake this effect manually by writing template <class T, bool PassByValue> class HungryHippo and then specializing on PassByValue. If I wanted to get really fancy, I could even infer PassByValue based on sizeof(T) and std::is_trivially_copyable<T>. Either way, this is a lot of extra work when the implementations are going to look pretty much the same, and I suspect the compiler can do a much better job of deciding whether to pass by value than I can.
  • The libc++ project seems to solve this by inlining a lot of functions so the compiler can make the choice one level up, but in this case let's say the implementation of ingest is fairly complicated and not worth inlining. (As explained in the comments, member functions of a class template already have inline-style linkage: they may be defined in every translation unit that uses them, whether or not the compiler actually inlines the calls.)
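For illustration, the manual workaround from the first bullet could be sketched with an alias template instead of a bool parameter and a specialization. This is only a sketch: param_t, the sizeof(void*) threshold, and the stand-in body of ingest (with its last_ member) are assumptions made for the example, not part of the original question.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: choose the parameter type from the size and
// copy cost of T. The sizeof(void*) cut-off is an assumed heuristic.
template <class T>
using param_t = typename std::conditional<
    std::is_trivially_copyable<T>::value && sizeof(T) <= sizeof(void*),
    const T,   // small and trivially copyable: pass by value
    const T&   // everything else: pass by const reference
>::type;

template <class T>
class HungryHippo {
public:
    // Stand-in body for illustration only; the real ingest is elided
    // in the question.
    void ingest(param_t<T> object) { last_ = object; }
    const T& last() const { return last_; }
private:
    T last_{};
};

// Compile-time checks of the selection rule.
static_assert(std::is_same<param_t<int>, const int>::value,
              "int is passed by value");
static_assert(std::is_same<param_t<std::string>, const std::string&>::value,
              "std::string is passed by const reference");
```

With this sketch, HungryHippo<int>::ingest takes its argument by value and HungryHippo<std::string>::ingest takes a const reference, while the body of ingest is written only once.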
Calvin
  • 4
    Something to consider: this sounds like a good case of premature optimization. Have you run into performance issues because of this architecture? If not, you should probably just do whatever makes sense semantically. – Adam Maras Nov 16 '12 at 00:09
  • 1
    For integers, passing by reference is nearly as performant as passing by value, so const reference is good for both cases. – Sergey Kalinichenko Nov 16 '12 at 00:09
  • 1
    "in this case let's say the implementation of ingest is fairly complicated and not worth inlining." You don't have a choice in this case. Since HungryHippo is a template, it must be inlined. – cgmb Nov 16 '12 at 00:09
  • @AdamMaras I have no need for an answer. This is a purely academic exercise. – Calvin Nov 16 '12 at 00:10
  • Though it doesn't directly address your question, I would be surprised if const-reference passed primitives aren't just copied when they're passed (behind the scenes of course). You're likely just overthinking this and should just pass everything by const reference. If you don't need a copy of the parameter, what is the advantage of *not* passing by const ref? – Corbin Nov 16 '12 at 00:10
  • 1
    @Slavik81 That is not true at all. – Calvin Nov 16 '12 at 00:11
  • @Corbin Prepare to be surprised: http://stackoverflow.com/questions/2043974/do-c-compilers-optimize-pass-by-const-reference-pod-parameters-into-pass-by-co – Calvin Nov 16 '12 at 00:12
  • @Calvin I don't mean primitive as in POD. I'm not sure the proper term for it, but I mean primitive as built in, word-sized types (or less than a word) like int, float, char, etc. Passing by const reference is typically implemented by passing a pointer. That means that the pointer must be copied to a register. Why not just copy an int to a register? It's basically the same thing. I'd be a bit surprised if, for this reason, compilers don't just copy any 'primitive' type. It would be better than having to load based on a pointer. – Corbin Nov 16 '12 at 00:16
  • @Calvin Also, I just realized that my first comment was super rambly and was saying two different things. In short, I think you should just copy everything by const reference unless you need a copy of it. The other part of that comment was just a poorly worded, semi-related side note that primitives are probably copied anyway. – Corbin Nov 16 '12 at 00:18
  • @Corbin You are probably right about just always passing by reference. That sounds like good practice to me. Regardless, the original question still stands. And regarding the compiler just copying primitives: to keep binary compatibility between object files the compiler cannot optimize the reference away. This has been the case in all my tests on gcc and clang (although I read somewhere that the Microsoft compiler sometimes makes that optimization, incorrectly). – Calvin Nov 16 '12 at 00:20
  • 1
    I wonder why compilers (and ABI specifications) don't just treat `const int &` the same as `int`. (And the same for all built-in types where it's more efficient to pass by value than by const reference.) Would it break anything? – David Schwartz Nov 16 '12 at 00:22
  • 1
    @Calvin How can you use HungryHippo::ingest for an arbitrary type if the template definition is not available in the same translation unit? – cgmb Nov 16 '12 at 00:22
  • @DavidSchwartz It still would. Let's say f takes a `const int&` and a `int*`, writes "5" to the target of the `int*`, and then prints the `const int&`. Now I say `x=0` and pass it `x` and `&x`. Expected output is 5, but if you optimize it that way the output is 0. The "const" on a reference says "I won't change the value through this variable" but it doesn't say I can't write to it through something else. – Calvin Nov 16 '12 at 00:25
  • @Slavik81 AFAIK in gcc and clang, `HungryHippo<int>::ingest` and `HungryHippo<string>::ingest` will each be generated as separate symbols in all the translation units they are referenced from. If they appear in multiple object files then at link time they are merged. The names are mangled, and you get two different non-inline function calls in the final output. – Calvin Nov 16 '12 at 00:27
  • 1
    @Calvin What you've described is identical to what might be done for a function marked as 'inline'. You can imagine every template function definition is implicitly marked with the 'inline' keyword, unless it has been fully specialized. Whether the compiler actually performs inline substitution is something entirely separate from whether the function is inline. – cgmb Nov 16 '12 at 01:21
  • @Slavik81 I see! As usual, I was mistaken. Here it seems I was wrong about the definition of inline in C++; it is not just a hint to the compiler about whether to inline the function, it is also a linker visibility. And of course, it is the default linker visibility for template functions. There's more info here: http://stackoverflow.com/questions/10535667/does-it-make-any-sense-to-use-inline-keyword-with-templates – Calvin Nov 16 '12 at 01:56
  • 1
    It's not even really a hint to the compiler. It's entirely a linker keyword now, because the optimizer completely ignores whether you say "inline" or not in deciding what to inline. Even compilers with a __force_inline kind of keyword don't always listen to you anymore. – David Stone Nov 16 '12 at 18:21
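The aliasing argument from the comments above (a function that takes a `const int&` and an `int*`, writes 5 through the pointer, then reads the reference) can be sketched as follows. The function names are hypothetical, and the sketch returns the value instead of printing it so that the result can be checked; constexpr is used only to allow compile-time checks and needs C++14 or later.

```cpp
// Reads through a const reference after writing through a pointer that
// may alias the same object.
constexpr int read_after_write_ref(const int& value, int* target) {
    *target = 5;    // modifies the object via the pointer
    return value;   // re-reads the referenced object
}

// Same body, but the first parameter is a by-value copy.
constexpr int read_after_write_val(const int value, int* target) {
    *target = 5;    // modifies the caller's object
    return value;   // the private copy is unaffected
}

// Pass x and &x, as in the comment: both parameters refer to x.
constexpr int demo_ref() { int x = 0; return read_after_write_ref(x, &x); }
constexpr int demo_val() { int x = 0; return read_after_write_val(x, &x); }

static_assert(demo_ref() == 5, "the reference observes the write");
static_assert(demo_val() == 0, "the copy does not observe the write");
```

Because the two versions give different results for the same call, a compiler cannot silently turn the `const int&` parameter into an `int` unless it can prove that no aliasing occurs, for example after inlining the call.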

2 Answers

8

Boost's call_traits utility (header <boost/call_traits.hpp>) deals with exactly this issue.

Specifically, the documentation for the call_traits<T>::param_type member type includes the following description:

If T is a small built in type or a pointer, then param_type is defined as T const, instead of T const&. This can improve the ability of the compiler to optimize loops in the body of the function if they depend upon the passed parameter, the semantics of the passed parameter is otherwise unchanged (requires partial specialization).

In your case, you could define ingest as follows:

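#include <boost/call_traits.hpp>

template <class T>
class HungryHippo {
public:
    // typename is required because param_type is a dependent type name
    void ingest(typename boost::call_traits<T>::param_type object);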
    // "object" will be passed-by-value for small 
    // built-in types, but passed as a const reference 
    // otherwise
private:
    ...
};

Whether this would actually make much of a difference in your actual code/compiler combination, I'm not sure. As always, you'd have to run some actual benchmarks and see what happens...

Darren Engwirda
  • 2
    As usual, the Boost people have thought of this. :D Looks like the short answer is "no, there's no way to coax the compiler to do this for you" and the long answer is "well, Boost devised a way to get around that problem in some cases." Thanks! – Calvin Nov 16 '12 at 17:13
0

While tricks such as Boost's call_traits<T>, mentioned in the other answer, do what they claim to do in this case, I think you're assuming that the compiler is not already making this optimization in the most important cases. It is trivial, after all. If you accept a const T& and sizeof(T) <= sizeof(void*), the invariants imposed by C++ reference semantics allow the compiler to simply substitute the value throughout your function body if it's a win. If it's not, your worst-case overhead is one pointer-to-arg dereference in the function prologue.

Matthew Hall
  • 2
    I don't believe your claim that "C++ reference semantics allow the compiler to simply substitute the value throughout your function body". Consider http://pastebin.com/xqyrQtxW ... By your rules the output would be 10, but by the _real_ rules the output _is_ 5. You *cannot* replace const X& with X. As I said in an earlier comment, "const" means you cannot change the value, not that the value cannot change. – Calvin Nov 16 '12 at 02:05
  • 1
    @Calvin Well, in fact Matthew Hall is right. A function can take X instead of const X&, but only under specific conditions, e.g. when you cannot take the address of X because it is an rvalue; you can still pass it to a function taking const X& and this works. See the example at https://ideone.com/6KCeve: there you cannot take the address of 20, and you cannot bind a non-const lvalue reference to it, as it is an rvalue. – DawidPi Feb 10 '16 at 15:57
  • 1
    Yes, there are cases where this optimization is legal. However, since it is not *always* legal (see my example), you will pretty much never see a compiler do it. More commonly this happens if the procedure is inlined and the reference can be eliminated after inlining. – Calvin Feb 10 '16 at 22:11