Void functions that take a reference vs functions that return a value in C++

Question

In C++, is it good practice to initialize a variable by passing a reference to it into an "initialization" function? Or, to put it another way, is it good practice to write functions that behave this way (i.e. update variables created somewhere else)? In my intro programming class (taught in Java), we were taught to write methods like this as static and to give them explicit return values. But I've noticed from looking at a few samples that some C++ programmers declare their variables with no explicit initialization, hand them off to some function, then proceed to use them in the program. Are there any advantages/drawbacks to either style? (I'm excluding purely OO stuff like member functions and variables from this question - this isn't just about using methods to update an object's state. I've seen this done outside of classes in C++.)

I wrote a few quick lines of code to illustrate what I mean. The first function genName() is the style I'm familiar with. The second, gen_name() is the kind I'm curious about.

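#include <iostream>
#include <string>

using namespace std;

// Style 1: build the value and return it.
string genName() {
    string s = "Jack";
    return s;
}

// Style 2: fill in a variable owned by the caller.
void gen_name(string& s) {
    s = "Jill";
}

int main(int argc, const char* argv[]) {
    string name1 = genName();

    string name2;
    gen_name(name2);

    cout << name1 << endl;
    cout << name2 << endl;

    return 0;
}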

IMHO your second example `gen_name(name2);` is not good practice, but this is just my opinion. — Basile Starynkevitch, Dec 07 '13 at 08:02

score 2 · Answer 1 · edited Dec 07 '13 at 08:39

The initialization-by-reference style used to be popular for initializing complex data types in C++98, which had no move constructors, at a time when return value optimization (RVO) was not yet ubiquitously implemented.

For example, a function that creates and returns a large vector would be frowned upon because it would in effect create a temporary vector, which would (lacking a compiler that reliably implements RVO) be copied to the target vector, along with all of its elements. This unnecessary local allocation and copying led some programmers and style guides to recommend the initialization-by-reference style everywhere. Modern C++ addresses this complaint with move constructors and std::move, so the initialization-by-reference pattern can be retired.
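As a rough illustration of the point above, here is a minimal sketch of returning a large container by value in C++11 or later. The names make_squares and take are hypothetical and not from the original answer. The returned vector is either constructed directly in the caller's variable (copy elision) or moved, so its elements are not copied one by one.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original answer): builds a vector
// of the first n squares and returns it by value.
std::vector<int> make_squares(std::size_t n) {
    std::vector<int> result;
    result.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        result.push_back(static_cast<int>(i * i));
    }
    return result;  // elided or moved in C++11 and later, not deep-copied
}

// Hypothetical helper: hands the buffer of an existing vector to the
// caller by moving it into the return value.
std::vector<int> take(std::vector<int>& source) {
    return std::move(source);
}
```

A caller can simply write `std::vector<int> v = make_squares(1000);` with no out-parameter needed.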

edited Dec 07 '13 at 08:39

JoeG

answered Dec 07 '13 at 08:06

user4815162342

This makes it sound like pass by reference is a good idea before C++11, which in most cases it isn't. – juanchopanza Dec 07 '13 at 08:18
Constructors have been a pain coming from Java. It's weird having to write a new constructor just to assign an object to a different variable. It should be the compiler's job to figure out how to do it. – AdamJames Dec 07 '13 at 08:24
@juanchopanza The answer doesn't claim that all pass-by-reference was preferred pre-C++11, only those where copying is a problem, which is typically for code that returns (potentially) large containers. – user4815162342 Dec 07 '13 at 08:25
It is not preferred even then. It's been a long time since RVO has worked on popular compilers such as g++, VS and clang. – juanchopanza Dec 07 '13 at 08:28
@juanchopanza Whether RVO worked would depend on optimization level and sometimes details of the compiler's [analysis of a concrete function](http://en.wikipedia.org/wiki/Return_value_optimization#Compiler_support). Because of this (and presumably support of legacy compilers) relying on RVO was in my experience often discouraged. – user4815162342 Dec 07 '13 at 13:05

mpark · Answer 2 · 2013-12-07T09:07:17.143

The second option used to be popular because copying objects such as std::string, std::map, etc. is expensive. Copying them means not only deep-copying the elements but also performing heap allocations, which can be costly.

Having said that, with C++11 a lot of this goes away thanks to move semantics, and it allows us to do a few things that we couldn't do before.

For example, if you want name to be a const object, you can initialize it from an immediately invoked lambda:

const std::string name = []() {
  std::string name;
  /* Fill in name. */
  return name;
}();

However, note that initialization by reference is still useful in some cases. Consider the following code:

for (int i = 0; i < N; ++i) {
  const std::string name = gen_name(i);
  /* Use name here. */
}  // for

It would be nice to keep the const when we know we won't modify name, but in terms of performance the following can be faster.

std::string name;
for (int i = 0; i < N; ++i) {
  gen_name(i, name);
  /* Use name here. */
}  // for

EDIT:

I point out that initialization by reference may be preferred in some cases because sometimes we can reuse a resource across loop iterations. In the above example, rather than constructing a new std::string instance on every iteration, which can lead to a heap allocation each time, we can allocate once at the beginning and keep reusing the same space.
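A minimal sketch of the buffer reuse described above. The answer only calls gen_name and never defines it, so both bodies below are assumptions, and the "user" prefix is made up for illustration.

```cpp
#include <string>

// Assumed implementation (not from the original answer): overwrite `out`
// in place so that its existing buffer can be reused across calls.
void gen_name(int i, std::string& out) {
    out.clear();  // drops the contents; implementations typically keep the buffer
    out += "user";
    out += std::to_string(i);
}

// By-value counterpart, shown for comparison: each call starts from a
// fresh std::string.
std::string gen_name(int i) {
    std::string out;
    gen_name(i, out);
    return out;
}
```

For short strings like these, most standard libraries avoid heap allocation anyway (small string optimization), so any gain only shows up with longer strings. It is worth measuring before choosing the out-parameter form for performance reasons.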

What's the difference between the two parts of your answer? One part says "it's good", the other part says "it's bad", but the circumstances in both seem the same. — anatolyg, Dec 07 '13 at 08:44
I edited the answer to include the explanation for that. I hope it helps :) — mpark, Dec 07 '13 at 09:08
The point of the lambda hack is using RVO to avoid creating a copy? Otherwise it's not obvious why the lambda is needed instead of the simpler `const std::string const_name = name;`. — user4815162342, Dec 09 '13 at 13:04
@user4815162342 well, I don't want to put the logic for building the string inside a top-level function if I'm only going to use the function once. I'd also rather not build up the string and assign it like so: `const std::string const_name = name;` because I don't want `name` and `const_name` both in scope. — mpark, Dec 09 '13 at 18:06

Joop Eggen · Accepted Answer · 2013-12-07T08:36:13.340

0

Passing by reference makes the code less readable, and code is read more often than it is written.

The main reason this matters in C++ and not in Java is that in Java every object is accessed through a reference, so returning a result is cheap. In C++ an entire struct may be returned by value, which can mean a minor copying overhead.

But there are advantages to reference parameters:

Multiple results that would otherwise need an extra result type as a container for the several result values.

void divideAndRemainder(int p, int q, int& d, int& r);

Multiple results that also serve as inputs (in-out parameters).

void swapVariables(int& a, int& b);

Aliasing: the same code can fill either this or that field/variable.

struct link {
    struct link* next;
    int value;
};

// Ordered list insert, not possible like this in Java:
// Read "struct link*&", but for clarity I use explicit dereferencing here:
// *list.
void insert(struct link** list, int value) {
    while (*list && value < (*list)->value) {
        list = &(*list)->next;
    }
    struct link* next = *list;
    *list = new struct link();
    (*list)->value = value;
    (*list)->next = next;
}
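The divideAndRemainder declaration earlier in this answer has no body, so here is one possible implementation as a sketch, assuming plain integer division with a non-zero divisor.

```cpp
#include <cassert>

// One possible body for the declaration shown earlier (an assumption; the
// answer only gives the signature). d receives the quotient, r the remainder.
void divideAndRemainder(int p, int q, int& d, int& r) {
    assert(q != 0);  // integer division by zero is undefined behavior
    d = p / q;       // truncates toward zero (C++11 and later)
    r = p % q;       // has the same sign as p
}
```

The caller has to declare both results before the call, for example `int d, r; divideAndRemainder(17, 5, d, r);`, which is the readability cost mentioned at the top of this answer.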

(Mind - I am now an ingrained Java programmer.)

edited Dec 07 '13 at 08:36

answered Dec 07 '13 at 08:26

Joop Eggen

Under normal circumstances, there is no copy made when returning by value, so there is no overhead to mitigate in the first place. – juanchopanza Dec 07 '13 at 08:39
@juanchopanza: though I was fully aware of that, thanks for clarifying, as my formulation "minor copying overhead" is too imprecise. – Joop Eggen Dec 07 '13 at 08:54
Giving it to you for the "multiple results" angle, which I'd never considered. The first time I ever tried to write an OO program for homework I wondered why there aren't functions that can return multiple values. I realized later that was a stupid thought. I could see the updating multiple things at once idea coming in useful though. – AdamJames Dec 07 '13 at 09:07
My preferred way to handle multiple results in C++11 is to return a `std::tuple<>` and assign it to a `std::tie()`, like so: `std::tuple<int, int> divide(int x, int y);` then do: `std::tie(quotient, remainder) = divide(x, y);`. – mpark Dec 07 '13 at 09:14
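The std::tie approach from the comment above could look like the following sketch. The return type std::tuple<int, int> and the function body are assumptions filled in for illustration.

```cpp
#include <tuple>

// Returns (quotient, remainder) as one value; assumes y != 0.
// The body is an assumption; the comment only gives the declaration.
std::tuple<int, int> divide(int x, int y) {
    return std::make_tuple(x / y, x % y);
}

// Usage:
//   int quotient, remainder;
//   std::tie(quotient, remainder) = divide(17, 5);
//   // quotient is now 3 and remainder is now 2
```

Compared with reference parameters, the results are visibly outputs at the call site, although the two variables still have to be declared before std::tie can assign to them.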
