18

I was given some code in which some of the parameters are pointers, and the pointers are then dereferenced to provide values. I was concerned that the pointer dereferencing would cost cycles, but after reading an earlier Stack Overflow question, "How expensive is it to dereference a pointer?", perhaps it doesn't matter.

Here are some examples:


bool MyFunc1(int * val1, int * val2)
{
    *val1 = 5;
    *val2 = 10;
    return true;
}

bool MyFunc2(int &val1, int &val2)
{
    val1 = 5;
    val2 = 10;
    return true;
}

I personally prefer the pass-by-reference as a matter of style, but is one version better (in terms of process cycles) than another?

Jenner
  • 12
    This is the kind of macro optimization you should __NOT__ worry about. It is the compiler's job to produce the best code (it takes into account many more factors than your average human could; only the absolute best assembly writers are going to challenge the compiler). It is your job (as a developer) to come up with the best algorithm to perform a task (as the compiler can't do that). As for style, people would argue both ways (I prefer references as I don't need to check for NULL). – Martin York Oct 30 '09 at 16:29
  • ++ Nobody should have a rep of just 1 :-) – Mike Dunlavey Oct 30 '09 at 16:31
  • In this case (short function) it might be worth using inline. Since you are concerned about performance I'm assuming you call this little function (or something similar) many many times in a short time. – sellibitze Oct 30 '09 at 16:41
  • 5
    @Martin: you mean "micro optimization" - the "macro" ones are the big ones that one should worry about... – hjhill Oct 30 '09 at 17:20
  • Note that you should use pointers if backward compatibility with C could be required. Otherwise I agree with Loki. – HughHughTeotl Apr 27 '14 at 15:27
  • Does this answer your question? [Pointer vs. Reference](//stackoverflow.com/q/114180/90527) – outis Jul 25 '22 at 06:58

11 Answers

21

My rule of thumb is to pass by pointer if the parameter can be NULL (i.e. it is optional, in the cases above), and by reference if the parameter should never be NULL.
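
A minimal sketch of this rule of thumb. The function `Divide` and its parameters are hypothetical, not from the question: the quotient is always wanted, so it is taken by reference, while the remainder is optional, so it is taken by pointer and may be null.

```cpp
// Hypothetical example: `quotient` is a required output (reference),
// `remainder` is an optional output (pointer, may be nullptr).
bool Divide(int numerator, int denominator, int& quotient, int* remainder = nullptr)
{
    if (denominator == 0) {
        return false;  // nothing is written on failure
    }
    quotient = numerator / denominator;
    if (remainder != nullptr) {
        // Only written when the caller asked for it.
        *remainder = numerator % denominator;
    }
    return true;
}
```

A caller that wants both results writes `Divide(7, 2, q, &r)`; a caller that only wants the quotient writes `Divide(7, 2, q)`.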

dma
  • Right. This kind of micro-optimization should not be attempted unless profiling shows it's necessary. Until then, prefer good engineering. I happen to agree with this one. Unfortunately, I can't give it 3 up-votes. – sbi Oct 30 '09 at 16:38
  • 5
    My personal preference is to always use references, and when you have a "no reference" case, use `boost::optional` (yes, it is specifically written to work well with references!), giving you "optional references". The advantage is that you don't get any of the potentially nasty pointer casts, and you don't get pointer arithmetic (so you cannot mistype `*a+1` as `a+1` and have it compile and do something you didn't expect it to do). – Pavel Minaev Oct 30 '09 at 17:16
14

From a performance point of view, it probably doesn't matter. Others have already answered that.

Having said that, I have not yet found a situation where an added instruction in this case would make a noticeable difference. I do realize that for a function that is called billions of times, it could make a difference. As a rule, you shouldn't adapt your programming style for these kinds of "optimizations".

Mattias Nilsson
11

You can get the assembly code from your compiler and compare them. At least in GCC, they produce identical code.
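
A minimal sketch for trying this yourself: both functions from the question in one file. The file name `compare.cpp` is an assumption. Running `g++ -O2 -S compare.cpp` writes the assembly to `compare.s`, where the two function bodies can be compared side by side. The result may differ with other compilers, targets, or optimization levels.

```cpp
// compare.cpp (hypothetical file name)
// Compile to assembly with: g++ -O2 -S compare.cpp

// Pointer version from the question.
bool MyFunc1(int* val1, int* val2)
{
    *val1 = 5;
    *val2 = 10;
    return true;
}

// Reference version from the question.
bool MyFunc2(int& val1, int& val2)
{
    val1 = 5;
    val2 = 10;
    return true;
}
```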

Lukáš Lalinský
  • Of course, you can. But should you? Engineering should come first as long as it isn't shown that it considerably hinders performance. So IMO this question should be answered from an engineering POV. (I like http://stackoverflow.com/questions/1650792/1650849#1650849.) – sbi Oct 30 '09 at 16:43
  • 2
    From what I've understood, the question was about "process cycles", not about functional differences between the two options. – Lukáš Lalinský Oct 30 '09 at 16:53
  • 1
    The question pretty clearly states that Jenner prefers it the other way, but is concerned about speed. Why shouldn't he be told that the concern is misplaced? – sbi Oct 31 '09 at 06:06
  • 1
    Didn't I tell him that? There might be huge difference between them and there might be none. What if there was some magic behind passing by reference? What if it was 1000x slower? You wouldn't know until you try/read/ask. I believe it's important to know low-level semantics of the language you are using, so that you can make the right decisions. It's like arguing which word to use in a natural language, without knowing what the words actually mean. – Lukáš Lalinský Oct 31 '09 at 09:18
  • "Didn't I tell him that?" Not in your answer, anyway. "What if there was some magic behind passing by reference?" Then we'd be talking about semantics, not performance. "What if it was 1000x slower?" If so, looking at the assembler code wouldn't be necessary. – sbi Nov 01 '09 at 19:10
5

This will get voted down since it's Old Skool, but I often prefer pointers since it's easier to just glance at code and see if my objects that I am passing to a function could get modified, especially if they are simple datatypes like int and float.

Jim Buck
  • While I happen not to be in this camp (I subscribe to this argument: http://stackoverflow.com/questions/1650792/1650849#1650849), I see it as a valid argument. +1 – sbi Oct 30 '09 at 16:41
  • Oops, I only noticed your reply after adding my own (which basically repeats your argument). +1 for you. – Frerich Raabe Oct 30 '09 at 16:49
  • I would amend that as: it's easier to see, _while reading code with the function call_, that function call may modify some of its arguments. I.e. `swap(a, b)` vs `swap(&a, &b)`. – Pavel Minaev Oct 30 '09 at 17:17
  • I agree - http://stackoverflow.com/questions/334856/are-there-benefits-of-passing-by-pointer-over-passing-by-reference-in-c/334944#334944 – Michael Burr Oct 30 '09 at 17:26
  • 1
    Nah: `int* test = new int(5);` then, 10 lines of code later, `swap(test, &b);` – deworde Dec 12 '12 at 09:32
5

There are different guidelines on using reference vs. pointer parameters out there, tailored to different requirements. In my opinion, the most meaningful rule that should be applied in generic C++ development is the following:

  1. Use reference parameters when overloading operators. (In this case you actually have no choice. This is what references were introduced for in the first place.)

  2. Use const-reference for composite (i.e. logically "large") input parameters. I.e. input parameters should be passed either by value ("atomic" values) or by const-reference ("aggregate" values). Use pointers for output parameters and input-output parameters. Do not use references for output parameters.

Taking the above into account, the overwhelming majority of reference parameters in your program should be const-references. If you have a non-const reference parameter and it is not an operator, consider using a pointer instead.

Following the above convention, you'll be able to see at the point of the call whether the function might modify one of its arguments: the potentially modified arguments will be passed with explicit & or as already-existing pointers.
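
A short sketch of this convention. `FindLongest` is a hypothetical function made up for illustration: the large input is taken by const-reference and the output by pointer.

```cpp
#include <string>
#include <vector>

// Hypothetical example of the convention above.
// `words` is an aggregate input: const-reference.
// `longest` is an output: pointer, so the call site needs an explicit &.
bool FindLongest(const std::vector<std::string>& words, std::string* longest)
{
    if (words.empty()) {
        return false;  // *longest is left untouched
    }
    *longest = words.front();
    for (const std::string& w : words) {
        if (w.size() > longest->size()) {
            *longest = w;
        }
    }
    return true;
}
```

At the call site, `FindLongest(words, &result)` shows which argument may be modified: `result` carries an `&`, `words` does not.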

There's another popular rule out there that states that something that can be null should be passed as a pointer, while something that can't be null should be passed as a reference. I can imagine that this might make sense in some very narrow and very specific circumstances, but in general this is a major anti-rule. Just don't do it this way. If you want to express the fact that some pointer must not be null, put a corresponding assertion as the very first line of your function.
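
A minimal sketch of that last suggestion, applied to the pointer version from the question. The name `MyFunc1Checked` is hypothetical. Note that the standard `assert` macro is compiled out when `NDEBUG` is defined, so this checks the precondition in debug builds only.

```cpp
#include <cassert>

// Hypothetical variant of MyFunc1 from the question: the non-null
// requirement is stated as an assertion on the very first line.
bool MyFunc1Checked(int* val1, int* val2)
{
    assert(val1 != nullptr && val2 != nullptr);
    *val1 = 5;
    *val2 = 10;
    return true;
}
```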

As for the performance considerations, there's absolutely no performance difference in passing by pointer or passing by reference. Both kinds of parameters are exactly the same thing at the physical level. Even when the function gets inlined, a modern compiler should be smart enough to preserve the equivalence.

AnT stands with Russia
3

Here's the difference in the generated assembly with g++. a.cpp is pointers, b.cpp is references.

$ g++ -S a.cpp

$ g++ -S b.cpp

$ diff a.S b.S
1c1
<       .file   "a.cpp"
---
>       .file   "b.cpp"
4,6c4,6
< .globl __Z7MyFunc1PiS_
<       .def    __Z7MyFunc1PiS_;        .scl    2;      .type   32;     .endef
< __Z7MyFunc1PiS_:
---
> .globl __Z7MyFunc1RiS_
>       .def    __Z7MyFunc1RiS_;        .scl    2;      .type   32;     .endef
> __Z7MyFunc1RiS_:

Just the function name is slightly different; the contents are identical. I had identical results when I used g++ -O3.

Mark Rushakoff
1

From a performance perspective, any competent compiler should wipe out the issue, which is highly unlikely to be a bottleneck in any case. If you're genuinely working at that low a level, assembly code analysis and performance profiling on realistic data are going to be essential parts of your toolkit anyway.

From a maintenance perspective, you really shouldn't allow someone to pass in parameters as pointers without checking them for nullity.

So you end up writing a ton of null check code, for no good reason.

Basically it goes from:

  1. Create object
  2. Pass in as non-const reference

To:

  1. Create object
  2. Get address of object
  3. Pass in address
  4. Check if address points to null
  5. Dereference back to a non-const reference for use throughout the function

While the compiler will parse all of that junk out, the reader of the code won't. It'll be extra code to consider when extending the functions or working out what the code does. Also more code is written, which increases the number of lines that can contain a bug. And as it's pointers, the bugs are more likely to be quirky undefined behaviour bugs.
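
The two lists above can be sketched as follows. The `Counter` type and both function names are hypothetical, chosen only to show the extra steps the pointer version carries.

```cpp
// Hypothetical type used only for this illustration.
struct Counter {
    int value = 0;
};

// Reference version: the caller creates the object and passes it in.
void IncrementRef(Counter& c)
{
    ++c.value;
}

// Pointer version: the caller takes the address, and the function has to
// check it and convert it back to a reference before doing the same work.
bool IncrementPtr(Counter* c)
{
    if (c == nullptr) {   // step 4: check for null
        return false;
    }
    Counter& ref = *c;    // step 5: dereference back to a non-const reference
    ++ref.value;
    return true;
}
```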

It also encourages weaker programmers to write code like this:

int* a = new int(4); // Don't understand why this has to be a pointer
int* b = new int(5); // Don't understand why this has to be a pointer
MyFunc1(a, b); // the pointer-taking version from the question
int& a_r = *a;
int& b_r = *b;

Yes, that's terrible, but I've seen it from new coders who don't really understand the pointer model.

At the same time, I'd argue that the whole "I can see whether it's going to be modified without looking at the actual header", is a bit of a false advantage, considering the potential losses. If you can't immediately tell which are the output parameters from the context of the code, then that's your problem, and no amount of pointers are going to save you. If you must have an & to identify your output parameters, may I introduce:

MyFunc2(/*&*/a, /*&*/b)

All the "readability" of the ampersand, none of the associated pointer risks.

However, when it comes to maintenance, consistency is king. If there is existing code that you're integrating with, that passes as pointers (e.g. the other functions of the class or library), there's no good reason to be a crazy rebel and go your own way. That's really going to cause confusion.

deworde
1

References are very similar to pointers, with one big difference: references cannot be NULL. So you do not need to check whether they are actual usable objects (as you do for pointers).

Therefore I assume that compilers will produce the same code.

fmuecke
  • Be careful with this, though, since references can actually be NULL. You shouldn't check for it, of course, but keep in mind that your code could still crash on a NULL reference (in case it matters for your application). – Jim Buck Oct 30 '09 at 16:26
  • TTBOMK, a reference can only be `NULL` if someone maliciously invoked undefined behavior in order to set a reference to `NULL`. If you want to be careful because of _that_ possibility, you must not trust anything else in your code either. – sbi Oct 30 '09 at 16:40
  • @Jim: References cannot be NULL in a valid program. Sure, you can perform a nasty hack to make it happen, but at that point the program is allowed to turn your screen blue and summon Zeus to sleep with your cousins, so a reference being NULL is the least of your problems. – Kaz Dragon Oct 30 '09 at 16:40
  • @Kaz: That might depend on your cousins. `:)` – sbi Oct 30 '09 at 16:45
  • References can't be null in general, but the case to watch out for is where you've stored a reference to a variable that has gone out of scope. For example, returning a reference to a local variable in a function. – mch Oct 30 '09 at 16:53
  • If you have a lot of pointer-based code, and you need to interface with a library that uses references, you have to use the dereference operator to call into that library. If your pointer-based code can have NULLs, then when you dereference, you will most certainly have a NULL reference. Sure, the result is undefined, but so is accessing NULL pointers in code that is only pointer-based. – Jim Buck Oct 30 '09 at 17:27
  • @Jim: But dereferencing a pointer without checking for `NULL` _is_ malicious. – sbi Oct 30 '09 at 23:25
1

All the other answers already point out that neither function is superior to the other in terms of runtime performance.

However, I think that the former function is superior to the latter in terms of readability, because a call like

f( &a, &b );

clearly expresses that a reference to some variable is passed (which, to me, rings the 'this object might be modified by the function' bell). The version which takes references instead of pointers just looks like

f( a, b );

It would be fairly surprising to me to see that a changed after the call, because I cannot tell from the invocation that the variable is passed by reference.

Frerich Raabe
0

You should see the generated assembly code for the target machine... take into account that a function call is always done in constant time, and on actual machines that time is really negligible...

JPCF
0

If you need to do things like the line below to remember that "a" is going to be changed in the function, what will stop you from swapping parameters that have the same type, or making some other nasty error? When you have to call a function, you must keep its prototype in mind, or use an IDE that shows it in a tooltip. There is no excuse for that. "I don't like references because I can't know if the function will change the argument" is not a valid argument.

MyFunc2(/*&*/a, /*&*/b)

Luiz Felipe