I have found many people saying that passing a primitive value, such as an int, by value, is faster than passing by reference.

However, I wrote the code below, which passes by value, and it averages a run time of 2 seconds.

#include <iostream>
#include <chrono>

using namespace std;
using namespace chrono;

int func(unsigned long long x) {
  ++x;
  return x;
}

int main()
{
  for (int b = 0; b < 5; b++) {
    
    unsigned long long x = 0;
  
    auto start = high_resolution_clock::now();
    for (long long j = 0; j < 500000; j++) {
      for (long long i = 0; i < 180000000; i++) {
        x = func(x);
      }
    }
  
    auto stop = high_resolution_clock::now();
    auto elapsed = stop - start;
  
    long long microseconds = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    cout << x <<" iterations took " << microseconds << " microseconds\n";
  }
}

Then when I pass by reference, it averages less than 1 microsecond.

#include <iostream>
#include <chrono>

using namespace std;
using namespace chrono;

void func(unsigned long long& x) {
  ++x;
}

int main()
{
  for (int b = 0; b < 5; b++) {
    
    unsigned long long x = 0;
  
    auto start = high_resolution_clock::now();
    for (long long j = 0; j < 500000; j++) {
      for (long long i = 0; i < 180000000; i++) {
        func(x);
      }
    }
  
    auto stop = high_resolution_clock::now();
    auto elapsed = stop - start;
  
    long long microseconds = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    cout << x <<" Took " << microseconds << " microseconds\n";
  }
}

I would like an explanation of why this happens, please.

Disclaimer: I am fairly new to C++.

  • Instead of the excessive verbosity and ambiguity of `unsigned long long` consider using a [more specific type](https://en.cppreference.com/w/cpp/types/integer) that communicates intent, like `uint64_t`. – tadman Mar 24 '23 at 18:43
  • Is this with a debug build or a fully optimized (e.g. `-O3`) build? I'm expecting that the optimized build just multiplies the two numbers and doesn't actually loop at all. Check with an assembly dump of the code like from [godbolt](https://godbolt.org). – tadman Mar 24 '23 at 18:43
  • For what it's worth, the first version does not take seconds using clang on an M1 machine, but then again, that's 90 *trillion* iterations if it's actually looping. To get a runtime less than minutes or hours, some compiler optimization must be taking place. – tadman Mar 24 '23 at 18:48
  • [That nested loop is gone when optimizations are enabled](https://godbolt.org/z/zbzfzv8x1). – PaulMcKenzie Mar 24 '23 at 18:55
  • Just to elaborate: passing primitives (and a few other things) by value is in general faster because they fit in one (or perhaps two) registers, so the copy doesn't cost you anything. OTOH, if you pass by reference then accessing the variable in the called function involves an additional dereference. Sometimes the compiler can optimise this out. And sometimes not. – Paul Sanders Mar 24 '23 at 19:51
  • In addition to the other comments, in the second version you are changing not only how the parameter is passed but also the return value. In the second version you are not returning anything, so there is less to do, like putting the returned value on the stack and retrieving it. A fairer comparison would be to return x in the second case. – Andrés Alcarraz Mar 24 '23 at 20:06
  • You could also improve your question by stating the compiler flags, so we can replicate, and also state the hardware. When I ran that code in eclipse, with the default compiler settings, both versions ran for much longer than a few seconds. – Andrés Alcarraz Mar 24 '23 at 20:08
  • There are **two** differences in the code examples. They pass the argument differently **and** they handle the result differently. – Pete Becker Mar 24 '23 at 20:22
  • I ran the code with the same return value, and passing it the same way, and got the same result. I just forgot to put the most recent code in the post. I am running this on replit.com, not an actual computer. I don't know how to turn off compiler optimization, if you even can in replit. I will run it on a computer later, but I understand now why it does that. Thank you. – user21483880 Mar 24 '23 at 20:37
  • @PaulSanders: OP seems to have already read existing answers, like https://stackoverflow.com/a/14013189/103167 Your short comment is barely scratching the surface of the factors at play – Ben Voigt Mar 24 '23 at 21:22
  • @AndrésAlcarraz: The fact that one passes by value in and out while the other uses an in/out parameter passed by reference seems completely fair. The sudden change of type from `long long int` to just `int` is not equivalent, however. – Ben Voigt Mar 24 '23 at 21:34
  • Note that `return x;` throws away the higher bits... this wasn't always true, it used to be implementation-defined behavior when the value doesn't fit. – Ben Voigt Mar 24 '23 at 21:38
  • @BenVoigt Just felt it might be useful to provide the rationale for passing primitives by value, that's all. Like you, I don't know what the OP has and hasn't read. – Paul Sanders Mar 24 '23 at 21:40
  • @BenVoigt I don't think that is fair, because the rationale is for passing the value as parameter, not using it to return the result. – Andrés Alcarraz Mar 24 '23 at 22:26
  • @AndrésAlcarraz: in/out parameters are a perfectly reasonable topic to compare reference vs value – Ben Voigt Mar 25 '23 at 15:10
  • @BenVoigt it's not comparable, simply because they don't work in a function that must return another type. The rationale is about the number of bytes that need to be passed through the stack. If it is an in/out parameter, then you are using half the size compared to the case where the value is returned. I mean, they are comparable, but only in that specific use case, and in that specific use case the rationale does not apply. – Andrés Alcarraz Apr 07 '23 at 21:14
  • @AndrésAlcarraz: If the input and output types are different, then by-reference needs two parameters. The "amount of bytes needed to be passed through the stack" really has a pretty small effect compared to all other concerns (at least, the difference between one pointer-sized parameter or two is small. If you start passing a multi-kilobyte structure around, it starts to make a difference) – Ben Voigt Apr 07 '23 at 21:30
  • @BenVoigt but this is not the case; here the same parameter is used. I mentioned an in/out parameter, not one for in and another for out. – Andrés Alcarraz Apr 07 '23 at 21:33
  • @AndrésAlcarraz: When input and output type are the same, it makes sense to compare performance of (by-value parameter and by-value return) to (in/out in one by-reference parameter). When input and output have different types, it makes sense to compare performance of (by-value parameter and by-value return) to (two by-reference parameters, one in, one out). All of them are reasonable to discuss and measure. All accept one input and produce one output, so conceptually none is "doing more". The implementation may be faster or slower, more or fewer steps. But the task is the same. – Ben Voigt Apr 07 '23 at 21:39
  • @BenVoigt In that case, the rationale just does not apply because of what I said earlier. – Andrés Alcarraz Apr 07 '23 at 22:11
