Is returning a private class member slower than using a struct and accessing that variable directly?

Question

Suppose you have a class that has private members which are accessed a lot in a program (such as in a loop which has to be fast). Imagine I have defined something like this:

class Foo
{
public: 

    Foo(unsigned set)
        : vari(set)
    {}

    const unsigned& read_vari() const { return vari; }

private:
    unsigned vari;
};

The reason I would like to do it this way is because, once the class is created, "vari" shouldn't be changed anymore. Thus, to minimize bug occurrence, "it seemed like a good idea at the time".

However, if I now need to call this function millions of times, I was wondering whether it adds overhead and a slowdown compared with simply using:

struct Foo
{
    unsigned vari;
};

So, was my first impulse right in using a class, to avoid anyone mistakenly changing the value of the variable after it has been set by the constructor? Also, does this introduce a "penalty" in the form of function call overhead (assuming I use optimization flags in the compiler, such as -O2 in GCC)?

They are not the same. With the method you are returning a copy, so you can't really access the variable. The equivalent, read only version would be `const unsigned& read_vari() const { return vari; }` — juanchopanza, Apr 17 '14 at 06:04
@juanchopanza. Thank you for your message. I will update my question shortly. — Mark Anderson, Apr 17 '14 at 06:05
In theory, without optimizations (particularly, inlining of your getter), yes. In practice, in a release build with the usual optimizations, both options will be compiled to the same assembly instructions. So yes, your impulse was right :) — heinrichj, Apr 17 '14 at 06:14
You can check the assembly yourself and be 100% sure. Instead of hoping someone will know what your compiler does :) — rozina, Apr 17 '14 at 06:31

niklasfi · Accepted Answer · 2014-04-17T06:41:02.577

They should come out to be the same. Remember that frustrating time you tried to use operator[] on a vector in gdb and it just replied "optimized out"? The same thing happens here: the compiler does not emit a function call but accesses the variable directly.

Let's have a look at the following code

struct foo{
   int x;
   int& get_x(){
     return x;
   }   
};

int direct(foo& f){ 
   return  f.x;
}

int fnc(foo& f){ 
   return  f.get_x();
}

This was compiled with g++ test.cpp -o test.s -S -O2. The -S flag tells the compiler to "Stop after the stage of compilation proper; do not assemble" (quote from the g++ manpage). This is what the compiler gives us:

_Z6directR3foo:
.LFB1026:
  .cfi_startproc
  movl  (%rdi), %eax
  ret

and

_Z3fncR3foo:
.LFB1027:
  .cfi_startproc
  movl  (%rdi), %eax
  ret

As you can see, no function call was made in the second case and the two functions are identical, meaning there is no performance overhead in using the accessor method.

Bonus: what happens if optimizations are turned off? Same code, here are the results:

_Z6directR3foo:
.LFB1022:
  .cfi_startproc
  pushq %rbp
  .cfi_def_cfa_offset 16
  .cfi_offset 6, -16
  movq  %rsp, %rbp
  .cfi_def_cfa_register 6
  movq  %rdi, -8(%rbp)
  movq  -8(%rbp), %rax
  movl  (%rax), %eax
  popq  %rbp
  .cfi_def_cfa 7, 8
  ret

and

_Z3fncR3foo:
.LFB1023:
  .cfi_startproc
  pushq %rbp
  .cfi_def_cfa_offset 16
  .cfi_offset 6, -16 
  movq  %rsp, %rbp
  .cfi_def_cfa_register 6
  subq  $16, %rsp
  movq  %rdi, -8(%rbp)
  movq  -8(%rbp), %rax
  movq  %rax, %rdi
  call  _ZN3foo5get_xEv    #<<<call to foo.get_x()
  movl  (%rax), %eax
  leave
  .cfi_def_cfa 7, 8
  ret

As you can see, without optimizations the struct is faster than the accessor, but who ships code without optimizations?

Actually, you can ship code without optimizations if you wish, but at that point the function call overhead of a getter should be your last worry: all the templated code (STL/Boost) will be so slow that you won't even notice the rest. — Matthieu M., Apr 17 '14 at 06:50
@MatthieuM. what are you referring to? Templates do make compiling slower but [I have not heard of templates slowing down the actual execution time](http://stackoverflow.com/questions/2442358/do-c-templates-make-programs-slow). Could you provide a source? — niklasfi, Apr 17 '14 at 06:55
Actually, it's just a natural consequence of your answer: most of the STL & Boost piles up layers upon layers of template classes and functions. When you compile with optimizations on, those layers are folded together until barely any code remains, so it's fast; however, without optimizations each layer adds its own function call overhead. Just look at the average `std::vector::push_back` implementation and you'll see exactly what I mean :) — Matthieu M., Apr 17 '14 at 07:09

Tony Delroy · Answer 2 · 2014-04-17T06:21:21.543

You can expect identical performance. A great many C++ classes rely on this - for example, C++11's list::size() const can be expected to trivially return a data member. (This contrasts with vector, where the implementations I've looked at calculate size() as the difference between the pointer data members corresponding to begin() and end(), ensuring typical iterator usage is as fast as possible at the cost of potentially slower indexed iteration if the optimiser can't determine that size() is constant across loop iterations.)
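To make the two size() strategies concrete, here is a minimal sketch. CountedBuffer, PointerPairBuffer and sum_indexed are hypothetical names invented for illustration; real standard library containers are more involved and differ between implementations.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-ins; not actual library code.

// Stores the element count directly, so size() is a plain member read.
class CountedBuffer {
public:
    explicit CountedBuffer(std::size_t n) : count_(n) {}
    std::size_t size() const { return count_; }
private:
    std::size_t count_;
};

// Stores two pointers, so begin()/end() are plain member reads and
// size() is computed as a pointer difference.
class PointerPairBuffer {
public:
    PointerPairBuffer(const int* first, const int* last)
        : begin_(first), end_(last) {}
    const int* begin() const { return begin_; }
    const int* end() const { return end_; }
    std::size_t size() const {
        return static_cast<std::size_t>(end_ - begin_);
    }
private:
    const int* begin_;
    const int* end_;
};

// Indexed loop: size() is evaluated on every iteration unless the
// optimiser can prove it is constant and hoist it out of the loop.
inline int sum_indexed(const PointerPairBuffer& b) {
    int total = 0;
    for (std::size_t i = 0; i < b.size(); ++i) {
        total += b.begin()[i];
    }
    return total;
}
```

In both classes the accessors are defined inside the class body, so they are implicitly inline and an optimising build can usually fold them into the caller.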

There's typically no particular reason to return by const reference for a type like unsigned that should fit in a CPU register anyway, but as it's inlined the compiler doesn't have to take that literally (for an out-of-line version it would likely be implemented by returning a pointer that has to be dereferenced). (The atypical reason is to allow taking the address of the variable, which is why, say, vector::operator[](size_t) const needs to return a const T& rather than a T, even if T is small enough to fit in a register.)
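As a small illustration of that difference, the sketch below extends the question's Foo with both kinds of getter. The names read_copy, read_ref and address_of_vari are made up for this example.

```cpp
class Foo {
public:
    explicit Foo(unsigned set) : vari(set) {}

    // Returns the value; the caller receives a temporary copy.
    unsigned read_copy() const { return vari; }

    // Returns a reference to the member itself.
    const unsigned& read_ref() const { return vari; }

private:
    unsigned vari;
};

// Taking the address only works through the reference-returning getter.
// Writing &f.read_copy() would not compile, because the result of
// read_copy() is a temporary with no addressable storage.
inline const unsigned* address_of_vari(const Foo& f) {
    return &f.read_ref();
}
```

If callers only ever need the value, returning unsigned by value is the simpler interface.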

score 0 · Answer 3 · answered Apr 17 '14 at 07:02

There is only one way to tell with certainty which one is faster in your particular program built with your particular tools with your particular optimisation flags on your particular platform — by measuring both variants.

Having said that, chances are good that the binaries will be identical, instruction for instruction.
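A rough sketch of such a measurement, assuming a C++11 compiler, is shown below. PlainFoo, GetterFoo, sum_direct, sum_getter and time_ms are hypothetical names; a serious benchmark would also need warm-up runs, repetitions and a way to stop the optimiser from discarding the work.

```cpp
#include <chrono>
#include <vector>

// The two variants from the question, renamed so they can coexist.
struct PlainFoo {
    unsigned vari;
};

class GetterFoo {
public:
    explicit GetterFoo(unsigned set) : vari(set) {}
    const unsigned& read_vari() const { return vari; }
private:
    unsigned vari;
};

// Sums the member through direct access.
inline unsigned long long sum_direct(const std::vector<PlainFoo>& v) {
    unsigned long long total = 0;
    for (const PlainFoo& f : v) total += f.vari;
    return total;
}

// Sums the member through the getter.
inline unsigned long long sum_getter(const std::vector<GetterFoo>& v) {
    unsigned long long total = 0;
    for (const GetterFoo& f : v) total += f.read_vari();
    return total;
}

// Runs work() once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

To use it, fill both vectors with the same values, time each sum with time_ms, and print or otherwise consume the totals so the loops are not optimised away.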

score 0 · Answer 4 · answered Apr 17 '14 at 07:23

As others have said, optimizers these days are relied on to boil out abstraction (especially in C++, which is more or less built to take advantage of that) and they're very, very good.

But you might not need the getter for this.

struct Foo {
    Foo(unsigned set) : vari(set) {}
    unsigned const vari;
};

const doesn't forbid initialization.
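One trade-off to be aware of with this approach, shown as a sketch that assumes C++11 for static_assert and type traits: a const data member is still initialized by the constructor and by copy construction, but it makes the implicitly generated copy assignment operator deleted.

```cpp
#include <type_traits>

struct Foo {
    Foo(unsigned set) : vari(set) {}
    unsigned const vari;  // readable by anyone, writable by no one
};

// Copy construction still works, since it initializes vari.
static_assert(std::is_copy_constructible<Foo>::value,
              "Foo can be copy-constructed");

// Assignment would have to modify vari, so the implicit operator is deleted.
static_assert(!std::is_copy_assignable<Foo>::value,
              "Foo cannot be copy-assigned");
```

Whether that matters depends on how Foo is used; the private member with a getter keeps objects assignable while still preventing outside modification.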
