Is performance of using readonly variable directly different from storing in object then using it?

Question

I have a simple function similar to this:

int foo(const Data& a, int pos)
{
    Big_obj x = (pos == 1 ? a.x : a.y);  // x is only read, never modified
    return x.elem*x.elem;
}

(Suppose I forgot to bind the object by const reference: const Big_obj& x = ...)

Can today's compilers optimize that into the following code?

int foo(const Data& a, int pos)
{
    if (pos == 1)
      return a.x.elem * a.x.elem;
    else 
      return a.y.elem * a.y.elem;
}

Why don't you make (hyper-trivial) examples of both, compile (with varying levels of optimization if you're so inclined) both, and take a peek at the assembly? Personally, I believe they will be optimized to approximately, but not the exact same, code... until you crank up the optimization. But take that guess with a grain of salt, I've only done similar things with for loops. — druckermanly, Jun 07 '14 at 08:37
The compiler would have to prove that the (copy-)constructor of `Big_obj` has no side effects. That is quite difficult. I would look at the assembly code and see. — nwp, Jun 07 '14 at 08:38

score 3 · Accepted Answer · edited May 23 '17 at 10:32

With GCC there is a very useful compiler switch: -fdump-tree-optimized which will show the code after the performed optimizations.

You can discover that it all depends on Big_obj.

E.g.

struct Big_obj
{
  int elem;
  int vect[1000];
};

struct Data { Big_obj x, y; };

Compiling with g++ -Wall -O3 -fdump-tree-optimized produces a dump file (here .165t.optimized; the pass number varies with the GCC version) containing:

int foo(const Data&, int) (const struct Data & a, int pos)
{
  int x$elem;
  const struct Big_obj * iftmp.4;
  int _8;

  <bb 2>:
  if (pos_2(D) == 1)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  iftmp.4_4 = &a_3(D)->x;
  goto <bb 5>;

  <bb 4>:
  iftmp.4_5 = &a_3(D)->y;

  <bb 5>:
  # iftmp.4_1 = PHI <iftmp.4_4(3), iftmp.4_5(4)>
  x$elem_7 = MEM[(const struct Big_obj &)iftmp.4_1];
  _8 = x$elem_7 * x$elem_7;
  return _8;
}

This is equivalent to the optimized code you posted: the copy of Big_obj is gone, and only elem is loaded through a pointer to a.x or a.y. But if you change Big_obj (the type of vect changes from an array to std::vector):

struct Big_obj
{
  int elem;
  std::vector<int> vect;
};

the optimization won't be performed anymore (i.e. foo() will also allocate/deallocate memory for x.vect).

The reason is that optimizations must obey the as-if rule: only code transformations that do not change the observable behavior of the program are allowed.
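A minimal sketch of why the as-if rule matters here, assuming a hypothetical copies counter added to Big_obj (it is not part of the original code). Once the copy constructor does something the rest of the program can see, the compiler can no longer drop the copy in foo():

```cpp
struct Big_obj
{
    int elem;
    int vect[1000];
    static int copies;  // hypothetical counter, for illustration only

    explicit Big_obj(int e) : elem(e), vect() {}

    // User-provided copy constructor with a visible effect:
    // every copy bumps the counter.
    Big_obj(const Big_obj& other) : elem(other.elem)
    {
        for (int i = 0; i < 1000; ++i)
            vect[i] = other.vect[i];
        ++copies;
    }
};

int Big_obj::copies = 0;

struct Data
{
    Big_obj x, y;
    Data(int a, int b) : x(a), y(b) {}
};

int foo(const Data& a, int pos)
{
    // The conditional yields an lvalue, so this is a real copy:
    // copy elision does not apply to it.
    Big_obj x = (pos == 1 ? a.x : a.y);
    return x.elem * x.elem;
}
```

With Data d(3, 4), each call to foo(d, 1) has to increment Big_obj::copies, so rewriting foo() to read a.x.elem directly would change what the program does. With the plain struct from the example there is no such effect, which is why GCC is free to remove the copy.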

operator new could have a custom implementation counting how many times it gets called (and this is hard to detect). But even without a custom operator new, there are other issues (see Does allocating memory and then releasing constitute a side effect in a C++ program?).
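The point about a counting operator new can be sketched as follows. The counter g_new_calls and the two function names foo_copy and foo_ref are hypothetical, introduced only for this illustration. The exact count after foo_copy() may vary with compiler and flags, since implementations are allowed to omit some allocations:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

static std::size_t g_new_calls = 0;  // hypothetical counter

// Replacement global operator new that counts how often it is called.
void* operator new(std::size_t size)
{
    ++g_new_calls;
    if (void* p = std::malloc(size ? size : 1))
        return p;
    throw std::bad_alloc();
}

// Matching replacement, since the memory comes from malloc.
void operator delete(void* p) noexcept
{
    std::free(p);
}

struct Big_obj
{
    int elem;
    std::vector<int> vect;
};

struct Data { Big_obj x, y; };

int foo_copy(const Data& a, int pos)
{
    // Copies vect as well, which normally goes through operator new.
    Big_obj x = (pos == 1 ? a.x : a.y);
    return x.elem * x.elem;
}

int foo_ref(const Data& a, int pos)
{
    // Binds a const reference: no copy and no allocation,
    // regardless of what the optimizer manages to prove.
    const Big_obj& x = (pos == 1 ? a.x : a.y);
    return x.elem * x.elem;
}
```

Both functions return the same value, but only foo_ref() is guaranteed to leave g_new_calls untouched. That is the practical argument for writing const Big_obj& x in the first place rather than relying on the compiler to remove the copy.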

With other compilers you have to study the assembly code (e.g. with clang++'s -S switch), but the same holds true.

I tried that with GCC. Interestingly, GCC does not even call the function and computes the result directly :) https://www.dropbox.com/s/k5bhbvg1ino0mro/optimize.cpp and https://www.dropbox.com/s/4fiyhzx5zix6d88/main.cpp — uchar, Jun 07 '14 at 10:58
