Optimize non-const variable access

Question

There's an interesting optimization problem I'm facing.

In a large code base with many classes, the value of a non-constant global (file-scope) variable is read and examined in many places, and I want to avoid unnecessary memory accesses to this variable.

This variable is initialized once, but because of the complexity of its initialization and the need to call a number of functions, it cannot be initialized like this, before execution of main():

unsigned size = 1000;

int main()
{
  // some code
}

or

unsigned size = CalculateSize();

int main()
{
  // some code
}

Instead it has to be initialized like this:

unsigned size;

int main()
{
  // some code
  size = CalculateSize();
  // lots of code (statically/dynamically created class objects, whatnot)
  // that makes use of "size"
  return 0;
}

Because size isn't a constant, is global (file scope), and the code is large and complex, the compiler is unable to infer that size never changes after size = CalculateSize();. The compiler generates code that fetches and refetches the value of size from the variable and can't "cache" it in a register or in a local (on-stack) variable that's likely to be in the CPU's d-cache together with other frequently accessed local variables.

So, if I have something like the following (a made-up example for illustrative purposes):

  size = CalculateSize();
  if (size > 200) blah1();
  blah2();
  if (size > 200) blah3();

The compiler thinks that blah1() and blah2() may change size and it generates a memory read from size in if (size > 200) blah3();.

I'd like to avoid that extra read whenever and wherever possible.
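For comparison, here is a minimal sketch of the "cache it in a local" idea applied to the snippet above. It assumes the code in question can afford one explicit copy. The function bodies, the calls counter, and the names run and cachedSize are hypothetical and exist only so the sketch has an observable effect.

```cpp
unsigned size;              // file-scope variable from the question, written once
static unsigned calls = 0;  // hypothetical counter, only to make the effect observable

// Hypothetical placeholder bodies; the real functions are opaque to the compiler.
void blah1() { ++calls; }
void blah2() { ++calls; }
void blah3() { ++calls; }

unsigned CalculateSize() { return 255; }  // placeholder for the real computation

// Reads the global once into a local const and uses only the copy afterwards.
unsigned run()
{
    size = CalculateSize();
    const unsigned cachedSize = size;  // the only read of the global in this function
    if (cachedSize > 200) blah1();
    blah2();
    if (cachedSize > 200) blah3();     // no reload of ::size is needed here
    return calls;
}
```

Since the address of cachedSize never escapes, the called functions cannot modify it, so the compiler is free to keep it in a register or a stack slot. The cost is that every function that wants this benefit needs its own copy.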

Obviously, hacks like this:

const unsigned size = 0;

int main()
{
  // some code
  *(unsigned*)&size = CalculateSize();
  // lots more code
}

won't do as they invoke undefined behavior.

The question is how to inform the compiler that it can "cache" the value of size once size = CalculateSize(); has been performed, without invoking undefined behavior, unspecified behavior or, hopefully, implementation-defined behavior.

This is needed for C++03 and g++ (4.x.x). C++11 may or may not be an option; I'm not sure. I'm trying to avoid advanced/modern C++ features to stay within the coding guidelines and predefined toolset.

So far I've only come up with a hack to create a constant copy of size within every class that's using it and use the copy, something like this (decltype makes it C++11, but we can do without decltype):

#include <iostream>

using namespace std;

volatile unsigned initValue = 255;
unsigned size;

#define CACHE_VAL(name) \
const struct CachedVal ## name \
{ \
  CachedVal ## name() { this->val = ::name; } \
  decltype(::name) val; \
} _CachedVal ## name;

#define CACHED(name) \
  _CachedVal ## name . val

class C
{
public:
  C() { cout << CACHED(size) << endl; }
  CACHE_VAL(size);
};

int main()
{
  size = initValue;
  C c;
  return 0;
}

The above may only help up to a point. Are there better and more suggestive-to-the-compiler alternatives that are legal C++? Hoping for a minimally intrusive (source-code-wise) solution.
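A C++03-only variant of the same per-class copy, without decltype or macros, might look like the sketch below. The member name cachedSize and the accessors cached and isLarge are hypothetical. Like the macro version, it only copies the global once per object and does not guarantee that the compiler keeps the copy in a register.

```cpp
unsigned size;  // file-scope variable from the question

class C
{
public:
    // One read of the global per constructed object.
    C() : cachedSize(::size) {}

    // Hypothetical accessors that use the copy instead of ::size.
    unsigned cached() const { return cachedSize; }
    bool isLarge() const { return cachedSize > 200; }

private:
    const unsigned cachedSize;  // const member: set in the constructor, never reassigned
};
```

A const data member makes the class non-assignable, so this only fits classes that do not need operator=.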

UPDATE: To make it a bit clearer, this is in a performance-sensitive application. I'm not trying to get rid of unnecessary reads of that particular variable on a whim. I'm trying to let/make the compiler produce more optimal code. Any solution that involves reading/writing another variable as often as size, or that adds code (especially branches and conditional branches) executed as often as size is referred to, is also going to affect performance. I don't want to win in one place only to lose the same or even more in another place.

Here's a related non-solution, causing UB (at least in C).

Have you tried `const_cast`? Looks like this is the perfect use case for it. — Vinícius Gobbo A. de Oliveira, Nov 10 '13 at 00:45
@ViníciusGobboA.deOliveira What exactly are you suggesting? — Alexey Frunze, Nov 10 '13 at 00:51
@AlexeyFrunze You should really try to profile with the return of a static const variable approach - check my edited answer. — ScarletAmaranth, Nov 10 '13 at 02:43
Just read the documentation of `const_cast`, and it cannot be used for this. It will cause undefined behavior, meaning it is not an option. — Vinícius Gobbo A. de Oliveira, Nov 10 '13 at 18:53

score 2 · Answer 1 · answered Nov 10 '13 at 00:45

C++ has the register keyword, which tells the compiler you plan on using a variable a lot. I don't know about the compiler you're using, but most modern compilers do that for the user anyway, keeping a variable in a register when needed. You can also declare the variable as constant and initialize it using const_cast.

Paweł Stawarz

I don't think `register` is very meaningful these days. Further, you can't use `register` at file scope. – Alexey Frunze Nov 10 '13 at 00:53
It's not, that's why I said 'modern compilers do that for the users'. But I don't know about the compiler @Alexey Frunze is using. I didn't know about the file scope, good to know! – Paweł Stawarz Nov 10 '13 at 00:56
I meant @carl is using. No idea why it changed to your name, and I can't edit the comment. – Paweł Stawarz Nov 10 '13 at 01:49
I don't really know g++, and can't really find info about `register` in it. But I guess the best way to find out is to just add the keyword and test! – Paweł Stawarz Nov 10 '13 at 02:49

ScarletAmaranth · Answer 2 · 2013-11-10T02:40:46.547

#include <iostream>

unsigned calculate() {
    std::cout<<"calculate()\n";
    return 42;
}

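// Top-level const on a by-value return type has no effect, so it is omitted.
unsigned mySize() {
    std::cout<<"mySize()\n";
    // Initialized on the first call only; later calls skip calculate().
    static const unsigned someSize = calculate();
    return someSize;
}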

int main() {
    std::cout<<"main()\n";
    mySize();
}

prints:

main()  
mySize()  
calculate()

on GCC 4.8.0

The cost of checking whether it has already been initialized will be almost fully mitigated by the branch predictor. You will end up having one false and a quadrillion trues afterwards.

Yes, you will still have to read that initialization state on every call, potentially wreaking havoc in the caches, but you can't be sure unless you profile. Also, the compiler can likely do some extra magic for you (which is what you're looking for), so I suggest you first compile and profile with this approach before discarding it entirely.
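The initialization check discussed here can be pictured roughly as the hand-written code below. This is only a conceptual sketch: the real guard is generated by the compiler and may include extra logic (for example for thread safety). The names mySizeExpanded, initialized, and calculateCalls are hypothetical.

```cpp
static unsigned calculateCalls = 0;  // hypothetical counter, only to observe the behavior

unsigned calculate()
{
    ++calculateCalls;
    return 42;
}

// Hand-written approximation of a function-local static initialization.
unsigned mySizeExpanded()
{
    static bool initialized = false;  // stands in for the hidden guard variable
    static unsigned someSize = 0;
    if (!initialized) {               // this check runs on every call
        someSize = calculate();       // executed on the first call only
        initialized = true;
    }
    return someSize;
}
```

Every call pays for one load of the guard and one well-predicted branch, in addition to the load of the value itself, which is the cost that profiling would have to weigh against re-reading the global.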

Carl · Answer 3 · 2013-11-10T02:08:32.977

0

What about:

unsigned getSize()
{
  // Initialized once, on the first call to getSize().
  static const unsigned size = calculateSize();
  return size;
}

This will delay the initialization of size until the first call to getSize(), but still keep it const.

GCC 4.8.2

edited Nov 10 '13 at 02:08

answered Nov 10 '13 at 01:03

Carl


Isn't `calculateSize()` going to be called before `main()`? – Alexey Frunze Nov 10 '13 at 01:07
@AlexeyFrunze It will be called the first time you call `getSize()` – Etherealone Nov 10 '13 at 01:17
OK, but then there's another question... In order to distinguish between the states of that static variable (initialized vs uninitialized), there must be some code to read another variable from memory, right? – Alexey Frunze Nov 10 '13 at 01:25
"there must be some code to read another variable from memory, right?" I don't quite understand what you mean. As it is, the function is sort of a fire-and-forget. As long as you can ensure that the first time you call it that calculateSize() will work, you don't really need to know if it has been called... Unless you do :). Could you explain more? – Carl Nov 10 '13 at 01:31
Obviously, the static variable must be initialized only once. How does `getSize()` know if this initialization has already occurred or not? It must maintain state to be able to answer this question. The state is in memory somewhere. So, instead of reading `size` many times we may end up reading that state many times. Also, it's not only memory reads that count, it's also the code that does conditional branching and any code to work with that initialized-or-uninitialized state. All of that impairs performance. – Alexey Frunze Nov 10 '13 at 01:52
Ah, I get it, but I have no idea. – Carl Nov 10 '13 at 02:05
