I'm facing an interesting optimization problem.
In a large code base with a large number of classes, the value of a non-constant global (=file scope) variable is read very frequently and in many places, and I want to avoid the unnecessary memory accesses to it.
This variable is initialized once, but because its initialization is complex and needs to call a number of functions, it cannot be initialized before `main()` executes, like this:
unsigned size = 1000;
int main()
{
    // some code
}
or
unsigned size = CalculateSize();
int main()
{
    // some code
}
Instead it has to be initialized like this:
unsigned size;
int main()
{
    // some code
    size = CalculateSize();
    // lots of code (statically/dynamically created class objects, whatnot)
    // that makes use of "size"
    return 0;
}
Just because `size` isn't a constant, is global (=file scope), and the code is large and complex, the compiler is unable to infer that `size` never changes after `size = CalculateSize();`. The compiler generates code that fetches and refetches the value of `size` from the variable and can't "cache" it in a register or in a local (on-stack) variable that's likely to be in the CPU's d-cache together with other frequently accessed local variables.
So, if I have something like the following (a made-up example for illustrative purposes):
size = CalculateSize();
if (size > 200) blah1();
blah2();
if (size > 200) blah3();
The compiler assumes that `blah1()` and `blah2()` may change `size`, so it generates a memory read of `size` for `if (size > 200) blah3();`.
I'd like to avoid that extra read whenever and wherever possible.
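To make the re-read concrete, here is a minimal sketch contrasting the two access patterns. `UseGlobal`, `UseLocalCopy`, the `calls` counter, and the bodies of `CalculateSize` and the `blah` functions are hypothetical stand-ins, not code from the real project. Copying into a local `const` is only the per-function manual workaround, not the general mechanism being asked for.

```cpp
// Hypothetical stand-ins for the names used in the example above.
unsigned size;                  // the global from the question
static unsigned calls = 0;      // counts calls, only to make the sketch observable

unsigned CalculateSize() { return 255; }    // assumed value, for illustration
void blah1() { ++calls; }
void blah2() { ++calls; }
void blah3() { ++calls; }

// Reads the global for each comparison. If blah1()/blah2() are opaque to the
// compiler (for example, defined in another translation unit), it has to
// assume they may have modified size and reload it before the second test.
void UseGlobal()
{
    if (size > 200) blah1();
    blah2();
    if (size > 200) blah3();
}

// Copies the global into a local const first. The local's address never
// escapes, so no callee can change it, and the compiler is free to keep it
// in a register across the calls.
void UseLocalCopy()
{
    const unsigned cachedSize = size;
    if (cachedSize > 200) blah1();
    blah2();
    if (cachedSize > 200) blah3();
}
```

After `size = CalculateSize();`, both functions make the same three calls; the only difference is where the second comparison gets its value from. In this single-file sketch the compiler can see the callees anyway, so the difference would only show up in generated code when they live elsewhere.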
Obviously, hacks like this:
const unsigned size = 0;
int main()
{
    // some code
    *(unsigned*)&size = CalculateSize();
    // lots more code
}
won't do as they invoke undefined behavior.
The question is how to inform the compiler that it can "cache" the value of `size` once `size = CalculateSize();` has been performed, and to do so without invoking undefined behavior, unspecified behavior and, hopefully, implementation-defined behavior.
This is needed for C++03 and g++ (4.x.x). C++11 may or may not be an option; I'm not sure. I'm trying to avoid advanced/modern C++ features to stay within the coding guidelines and the predefined toolset.
So far I've only come up with a hack: create a constant copy of `size` within every class that uses it and use the copy, something like this (`decltype` makes it C++11, but we can do without `decltype`):
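#include <iostream>

volatile unsigned initValue = 255;
unsigned size;

// Declares a const member that holds a copy of the global ::name,
// taken when the enclosing object is constructed.
// (A leading underscore followed by an uppercase letter is a reserved
// identifier, so the member is named cachedVal<name> instead.)
#define CACHE_VAL(name) \
    const struct CachedVal ## name \
    { \
        CachedVal ## name() { this->val = ::name; } \
        decltype(::name) val; \
    } cachedVal ## name

// Reads the cached copy.
#define CACHED(name) \
    cachedVal ## name . val

class C
{
public:
    C() { std::cout << CACHED(size) << std::endl; }
    CACHE_VAL(size);
};

int main()
{
    // Qualified, and without "using namespace std", so the name cannot
    // collide with std::size on newer standards.
    ::size = initValue;
    C c;   // prints 255
    return 0;
}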
The above may only help up to a point. Are there better and more suggestive-to-the-compiler alternatives that are legal C++? Hoping for a minimally intrusive (source-code-wise) solution.
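For reference, a sketch of how the same per-class copy could look without `decltype`, so that it stays within C++03. The two-argument form of `CACHE_VAL` and the `Get()` accessor are assumptions made for illustration; the type is simply passed in explicitly.

```cpp
unsigned size;   // the global from the question

// C++03 variant: the type is spelled out by the caller instead of being
// deduced with decltype. The generated names avoid a leading underscore
// followed by an uppercase letter, which is reserved for the implementation.
#define CACHE_VAL(type, name)                   \
    const struct CachedVal_##name               \
    {                                           \
        CachedVal_##name() : val(::name) {}     \
        const type val;                         \
    } cachedVal_##name

#define CACHED(name) cachedVal_##name.val

class C
{
public:
    C() {}
    // Hypothetical accessor, only here to show a read of the cached copy.
    unsigned Get() const { return CACHED(size); }
    CACHE_VAL(unsigned, size);
};
```

Each `C` object takes a snapshot of `::size` when it is constructed, so a later write to the global is not seen through `CACHED(size)`. The cost is one extra member per object, and the read still goes through `this`, which is why this probably only helps up to a point.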
UPDATE: To make it a bit clearer, this is a performance-sensitive application. I'm not trying to get rid of unnecessary reads of that particular variable on a whim; I'm trying to let/make the compiler produce more optimal code. Any solution that involves reading/writing another variable as often as `size`, or any additional code (especially with branching and conditional branching) executed as often as `size` is referred to, is also going to affect performance. I don't want to win in one place only to lose the same or even more in another place.
Here's a related non-solution, causing UB (at least in C).