C/C++ optimizing away checks to see if a function has already been run before

Question

Let's say you have a function in C/C++, that behaves a certain way the first time it runs. And then, all other times it behaves another way (see below for example). After it runs the first time, the if statement becomes redundant and could be optimized away if speed is important. Is there any way to make this optimization?

bool val = true; 

void function1() {

   if (val == true) {
      // do something
      val = false; 
   }
   else {
      // do other stuff, val is never set to true again 
   }

}

This is in the domain of self-modifying code. I doubt you'll be able to do it directly in c or c++ — Bwmat, Oct 19 '12 at 21:50
sure you can. static local boolean will tell the compiler you want a specific bit to run once and only once. — Mooing Duck, Oct 20 '12 at 00:10
If this is for a laptop/desktop CPU, then the answer is "It doesn't matter because at most a handful of jumps will be mispredicted by any CPU now in use, each wasting ~1ns". — j_random_hacker, Oct 20 '12 at 01:07
@j_random_hacker however, there are micro controllers, ICs, mobile devices, ... where even a single cycle may not be nice to lose. µCs also tend to not have branch prediction — dualed, Oct 20 '12 at 01:48
I often wonder if just this kind of thing is handled by typical JIT compilers. Good question. — hippietrail, Oct 20 '12 at 05:03
@hippietrail, Hotspot (JVM) use self-modifable code as the norm, in java7 there is INVOKE_DYNAMIC that when implemented correctly will do what the OP wants and it's part of the language (not really java but java byte code). `static final` in java tend to be turned to constant and then constant folded, the code needs a bit extra trickery but it's totally possible to result in code that has no load of `val` and subsequent branch. — bestsss, Oct 24 '12 at 11:59
Thanks @bestsss. I wonder if any of the current JavaScript jits also implement this. I would hope so. — hippietrail, Oct 24 '12 at 12:07
@hippietrail, again the original code won't be optimized by hotspot as is, you need `static final` that would require another class to load and during the class load to execute the one-time stuff. Even unoptimized the code is fast enough when the branch is predicted, which should be over 99% if the code is 'hot', if it's not 'hot' by definition there is no point to optimize it at all. — bestsss, Oct 24 '12 at 12:12
I think some of the answers and comments made a good point that plenty of embedded devices don't have branch prediction. I'm not even sure if ARM has it. — hippietrail, Oct 24 '12 at 12:16
@hippietrail, *I'm not even sure if ARM has it* sure it does; But I do not think any JIT language works on a CPU w/o branch prediction. If the performance matters so much, self-modifying code aint so hard if you have access to assembler to the said platform. Back in the old days of 6502 modifying the address in the code (or especially a counter) was quite a standard practice. — bestsss, Oct 24 '12 at 13:57
Yeah I remember self modifying code in games and demos back in the 8-bit days and even on the early versions of the Motorola 680x0. I wonder if there are any compilers for embedded CPUs that use it. I might ask (-: — hippietrail, Oct 24 '12 at 14:08
@bestsss: [Is there any self-improving compiler around?](http://stackoverflow.com/questions/1585067) — hippietrail, Oct 24 '12 at 14:32

score 17 · Answer 1 · answered Oct 19 '12 at 21:53

17

gcc has a builtin function that let you inform the implementation about branch prediction:

 __builtin_expect

http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

For example in your case:

bool val = true; 

void function1()
{
    if (__builtin_expect(val, 0)) {
       // do something
       val = false; 
    }
    else {
      // do other stuff, val is never set to true again 
    }
}

answered Oct 19 '12 at 21:53

ouah

142,963
15
272
331

2

Useful to know about `__builtin_expect`, but it will have negligible effect here: at most a handful of jumps will be mispredicted before the branch predictor "realises" what's happening. It seems like it would be more useful when there's a bias towards one of the two outcomes but no discernible regular pattern. – j_random_hacker Oct 20 '12 at 01:03
1

Best answer I see here. No premature optimization, just giving the compiler additional information. – dualed Oct 20 '12 at 01:51
1

@j_random_hacker: unless the function calls are far enough apart that the instruction is flushed out of cache between runs. Also worth noting that this flag may result in reordering optimizations to speed up the expected branch. – nneonneo Oct 20 '12 at 05:38
Additional to the __builtin_expect, which I'd only add if this is really heavily used code and there is really no other way to solve it, it's good practice to put the "usual" case in the if- and the "seldom" case in the else-branch. Might not help on all target platforms, but can make a difference on some. – Andreas Oct 20 '12 at 08:15
@nneonneo: If it's flushed out of cache, it must be running relatively rarely, suggesting that it's not worth fretting over an extra cycle our two. Might still be worthwhile for embedded work though. – j_random_hacker Oct 22 '12 at 06:16

score 16 · Accepted Answer · answered Oct 19 '12 at 21:50

16

You should only make the change if you're certain that it truly is a bottleneck. With branch-prediction, the if statement is probably instant, since it's a very predictable pattern.

That said, you can use callbacks:

#include <iostream>
using namespace std;
typedef void (*FunPtr) (void);
FunPtr method;
void subsequentRun()
{
    std::cout << "subsequent call" << std::endl;
}
void firstRun()
{
    std::cout << "first run" << std::endl;
    method = subsequentRun;  
}
int main()
{
    method = firstRun;
    method();
    method();
    method();
}

produces the output:

first run
subsequent call
subsequent call

answered Oct 19 '12 at 21:50

Luchian Grigore

253,575
64
457
625

3

I'd want to benchmark whether this actually makes things better or worse. If the compiler now has to treat the target of the call as unknown, having the target contain one fewer branch could be completely outweighed by the inability to apply optimizations across the call. – Ben Oct 20 '12 at 10:44
@Ben definitely. I doubt there'll be any difference at all. – Luchian Grigore Oct 20 '12 at 12:12

score 9 · Answer 3 · answered Oct 19 '12 at 21:50

9

You could use a function pointer but then it will require an indirect call in any case:

void (*yourFunction)(void) = &firstCall;

void firstCall() {
 ..
 yourFunction = &otherCalls;
}

void otherCalls() {
 ..
}

void main()
{
  yourFunction();
}

answered Oct 19 '12 at 21:50

Jack

131,802
30
241
343

2

I don't know for sure, but I suspect this is slower than the original code. – GManNickG Oct 19 '12 at 22:52
It is as slow as _roughly_ a virtual call. It could be slower indeed, I specified the fact that it needs an indirect call. – Jack Oct 19 '12 at 23:15
Right, my point is that he's asking to avoid the overhead, not introduce more*. (*Still TBD, but I'm pretty sure it's worse.) – GManNickG Oct 19 '12 at 23:22
indirect call would be worse on most architectures, it requires the load of the address to have been completed to fetch the next instructions. On the original case the branch prediction will be successful like 99.9% so it will cost just the load of the `val` – bestsss Oct 24 '12 at 12:08

score 4 · Answer 4 · answered Oct 19 '12 at 21:47

One possible method is to compile two different versions of the function (this can be done from a single function in the source with templates), and use a function pointer or object to decide at runtime. However, the pointer overhead will likely outweigh any potential gains unless your function is really expensive.

score 4 · Answer 5 · answered Oct 19 '12 at 21:51

4

You could use a static member variable instead of a global variable..

Or, if the code you're running the first time changes something for all future uses (eg, opening a file?), you could use that change as a check to determine whether or not to run the code (ie, check if the file is open). This would save you the extra variable. Also, it might help with error checking - if for some reason the initial change is be unchanged by another operation (eg, the file is on removable media that is removed improperly), your check could try to re-do the change.

answered Oct 19 '12 at 21:51

Sveltely

822
1
8
22

1

What's the advantage of a static over a global? – Luchian Grigore Oct 19 '12 at 21:51
2

in a large project, global variables tend to clutter up rather quickly.. I'd say it's generally just good coding practice, though there's probably some memory access advantage.. (don't quote me) – Sveltely Oct 19 '12 at 21:56
3

@LuchianGrigore The advantage is two-fold. First, cache locality. See http://en.wikipedia.org/wiki/Locality_of_reference. Second, if you use a local static instead of a global, the compiler might be able to more easily deduct that its value will not change again, and thus optimize accordingly. – Nikos C. Oct 19 '12 at 21:56
3

@Nikos: cache locality *with what*? What it is that the local static will be closer to and the global further away from? – Steve Jessop Oct 19 '12 at 22:42
@Steve Hmm, that was a brain fart on my part. The var is static to begin with, so it ends up in the data segment or BSS anyway. No cache locality can apply. If it weren't static (doesn't make sense of course, but anyway) then having the var defined close to the place you use it helps avoid cache misses. – Nikos C. Oct 19 '12 at 22:58

score 1 · Answer 6 · answered Oct 19 '12 at 21:50

1

A compiler can only optimize what is known at compile time.

In your case, the value of val is only known at runtime, so it can't be optimized.

The if test is very quick, you shouldn't worry about optimizing it.

answered Oct 19 '12 at 21:50

alestanis

21,519
4
48
67

score 1 · Answer 7 · answered Oct 22 '12 at 14:14

If you'd like to make the code a little bit cleaner you could make the variable local to the function using static:

void function() {
    static bool firstRun = true;
    if (firstRun) {
        firstRun = false;
        ...
    }
    else {
        ...
    }
}

On entering the function for the first time, firstRun would be true, and it would persist so each time the function is called, the firstRun variable will be the same instance as the ones before it (and will be false each subsequent time).

This could be used well with @ouah's solution.

hexist · Answer 8 · 2012-10-19T21:53:08.353

Compilers like g++ (and I'm sure msvc) support generating profile data upon a first run, then using that data to better guess what branches are most likely to be followed, and optimizing accordingly. If you're using gcc, look at the -fprofile-generate option.

The expected behavior is that the compiler will optimize that if statement such that the else will be ordered first, thus avoiding the jmp operation on all your subsequent calls, making it pretty much as fast as if it wern't there, especially if you return somewhere in that else (thus avoiding having to jump past the 'if' statements)

score 0 · Answer 9 · answered Oct 19 '12 at 22:02

One way to make this optimization is to split the function in two. Instead of:

void function1()
{
    if (val == true) {
        // do something
        val = false; 
    } else {
       // do other stuff
    }
}

Do this:

void function1()
{
    // do something
}

void function2()
{
   // do other stuff
}

score 0 · Answer 10 · edited May 23 '17 at 12:15

One thing you can do is put the logic into the constructor of an object, which is then defined static. If such a static object occurs in a block scope, the constructor is run the fist time that an execution of that scope takes place. The once-only check is emitted by the compiler.

You can also put static objects at file scope, and then they are initialized before main is called.

I'm giving this answer because perhaps you're not making effective use of C++ classes.

(Regarding C/C++, there is no such language. There is C and there is C++. Are you working in C that has to also compile as C++ (sometimes called, unofficially, "Clean C"), or are you really working in C++?)

What is "Clean C" and how does it differ from standard C?

score 0 · Answer 11 · answered Oct 20 '12 at 06:57

To remain compiler INDEPENDENT you can code the parts of if() in one function and else{} in another. almost all compilers optimize the if() else{} - so, once the most LIKELY being the else{} - hence code the occasional executable code in if() and the rest in a separate function that's called in else

C/C++ optimizing away checks to see if a function has already been run before

11 Answers11