Reducing Instruction Cache misses (in C++)

Question

Let's say I have a C++ class whose implementation looks something like this:

// ...

void MyClass::iterativeFunction() {
     for (int i = 0; i < 1000000; i++) {
          performAction(i);
     }
}

void MyClass::performAction(int index) {
     // Block of code (non-inline-able)
}

// ...

At the C++ level, do I have any control over the spatial locality of these methods, or do I just have to hope the compiler will notice the related methods and optimise its assembly accordingly? Ideally I would like them to be right next to each other so they will be loaded into the instruction cache together, but I have no idea how to let the compiler know I'd really like this to happen.

"I would like them to be right next to each other so they will be loaded into the instruction cache together." That's not how any modern CPU's instruction cache works. It doesn't fetch code just because it happens to be near other code. It fetches code because that code is invoked. — David Schwartz, Nov 01 '12 at 04:00
Mark 'performAction' as inline/_forceinline/__pleasepleaseinline? — James, Nov 01 '12 at 04:06
With whole-program optimization, a compiler is quite likely to spot that the only caller of `MyClass::performAction` is `iterativeFunction`, and still inline it. Functions with only one caller have a much lower threshold for inlining. — MSalters, Nov 01 '12 at 09:50
Unless you have >4kb executable code, in which case code locality _might_ matter, though at a completely different level. — Mooing Duck, Jul 10 '13 at 02:48
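
For illustration, the inlining suggestions in the comments above could look roughly like the sketch below. The `FORCE_INLINE` macro, the `total_` member, and the body of `performAction` are made up for the example and are not from the question; `__forceinline` (MSVC) and `__attribute__((always_inline))` (GCC/Clang) are compiler extensions, not standard C++, and the question states that the real body is not inline-able.

```cpp
#include <cstdint>

// Hypothetical portability macro (an assumption for this sketch).
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // plain hint; the compiler may ignore it
#endif

class MyClass {
public:
    // Runs the loop from the question and returns the accumulated value.
    std::uint64_t iterativeFunction() {
        for (int i = 0; i < 1000000; i++) {
            performAction(i);
        }
        return total_;
    }

private:
    // Defined in the class body so the definition is visible at the call
    // site, which the compiler needs in order to inline it.
    FORCE_INLINE void performAction(int index) {
        // Placeholder work standing in for the real block of code.
        total_ += static_cast<std::uint64_t>(index);
    }

    std::uint64_t total_ = 0;
};
```

With this placeholder body, `MyClass().iterativeFunction()` returns the sum of 0 through 999999. Whether forcing the inline helps at all is something to measure rather than assume.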

score 5 · Accepted Answer · answered Nov 01 '12 at 04:03

In either case, the code can't run until it gets into the cache. In either case, it will be equally obvious to the CPU where the code flow goes because the flow is unconditional. So it won't make any difference. A modern code cache doesn't fetch ahead in address space, it fetches ahead in the instruction flow, following unconditional branches and predicting conditional branches as necessary.

So there is no reason to care about this. It won't make any difference.

answered Nov 01 '12 at 04:03

David Schwartz


Ah okay, thank you. I didn't realise that the instruction fetch follows the instruction flow. That makes my life quite a bit easier. – Ephemera Nov 01 '12 at 05:00

score 2 · Answer 2 · answered Nov 01 '12 at 03:56

Technically speaking, no. However, on modern-ish processors you don't generally need to worry about the instruction cache nearly as much as you do the data cache, unless you have a very large executable or really horrific branches everywhere.

The reason is that cache lines are only around 64 bytes long, which means that if your methods are larger than 64 bytes (and they are), they will need to be loaded into multiple cache entries even if they are directly next to each other in physical memory.
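
To put rough numbers on that, here is a small sketch of the arithmetic. The 64-byte line size is an assumption (typical on x86-64, not universal), and `kCacheLineBytes` and `linesSpanned` are names invented for this example.

```cpp
#include <cstddef>

// Assumed cache line size in bytes; check your target CPU.
constexpr std::size_t kCacheLineBytes = 64;

// Number of cache lines touched by `size` bytes of code that start at
// byte offset `start` (offset taken relative to a line-aligned address).
constexpr std::size_t linesSpanned(std::size_t start, std::size_t size) {
    return size == 0
        ? 0
        : (start + size - 1) / kCacheLineBytes - start / kCacheLineBytes + 1;
}

// A function that exactly fills one aligned line occupies one line.
static_assert(linesSpanned(0, 64) == 1, "one aligned line");
// A 200-byte function starting on a line boundary occupies four lines.
static_assert(linesSpanned(0, 200) == 4, "aligned 200-byte function");
// The same function starting 60 bytes into a line spills into a fifth.
static_assert(linesSpanned(60, 200) == 5, "misaligned 200-byte function");
```

So a method of a few hundred bytes spans several lines on its own, regardless of what sits next to it.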

It makes no sense to think like the OP is when coding for any modern CPU. — David Schwartz, Nov 01 '12 at 04:01

score 1 · Answer 3 · answered Nov 01 '12 at 03:54

If you need that level of control and optimization then C++ is the wrong language for you.

But the actual answer to your question is "No".

answered Nov 01 '12 at 03:54

John3136


score 0 · Answer 4 · answered Nov 01 '12 at 03:53

No, as far as I know there is no way for you to specify the location of your methods. If C++ allowed nested procedures that would be one way to ensure that the called procedure was local.
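
The closest thing C++11 and later offer to a nested procedure is a lambda defined inside the calling function. The sketch below assumes a made-up body for `performAction` and a local `total` accumulator, neither of which is from the question. It keeps the helper local in scope only: the standard says nothing about where the compiler places the generated code.

```cpp
#include <cstdint>

class MyClass {
public:
    std::uint64_t iterativeFunction() {
        std::uint64_t total = 0;

        // Local lambda standing in for performAction; it is visible only
        // inside this function. Its body is placeholder work.
        auto performAction = [&total](int index) {
            total += static_cast<std::uint64_t>(index);
        };

        for (int i = 0; i < 1000000; i++) {
            performAction(i);
        }
        return total;
    }
};
```

In practice a small lambda like this is usually inlined by an optimising compiler, but that is a property of the optimiser, not a guarantee.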

answered Nov 01 '12 at 03:53

Shark8

