Runtime performance (speed) optimization -- Cache size consideration

Question

What are the basic tips and tricks that a C++ programmer should know when trying to optimize his code with caching in mind?

Here's something to think about:

For instance, I know that reducing a function's footprint would make the code run a bit faster, since fewer instructions overall would have to sit in the processor's instruction cache (I$).
When allocating an std::array<char, <size>>, what would be the ideal size to make reads and writes to the array faster?
How big does an object have to be before you decide to put it on the heap instead of the stack?

First thing to consider is how to measure WHERE your code is spending its time (from a cpu or response time performance point-of-view). That way, you spend your real life time working on improving things that will have a positive impact on your users. — ErstwhileIII, Aug 23 '14 at 14:16

maxy · Answer 1 · 2014-08-23T20:11:46.027

In most cases, knowing the correct answer to your question will gain you less than 1% overall performance.

Some (data-)cache optimizations that come to my mind are:

For arrays: use less RAM. Try shorter data types or a simple compression algorithm like RLE. This can save CPU time as well, or conversely waste CPU cycles on data type conversions. Floating-point to integer conversions in particular can be quite expensive.
Avoid access to the same cacheline (usually around 64 bytes) from different threads, unless all access is read-only.
Group members that are often used together next to each other. Prefer sequential access to random access.
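To make two of the points above concrete, here is a minimal sketch, not a definitive implementation. It assumes a 64-byte cache line (check your target), and the names kCacheLine, PaddedCounter and sum_row_major are made up for illustration. The padded struct gives each thread's counter its own cache line, and the loop shows what "sequential access" looks like for a row-major matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 64-byte cache lines. C++17 has
// std::hardware_destructive_interference_size, but not every standard
// library ships it, so a plain constant is used here.
constexpr std::size_t kCacheLine = 64;

// One counter per thread, aligned (and therefore padded) to a full cache
// line, so writes from different threads never touch the same line.
struct alignas(kCacheLine) PaddedCounter {
    std::uint64_t value = 0;
};
static_assert(sizeof(PaddedCounter) == kCacheLine, "one counter per line");

// Sequential traversal of a row-major matrix: the inner loop walks
// adjacent addresses, so every fetched cache line is fully used.
inline std::uint64_t sum_row_major(const std::vector<std::uint32_t>& m,
                                   std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    std::uint64_t total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}
```

Swapping the two loops in sum_row_major gives the same result but jumps cols elements on every step, which is the access pattern to avoid for large matrices. Whether either change matters in your program is something only a measurement can tell you.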

If you really want to know all about caches, read What Every Programmer Should Know About Memory. While I disagree with the title, it's a great in-depth document.

Because your question suggests that you actually expect gains from just following the tips above (in which case you will be disappointed), here are some general optimization tips:

Tip #1: About 90% of your code should be optimized for readability, not performance. If you decide to attempt an optimization for performance, make sure you actually measure the gain. When it is below 5% I usually go back to the more readable version.
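A rough way to measure the gain from Tip #1 could look like the sketch below. The helpers best_time_ms and gain_percent are hypothetical names, not from any library, and a real benchmark needs more care (warm-up, realistic input sizes, making sure the compiler does not optimize the work away).

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Hypothetical helper: runs `fn` `repeats` times and returns the fastest
// run in milliseconds. Taking the minimum reduces noise from other processes.
template <typename Fn>
double best_time_ms(Fn&& fn, int repeats = 5) {
    using Clock = std::chrono::steady_clock;
    double best = std::numeric_limits<double>::max();
    for (int i = 0; i < repeats; ++i) {
        const auto start = Clock::now();
        fn();
        const std::chrono::duration<double, std::milli> elapsed =
            Clock::now() - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}

// Relative gain of the optimized version, in percent of the baseline time.
inline double gain_percent(double baseline_ms, double optimized_ms) {
    return 100.0 * (baseline_ms - optimized_ms) / baseline_ms;
}
```

You would time the readable version and the optimized version with best_time_ms on the same input, then compare: gain_percent(100.0, 95.0) is 5.0, which is right at the threshold mentioned above.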

Tip #2: If you have an existing codebase, profile it first. If you don't profile it, you will miss some very effective optimizations. Usually there are some calls to time-consuming functions that can be completely eliminated, or the result cached.

If you don't want to use a profiler, at least print the current time in a couple of places, or interrupt the program with a debugger a couple of times to check where it is most often spending its time.
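For the "print the current time in a couple of places" approach, a small scope timer is usually enough. This is only a sketch; ScopedTimer is a made-up helper, and it reports wall-clock time via std::chrono::steady_clock, not CPU time.

```cpp
#include <chrono>
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical helper: prints how long the enclosing scope took to stderr
// when it goes out of scope.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)), start_(Clock::now()) {}

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

    ~ScopedTimer() {
        std::fprintf(stderr, "%s: %.3f ms\n", label_.c_str(), elapsed_ms());
    }

    // Milliseconds elapsed since construction.
    double elapsed_ms() const {
        return std::chrono::duration<double, std::milli>(
                   Clock::now() - start_).count();
    }

private:
    using Clock = std::chrono::steady_clock;
    std::string label_;
    Clock::time_point start_;
};
```

Put something like `ScopedTimer t("load_config");` at the top of a suspicious function or block and look at which labels dominate the output. It is crude compared to a profiler, but it tells you where to look first.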
