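#include <array>
#include <cstddef>

template <std::size_t size>
class Objects {
    std::array<int, size> a;
    std::array<int, size> b;
    std::array<int, size> c;

public:
    // Element-wise sum of a and b, written into c.
    void update() {
        for (std::size_t i = 0; i < size; ++i) {
            c[i] = a[i] + b[i];
        }
    }
};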
I have been reading about how to write cache-friendly code for a week now. I have read through several articles, but I still haven't understood the basics.
Code like the above is used in most of the examples, but to me it does not look cache-friendly at all.
As I understand it, the memory layout should look like this (with size = 4):
aaaabbbbcccc
and the first iteration of the loop will access
[a]aaa[b]bbb[c]ccc
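To check my assumption about the layout, something like the following sketch could print where each member starts. LayoutProbe, member_offsets and print_layout are names I made up for this; the struct just repeats the three members of Objects publicly so their addresses can be inspected. I would expect a, b and c to appear in that order, but the exact offsets depend on the compiler and on any padding it inserts.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for Objects with public members, so that the
// member addresses can be inspected from outside the class.
template <std::size_t size>
struct LayoutProbe {
    std::array<int, size> a;
    std::array<int, size> b;
    std::array<int, size> c;
};

// Byte offsets of a, b and c from the start of the object.
template <std::size_t size>
std::array<std::ptrdiff_t, 3> member_offsets(const LayoutProbe<size>& p) {
    const char* base = reinterpret_cast<const char*>(&p);
    return {reinterpret_cast<const char*>(&p.a) - base,
            reinterpret_cast<const char*>(&p.b) - base,
            reinterpret_cast<const char*>(&p.c) - base};
}

// Prints the offsets for size = 4, matching the aaaabbbbcccc picture.
void print_layout() {
    LayoutProbe<4> p{};
    const auto off = member_offsets(p);
    std::printf("a at byte %td, b at byte %td, c at byte %td, sizeof = %zu\n",
                off[0], off[1], off[2], sizeof(p));
}
```

Calling print_layout() from main should show whether the three arrays really sit back to back in memory.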
If I understand correctly, the CPU prefetches elements that are nearby in memory. I am not sure how intelligent this mechanism is, but I assume it is primitive and simply fetches the n nearest elements.
The problem is that the pattern [a]aaa[b]bbb[c]ccc
does not access the elements in order at all. So the CPU might fetch the next 3 elements, a[aaa]bbbbcccc,
which is nice for the next a because it will be a cache hit, but it does not help with the b.
Is the Objects::update loop above cache-friendly code?
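To make the access pattern I am asking about concrete, here is a rough model of it. It assumes 64-byte cache lines and 4-byte ints (both are assumptions, not something I have verified for my machine), places the three arrays back to back starting at byte 0, and counts how many distinct cache lines the loop touches. It does not model the prefetcher, cache size, associativity or eviction, so it only describes the pattern and says nothing about real hit rates. AccessStats and simulate_update are names invented for this sketch.

```cpp
#include <cstddef>
#include <set>

// Assumed cache line size in bytes; 64 is common but not universal.
constexpr std::size_t kLineBytes = 64;

struct AccessStats {
    std::size_t accesses;        // total element reads and writes
    std::size_t distinct_lines;  // distinct cache lines touched
};

// Models the loop c[i] = a[i] + b[i] for three back-to-back arrays of
// `size` elements, each `elem_bytes` bytes wide, starting at byte offset 0.
AccessStats simulate_update(std::size_t size, std::size_t elem_bytes = 4) {
    std::set<std::size_t> lines;
    std::size_t accesses = 0;
    for (std::size_t i = 0; i < size; ++i) {
        // Per iteration the loop touches a[i], b[i] and c[i], in that order.
        for (std::size_t array = 0; array < 3; ++array) {
            const std::size_t byte = (array * size + i) * elem_bytes;
            lines.insert(byte / kLineBytes);
            ++accesses;
        }
    }
    return {accesses, lines.size()};
}
```

Under these assumptions, simulate_update(1024) reports 3072 accesses spread over 192 distinct lines, and simulate_update(4) reports 12 accesses that all fall into a single line. What I cannot tell from this model is how the real cache and prefetcher deal with the jumps between a, b and c.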