I am trying to wrap my head around memory accesses to intrinsic types that have or haven't been loaded into registers.
Assume some SIMD functions that accept references to float arrays. For example:
void do_something(std::array<float, 4>& arr);
void do_something_else(std::array<float, 4>& arr);
Each function first loads the data into registers, performs its operation, then stores the result back into the array. Consider the following snippet:
std::array<float, 4> my_arr{0.f, 0.f, 0.f, 0.f};
do_something(my_arr);
do_something_else(my_arr);
do_something(my_arr);
Does the C++ compiler optimize out the unnecessary loads and stores between function calls? Does this even matter?
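To make the load/operate/store pattern concrete, here is a minimal sketch of what such functions might look like. It assumes SSE via `<xmmintrin.h>`. The function bodies (add 1, multiply by 2) are purely hypothetical placeholders for the real work.

```cpp
#include <array>
#include <xmmintrin.h>  // SSE: __m128, _mm_loadu_ps, _mm_storeu_ps, ...

// Hypothetical body: adds 1 to every lane.
void do_something(std::array<float, 4>& arr) {
    __m128 v = _mm_loadu_ps(arr.data());   // load (unaligned) into a register
    v = _mm_add_ps(v, _mm_set1_ps(1.0f));  // operate
    _mm_storeu_ps(arr.data(), v);          // store the result back
}

// Hypothetical body: doubles every lane.
void do_something_else(std::array<float, 4>& arr) {
    __m128 v = _mm_loadu_ps(arr.data());
    v = _mm_mul_ps(v, _mm_set1_ps(2.0f));
    _mm_storeu_ps(arr.data(), v);
}
```

With three consecutive calls, the intermediate `_mm_storeu_ps` / `_mm_loadu_ps` pairs are the loads and stores I am asking about.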
I've seen libraries that wrap an __m128 type in a struct, and call the load in the constructor. What happens when you store these on the heap and try to call intrinsics on them? For example,
struct vec4 {
    vec4(std::array<float, 4>&) {
        // do load
    }
    __m128 data;
};
std::vector<vec4> my_vecs;
// do SIMD work
Do you have to load/store the data on every access? Or should these classes declare a private operator new, so they aren't stored on the heap?
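For concreteness, a minimal sketch of the kind of usage I have in mind. It assumes SSE, a constructor taking a const reference (my simplification of the struct above), and a hypothetical helper `sum_all` that stands in for "do SIMD work":

```cpp
#include <array>
#include <vector>
#include <xmmintrin.h>  // SSE: __m128, _mm_loadu_ps, _mm_add_ps, ...

struct vec4 {
    // The "load" happens once, in the constructor.
    explicit vec4(const std::array<float, 4>& a)
        : data(_mm_loadu_ps(a.data())) {}
    __m128 data;
};

// Hypothetical helper: lane-wise sum of all elements.
inline __m128 sum_all(const std::vector<vec4>& vs) {
    __m128 acc = _mm_setzero_ps();
    for (const vec4& v : vs) {
        // v.data lives in the vector's heap storage at this point.
        acc = _mm_add_ps(acc, v.data);
    }
    return acc;
}
```

Here no explicit load intrinsic is written inside the loop, even though each `v.data` is read from heap memory. That is exactly the case I am unsure about.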