The Intel Intrinsics Guide lists a few intrinsics that store only part of a wide register, such as _mm_maskstore, _mm_mask_store, _mm_mask_compressstoreu and the like.
The question is: is it OK to use them if my thread doesn't own part of the cache line where the store would land, or if the store would extend past the end of the current page?
Example:
#include <atomic>
#include <cstdint>

struct S {
    std::int16_t write_here[10];
    std::atomic<std::int16_t> other_thread_can_use_this;
};
Can I write to write_here with a single SIMD store? Or can that corrupt the data in other_thread_can_use_this (for example, by loading it and then writing it back)?
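
To make the scenario concrete, here is a minimal sketch of the kind of store being asked about. It assumes AVX2 and GCC or Clang (for the target attribute); store_first_10 is a hypothetical helper name, and _mm256_maskstore_epi32 is just one member of the masked-store family picked for illustration. The 10 int16_t elements are covered as 5 32-bit lanes, and the atomic member falls inside the first masked-off lane.

```cpp
#include <atomic>
#include <cstdint>
#include <immintrin.h>

struct S {
    std::int16_t write_here[10];                          // bytes 0..19
    std::atomic<std::int16_t> other_thread_can_use_this;  // bytes 20..21
};

// Hypothetical helper: fill write_here with `value` using one masked store.
// The full 32-byte vector would span bytes 0..31, i.e. it would cover the
// atomic member and run past the end of S, so lanes 5..7 are masked off.
__attribute__((target("avx2")))
void store_first_10(S& s, std::int16_t value) {
    const __m256i v = _mm256_set1_epi16(value);
    // A lane is stored only if the high bit of its mask element is set:
    // lanes 0..4 (20 bytes = 10 x int16_t) are enabled, lanes 5..7 are not.
    const __m256i mask = _mm256_setr_epi32(-1, -1, -1, -1, -1, 0, 0, 0);
    // The cast is only there because the intrinsic takes an int*.
    _mm256_maskstore_epi32(reinterpret_cast<int*>(s.write_here), mask, v);
}
```

Calling store_first_10(s, 42) on a single thread leaves other_thread_can_use_this unchanged, but that alone does not show whether the hardware ever reads and rewrites the masked-off bytes, which is exactly what the question is about.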