My CPU doesn't support AVX512, so unfortunately I can't use the function _mm256_loadu_epi32().
It looks like I can use "_mm256_set_epi32()", but I'm not sure whether it's hopelessly slower than "_mm256_loadu_XXXXXX()". Any ideas? What's the best way to load 32-bit integers from memory into a 256-bit register without AVX512?
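
For concreteness, here is a minimal sketch of the two options I'm comparing. It assumes the data is eight 32-bit ints in a plain array, and that "_mm256_loadu_si256()" (which, as far as I can tell, only needs AVX) is an acceptable stand-in for the AVX512 intrinsic. The helper names load_with_loadu and load_with_set are made up for illustration, and the target attribute is a GCC/Clang extension I use so the file builds without -mavx.

```c
#include <immintrin.h>

/* Hypothetical helper: load 8 ints with one unaligned 256-bit load
   (AVX intrinsic, no AVX512 required), then store them back out so the
   result can be inspected from ordinary code. */
__attribute__((target("avx")))
void load_with_loadu(const int *src, int *out)
{
    __m256i v = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)out, v);
}

/* Hypothetical helper: build the same vector element by element.
   _mm256_set_epi32() takes its arguments from the highest element down
   to the lowest, so src[7] comes first and src[0] last. */
__attribute__((target("avx")))
void load_with_set(const int *src, int *out)
{
    __m256i v = _mm256_set_epi32(src[7], src[6], src[5], src[4],
                                 src[3], src[2], src[1], src[0]);
    _mm256_storeu_si256((__m256i *)out, v);
}
```

Both helpers should leave the same eight values in out, so the question is only about which one is faster. I haven't measured either.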