Suppose we have a CPU with AVX512BW support, and I want to run three kinds of SIMD string-length functions on it.
- The first function takes 16 bytes (AVX) of a string at a time and searches them for the null terminator, repeating until a null terminator is found.
- The second function takes 32 bytes (AVX2) of a string at a time and searches them for the null terminator, repeating until a null terminator is found.
- The third function takes 64 bytes (AVX512BW) of a string at a time and searches them for the null terminator, repeating until a null terminator is found.
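
To make the question concrete, here is a minimal sketch of the first (16-byte) function written with intrinsics. It assumes GCC or Clang (for `__builtin_ctz`) and an x86 target with SSE2; the name `strlen_16` is only illustrative. The same source compiles to the legacy SSE encoding by default and to the VEX-encoded forms (`vmovdqa`, `vpcmpeqb`) when built with `-mavx` or higher, which is exactly the choice I am asking about.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>

/* Hypothetical 16-byte-at-a-time strlen. Assumes GCC/Clang builtins. */
size_t strlen_16(const char *s)
{
    const __m128i zero = _mm_setzero_si128();

    /* Round the pointer down to a 16-byte boundary so that every load is
       aligned and therefore never crosses into an unmapped page. This reads
       a few bytes before the string, which is the usual trick in hand-written
       strlen code but is outside what ISO C formally guarantees. */
    uintptr_t addr = (uintptr_t)s;
    unsigned misalign = (unsigned)(addr & 15);
    const char *p = (const char *)(addr - misalign);

    /* First chunk: compare all 16 bytes with zero, then drop the bits that
       belong to bytes located before the start of the string. */
    __m128i chunk = _mm_load_si128((const __m128i *)p);
    unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
    mask >>= misalign;
    if (mask != 0)
        return (size_t)__builtin_ctz(mask);

    /* Remaining chunks: the lowest set bit marks the first zero byte. */
    for (;;) {
        p += 16;
        chunk = _mm_load_si128((const __m128i *)p);
        mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
        if (mask != 0)
            return (size_t)(p - s) + (size_t)__builtin_ctz(mask);
    }
}
#else
/* Scalar fallback so the sketch still builds on targets without SSE2. */
size_t strlen_16(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
#endif
```

The 32-byte and 64-byte variants would follow the same pattern with wider registers; only the load width and the mask type change.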
What I can't work out is whether, on an AVX-512 CPU, all three functions must use AVX-512 instructions, or whether each one can just use the instructions of its own SIMD extension.
For example, for the first function, should I use `vmovdqa` or `vmovdqa16`?
And for the second function, should I use `vmovdqa` or `vmovdqa32`?
Why do instructions such as `vmovdqa16`, `vmovdqa32`, and so on exist when we can just use the AVX or AVX2 instructions?
Is it possible to use AVX and AVX2 instructions in an AVX-512 function, or must we use the AVX-512 versions of those instructions?