I am trying to sort 8 integers with AVX2 inline assembly in GCC. The code that computes each element's rank works, assuming there are no duplicate values. But when I use vpermd to reorder the array by those ranks, the result is not sorted. I have isolated the issue in this code:
#include <stdio.h>
#include <x86intrin.h>

int main() {
    __m256i C __attribute__ ((aligned (32)));
    int array[8] __attribute__ ((aligned (32))) = {40, 70, 60, 20, 0, 30, 50, 10};
    int ranks[8] __attribute__ ((aligned (32))) = {4, 7, 6, 2, 0, 3, 5, 1};
    __asm__ (
        "VMOVDQA %1,%%ymm1;"            // %1 is the first input operand (array)
        "VMOVDQA %2,%%ymm2;"            // %2 is the second input operand (ranks)
        "VPERMD %%ymm1,%%ymm2,%%ymm1;"
        "VMOVDQA %%ymm1,%0;"            // store the result in C
        : "=m" (C)
        : "m" (*(__m256i*)array),
          "m" (ranks)
        : "ymm1", "ymm2"                // registers overwritten by the asm
    );
    int* c = (int*)&C;
    for (int i = 0; i < 8; i++)
        printf("%d ", c[i]);
    // prints: 0 10 50 60 40 20 30 70
    return 0;
}
You can review the output of this program on Rextester. The output I am getting is:
0 10 50 60 40 20 30 70
As you can see, the values are not sorted; I expected 0 10 20 30 40 50 60 70. Have I misunderstood what vpermd does, or is my code wrong?
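
To make the semantics in question concrete, here is a minimal scalar sketch of what vpermd appears to compute. It assumes the documented behaviour that each result lane is the source element selected by the low 3 bits of the matching index lane. The names `permd_model` and `invert_perm` are hypothetical helpers made up for this illustration, not part of any library.

```c
#include <assert.h>

/* Hypothetical scalar model of vpermd (assumed from the documented
   behaviour): lane i of the result is the source element chosen by the
   low 3 bits of idx[i]. dst must not alias src or idx. */
void permd_model(int dst[8], const int src[8], const int idx[8]) {
    for (int i = 0; i < 8; i++)
        dst[i] = src[(unsigned)idx[i] & 7u];
}

/* Hypothetical helper: convert ranks (the slot each element should land
   in) into gather indices (the element each slot should read from).
   Assumes rank[] is a permutation of 0..7, i.e. no duplicate values. */
void invert_perm(int inv[8], const int rank[8]) {
    for (int i = 0; i < 8; i++) {
        assert(rank[i] >= 0 && rank[i] < 8);
        inv[rank[i]] = i;
    }
}
```

With the array and ranks above, `permd_model(out, array, ranks)` gives 0 10 50 60 40 20 30 70, which matches the output I observe. Passing the result of `invert_perm(inv, ranks)` as the index vector instead gives 0 10 20 30 40 50 60 70. If this model is right, vpermd gathers by index rather than scattering by rank.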