I am trying to use the SSE intrinsic _mm_load_si128. I am very new to SSE, so forgive me if I have made some silly mistakes somewhere.
Here is the code:
void one(__m128i *arr, char *temp)
{
    // SSE needs 16 byte alignment.
    _declspec (align(16)) __m128i *tmp = (__m128i*) temp;
    if (((uintptr_t)tmp & 15) == 0)
        printf("Aligned pointer");
    else
        printf("%d", ((uintptr_t)tmp & 15)); // This prints as 12
    arr[0] = _mm_load_si128(tmp);
}
I get this error in Visual Studio:
0xC0000005: Access violation reading location 0xFFFFFFFF.
0xFFFFFFFF does not look right. What am I doing wrong?
The arr argument is initialized as __m128i arr[5] = { 0 }.
An alternative would be to use _mm_loadu_si128, which works fine. As I understand it, though, it should produce a movdqu instruction, but this is the assembly generated:
arr[0] = _mm_loadu_si128(tmp);
00D347F1 mov eax,dword ptr [tmp]
00D347F4 movups xmm0,xmmword ptr [eax]
00D347F7 movaps xmmword ptr [ebp-100h],xmm0
00D347FE mov ecx,10h
00D34803 imul edx,ecx,0
00D34806 add edx,dword ptr [arr]
00D34809 movups xmm0,xmmword ptr [ebp-100h]
00D34810 movups xmmword ptr [edx],xmm0
Thanks guys. From the answers I realize I made a couple of mistakes:
Align the source; use _aligned_malloc.
Compile with optimizations.
Use C++ casts, not C casts.