
I'm using inline assembly and can't get this task to work: I need to sum all the elements of an array using MMX.

int32_t matr1[10] = { 3,10,100,1000,2,40,200,3}; // first matrix
int32_t matr2[6] = { 1,0,4,6, 3,7}; // second matrix
int32_t result[10] = {}; // result
...
__asm {
    lea esi, matr1
    //xor ecx,ecx
    movq mm0, [esi]
    add esi, 8
metka:
    paddw mm0, [esi]
    add ecx, 4
    add esi, 8
    cmp ecx, 8
    jb metka

    movq [result], mm0
    movzx eax, [result]
    movzx edx, [result + 2]
    add eax, edx
    movzx edx, [result + 4]
    add eax, edx
    movzx edx, [result + 6]
    add eax, edx
    mov [b], eax
}

I tried to adapt another piece of MMX code for use in C++. It loads the values into a register and then stores the register to the result array, so in theory it should add them up. But the sum comes out wrong.

Kris
  • Any reason why MMX in particular? Modern CPUs have much better SIMD instruction sets. Are you a time traveler from the late 90s? – Nate Eldredge May 29 '21 at 21:28
  • I study at the university – Kris May 29 '21 at 21:30
  • Welcome to SO. Your question lacks some important details. Please read [ask], then edit the question. Otherwise, it will get closed. – kebs May 29 '21 at 21:40
  • Your array is an array of `int32_t`, but the rest of your code seems oriented around adding 16-bit integers? – Nate Eldredge May 29 '21 at 21:55
  • The examples in [Why are loops always compiled into "do...while" style (tail jump)?](https://stackoverflow.com/q/47783926) show using SSE2 `paddd` to add int32_t elements. Doing 8 bytes at a time with `mm0` instead of 16 with `xmm0` is the same thing. After the loop, [Fastest way to do horizontal SSE vector sum (or other reduction)](https://stackoverflow.com/q/6996764) shows how to shuffle and add the 32-bit elements. (Only one step needed for an MMX register with 2x 32-bit instead of 4x. Or if you actually have word elements, then MMX pshufw / paddw / pshufw / paddw, like SSE2 pshufd) – Peter Cordes May 29 '21 at 21:59
