I have the following code in x64 Microsoft Macro Assembler (simplified example):
    .DATA
    First  BYTE -4, -3, -2, -1, 0, 1, 2, 3
    Second BYTE 1, 2, 3, 4, 5, 6, 7, 8

    .CODE
    MultiplyAndSum PROC
        ; move First and Second to vectors
        ; multiply corresponding elements
        ; sum the results
        ; return the sum
    MultiplyAndSum ENDP
In that procedure, I want to multiply the corresponding bytes of the two arrays using SIMD (it doesn't matter exactly which registers are used) and then sum the products. So in this case, I want to compute:
-4 * 1 + (-3) * 2 + ... + 3 * 8 = 24
and return 24.
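For reference, the scalar (non-SIMD) equivalent of what I am after would be roughly the following. This is only a sketch: the names MultiplyAndSumScalar and NextElement are made up for illustration, it assumes the bytes are signed and that the sum is returned in EAX as a signed 32-bit value, and I have hard-coded the element count of 8.

```asm
; Scalar reference only: computes the sum of First[i] * Second[i] for i = 0..7.
; MultiplyAndSumScalar and NextElement are placeholder names.
MultiplyAndSumScalar PROC
    xor     eax, eax                    ; eax = running sum
    xor     ecx, ecx                    ; rcx = element index (upper half zeroed too)
    lea     r8, First                   ; r8 = address of First
    lea     r9, Second                  ; r9 = address of Second
NextElement:
    movsx   edx, BYTE PTR [r8 + rcx]    ; sign-extend First[i] to 32 bits
    movsx   r10d, BYTE PTR [r9 + rcx]   ; sign-extend Second[i] to 32 bits
    imul    edx, r10d                   ; edx = First[i] * Second[i]
    add     eax, edx                    ; accumulate the product
    inc     rcx
    cmp     rcx, 8                      ; 8 elements in each array
    jb      NextElement
    ret                                 ; result in eax
MultiplyAndSumScalar ENDP
```

The SIMD version should produce the same value as this loop for the data above.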
Is this achievable using vector instructions?
From what I've seen, most multiplication instructions operate on WORDs or DWORDs. Is there therefore a way to split the multiplication into pieces and operate on, for example, WORDs instead of BYTEs?
The instructions pmaddwd, pmullw, or pmulhw seem of no use to me in this case. Are there any that I am missing?