The following code aims to divide each packed single-precision floating-point value in xmm0 by 4:
quarter dd 0.25
...
movups xmm1, [quarter]
mulps xmm0, xmm1
However, it does not do what I want, because movups loads a full 16 bytes starting at [quarter]. Only the lowest element holds 0.25; the other three hold whatever follows it in memory:
(gdb) p $xmm1
$2 = {v4_float = {0.25, 0.00200051093, 7.8472714e-44, 8.40779079e-45}
The obvious workaround would be to declare quarter as a four-element array. However, I am curious whether there is a preferred way to load only the first element and replicate it across the register. For instance:
movss xmm1, [quarter]
; some magic kung-fu
mulps xmm0, xmm1
Edit:
Thanks to the comments below, I ended up with shufps:
movss xmm1, [quarter]
shufps xmm1, xmm1, 0 ; broadcast the least significant element
mulps xmm0, xmm1
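For comparison, the four-element workaround mentioned above might look roughly like the sketch below. The label quarter4 and the routine name scale_by_quarter are hypothetical, and the section layout and 32-bit style addressing are assumptions rather than part of the original code:

```nasm
section .data
align 16                          ; mulps with a memory operand needs a 16-byte aligned address
quarter4: times 4 dd 0.25         ; hypothetical label: 0.25 repeated in all four lanes

section .text
global scale_by_quarter           ; hypothetical routine name
scale_by_quarter:
    mulps xmm0, [quarter4]        ; multiply every lane of xmm0 by 0.25, no separate load
    ret
```

This trades 12 extra bytes of data for one fewer register and two fewer instructions. The movss/shufps version keeps the constant at 4 bytes and has no alignment requirement.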