I read the manual description of those two operations but don't understand the difference yet. Can someone explain with an example how shufpd compares to pshufd?

Peter Cordes
stht55

1 Answer

  1. pshufd shuffles 32-bit elements. shufpd shuffles 64-bit elements.
  2. pshufd shuffles within a single source register. shufpd can merge-shuffle two registers: the low half of the result comes from the first operand and the high half from the second.
  3. pshufd has separate source and destination operands, so you can sometimes save a register-to-register copy.
  4. They can be used for the same task, but mixing integer and floating-point instructions (pshufd on floating-point data, or shufpd on integer data) may cause a bypass delay.

The pseudocode below is copied from the Intel docs and describes each operation. Read carefully, it makes the difference clear.

pshufd a, a, imm8
  DEFINE SELECT4(src, control) {
    CASE(control[1:0]) OF
    0:  tmp[31:0] := src[31:0]
    1:  tmp[31:0] := src[63:32]
    2:  tmp[31:0] := src[95:64]
    3:  tmp[31:0] := src[127:96]
    ESAC
    RETURN tmp[31:0]
  }
  dst[31:0] := SELECT4(a[127:0], imm8[1:0])
  dst[63:32] := SELECT4(a[127:0], imm8[3:2])
  dst[95:64] := SELECT4(a[127:0], imm8[5:4])
  dst[127:96] := SELECT4(a[127:0], imm8[7:6])
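To make the pseudocode above concrete, here is a minimal Python sketch of it. It is only a model, not the real instruction or intrinsic: the function name `pshufd` and the choice to represent a register as a list of four 32-bit lanes, lowest lane first, are assumptions made for illustration.

```python
def pshufd(src, imm8):
    """Model of pshufd: src is four 32-bit lanes, lowest lane first."""
    assert len(src) == 4 and 0 <= imm8 <= 0xFF
    # Output lane i is chosen by the 2-bit field at bits 2*i+1..2*i of imm8.
    return [src[(imm8 >> (2 * i)) & 0b11] for i in range(4)]


lanes = [10, 11, 12, 13]
print(pshufd(lanes, 0x1B))  # selectors 3,2,1,0: reverses the lanes
print(pshufd(lanes, 0x00))  # every selector is 0: broadcasts lane 0
print(pshufd(lanes, 0xE4))  # selectors 0,1,2,3: leaves the lanes unchanged
```

Each output lane can pick any of the four input lanes independently, which is why a single 8-bit immediate (four 2-bit selectors) is enough.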

shufpd a, b, imm8
  dst[63:0] := (imm8[0] == 0) ? a[63:0] : a[127:64]
  dst[127:64] := (imm8[1] == 0) ? b[63:0] : b[127:64]
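The two-line shufpd pseudocode above can be sketched the same way. Again this is a Python model under stated assumptions, not the real instruction: each register is a list of two 64-bit lanes, low lane first, the lane labels are placeholder strings, and the model simply ignores imm8 bits above bit 1.

```python
def shufpd(a, b, imm8):
    """Model of shufpd: a and b each hold two 64-bit lanes, low lane first."""
    assert len(a) == 2 and len(b) == 2
    low = a[imm8 & 1]          # imm8 bit 0 picks which lane of a goes low
    high = b[(imm8 >> 1) & 1]  # imm8 bit 1 picks which lane of b goes high
    return [low, high]


a = ["a0", "a1"]
b = ["b0", "b1"]
for imm8 in range(4):
    # Only four distinct results exist: one lane from a, one lane from b.
    print(imm8, shufpd(a, b, imm8))
```

The result always has one lane from `a` in the low half and one lane from `b` in the high half, so the immediate needs only two meaningful bits.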

Examples, with each register written as four 32-bit elements, lowest first:

a = [1, 1, 2, 2]
b = [3, 3, 4, 4]

shufpd a, b, 1 -> [2, 2, 3, 3]

You cannot do this with pshufd, which reads only one source register, but sometimes both can be used for the same task.
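The "cannot" claim is easy to check by brute force. The sketch below uses a Python model of pshufd (an assumption for illustration, with registers as lists of four 32-bit lanes, lowest first) and tries all 256 immediates on `a` alone.

```python
def pshufd(src, imm8):
    """Model of pshufd: src is four 32-bit lanes, lowest lane first."""
    return [src[(imm8 >> (2 * i)) & 0b11] for i in range(4)]


a = [1, 1, 2, 2]
target = [2, 2, 3, 3]  # what shufpd a, b, 1 produced above

# Every result pshufd can produce from a alone, over all immediates.
reachable = {tuple(pshufd(a, imm8)) for imm8 in range(256)}

print(tuple(target) in reachable)  # False: the 3s only exist in b
print(len(reachable))              # number of distinct results
```

Whatever the immediate, every output lane is a copy of some lane of `a`, so elements of `b` can never appear in the result.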

a = [1, 1, 2, 2]

pshufd a, a, 0x4e -> [2, 2, 1, 1]
shufpd a, a, 1 -> [2, 2, 1, 1]
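Where 0x4e comes from can be sketched as follows: to move a 64-bit half with pshufd, you select its two 32-bit lanes as a pair. The helper names `pshufd_imm8_for_qword_select` and `shufpd32` are hypothetical, and the functions are Python models (registers as lists of four 32-bit lanes, lowest first), not real intrinsics.

```python
def pshufd(src, imm8):
    """Model of pshufd on four 32-bit lanes, lowest lane first."""
    return [src[(imm8 >> (2 * i)) & 0b11] for i in range(4)]


def shufpd32(a, b, imm8):
    """Model of shufpd, viewing each 64-bit lane as a pair of 32-bit lanes."""
    low = a[2:4] if imm8 & 1 else a[0:2]
    high = b[2:4] if imm8 & 2 else b[0:2]
    return low + high


def pshufd_imm8_for_qword_select(lo, hi):
    """Hypothetical helper: build the pshufd imm8 that places 64-bit half
    `lo` of the source in the low half and half `hi` in the high half."""
    assert lo in (0, 1) and hi in (0, 1)
    selectors = [2 * lo, 2 * lo + 1, 2 * hi, 2 * hi + 1]
    imm8 = 0
    for i, sel in enumerate(selectors):
        imm8 |= sel << (2 * i)
    return imm8


print(hex(pshufd_imm8_for_qword_select(1, 0)))  # 0x4e: swap the two halves

a = [10, 11, 12, 13]
for lo in (0, 1):
    for hi in (0, 1):
        # Same register as both shufpd inputs: both models must agree.
        imm_d = pshufd_imm8_for_qword_select(lo, hi)
        assert pshufd(a, imm_d) == shufpd32(a, a, lo | (hi << 1))
```

This only covers the case where both shufpd inputs are the same register. With two different inputs there is no single pshufd equivalent.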
xiver77
    `pshufd` can copy-and-shuffle, sometimes saving a `movdqa` if you still need the old version. `shufpd`'s destination is read-write, and necessarily contains one element from it (as the low half). But yes there is a subset of use-cases where either one is usable. (Or `movhlps` or `unpckhpd`) – Peter Cordes Jun 12 '22 at 19:28