I tried searching for 0156 in Agner Fog's table. Some instructions aren't exactly what you asked for, but seem worth mentioning.
I know you wanted to exclude mov-type instructions, but movsx r32, r16/r8 is definitely not eliminated, and definitely runs on any of the p0156 integer ALU ports. The same goes for movsxd r64, r32. Only mov r32, r32, mov r64, r64, and movzx r32, r8 can be eliminated (0 latency, no unfused-domain uop). If you were ruling out movzx/sx because of possible mov-elimination, look again at movsx. It may be the only such instruction.
bextr r,r,r is listed as 2p0156, but it's probably actually p06 + p15 or something, implemented with something like shift (p06) + BZHI (p15) uops. That hypothesis can be tested by mixing it with some shifts or p15-only instructions and seeing whether throughput drops.
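To make that test concrete, here is a rough port-pressure model of what each hypothesis predicts. It's only a sketch under stated assumptions: the function name cycles_per_iter is made up for illustration, the loop body (one bextr plus two immediate shifts) is an arbitrary choice, and the model looks at execution-port pressure only, ignoring the front end, latency chains, and loop overhead.

```python
from fractions import Fraction
from itertools import combinations

# The four integer ALU ports discussed in this answer.
PORTS = ("p0", "p1", "p5", "p6")

ANY_ALU = frozenset(PORTS)       # a uop that can run on any of p0156
P06 = frozenset({"p0", "p6"})    # e.g. a shift uop
P15 = frozenset({"p1", "p5"})    # e.g. a BZHI-like uop


def cycles_per_iter(uops):
    """Port-pressure lower bound on cycles per loop iteration.

    uops is a list of frozensets, one per uop, naming the ports it may use.
    For every subset S of ports, the uops that can ONLY run inside S need
    at least count / len(S) cycles; the bound is the worst such subset.
    """
    bound = Fraction(0)
    for k in range(1, len(PORTS) + 1):
        for subset in combinations(PORTS, k):
            allowed = set(subset)
            confined = sum(1 for u in uops if u <= allowed)
            bound = max(bound, Fraction(confined, k))
    return bound


# Hypothetical test loop body: one bextr plus two shifts (each shift is p06).
shifts = [P06, P06]
table_says = [ANY_ALU, ANY_ALU] + shifts   # bextr really is 2p0156
hypothesis = [P06, P15] + shifts           # bextr is p06 + p15

print(cycles_per_iter(table_says))   # 1
print(cycles_per_iter(hypothesis))   # 3/2
```

If bextr really were 2 uops for any of p0156, the scheduler could move them to p1/p5 and the shifts would keep p06 to themselves. If one bextr uop needs p06, it competes with the shifts, and the measured cycles per iteration should move toward the larger number.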
xchg r64, r64 is 3 uops for p0156. Based on my reverse-engineering, I think each uop is a reg-reg mov that's not subject to mov-elimination, and actually needs an ALU port. One of the registers involved is an internal microcode-use-only register that's not architecturally visible, but does participate in register renaming. (I think we have other evidence that there are a few extra logical registers that don't have an x86 name, e.g. using up PRF entries.) But of course neither destination of the whole x86 instruction is write-only.
leave also has 2p0156 (possibly not using the stack engine).
salc is 3p0156 (set AL from carry: undocumented, and not available in 64-bit mode), but that's probably sbb same,same and a merging uop into RAX. So it's probably like lea r16, [m] or imul r16, r/m16, imm or movsx r16, m8, which also have a merging uop into an architecturally write-only destination.
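For anyone who hasn't met salc, here is a small model of the guessed decomposition: an 8-bit sbb same,same followed by a merge into the full register. This only illustrates the architectural result. The split into these two steps is the guess from above, not something the table confirms, and the function names are made up.

```python
def sbb_same_same_8(carry):
    """8-bit sbb x, x: computes x - x - CF, i.e. 0x00 if CF=0, 0xFF if CF=1."""
    return (0 - carry) & 0xFF


def salc(reg, carry):
    """Model of salc: AL = CF ? 0xFF : 0x00, upper bits of the register kept.

    reg is the full register value as a non-negative int.
    """
    al = sbb_same_same_8(carry)
    # The "merging uop": keep everything above the low byte, replace AL.
    return (reg & ~0xFF) | al


print(hex(salc(0x12345678, carry=1)))   # 0x123456ff
print(hex(salc(0x12345678, carry=0)))   # 0x12345600
```

The merge step is what lea r16, imul r16 and movsx r16 have in common with it: the instruction only writes part of the register, so the old upper bits have to be combined with the new low bits.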
movbe r64, m64 is listed as 2p0156 p23 on SKL. But movbe r32, m32 runs on p15 p23, so the 64-bit form probably has just one extra p0156 uop in there, or a p06 uop. bswap r64 is p15 p06, so we can be pretty sure that's what movbe uses. I assume movbe r64, m64 is really p15 p06 p23, i.e. load + bswap, but Agner didn't manage to pick that apart.
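The "load + bswap" reading is just what movbe does architecturally, which this short sketch checks against a direct big-endian load. The helper names are hypothetical and this says nothing about how the uops are actually split, only that the two-step decomposition gives the right result.

```python
import struct


def load_le64(mem, offset=0):
    """Ordinary x86 load: interpret 8 bytes as a little-endian integer."""
    return struct.unpack_from("<Q", mem, offset)[0]


def bswap64(value):
    """Reverse the byte order of a 64-bit value, like bswap r64."""
    return int.from_bytes(value.to_bytes(8, "little"), "big")


def movbe64(mem, offset=0):
    """movbe r64, m64 modelled as load followed by bswap."""
    return bswap64(load_le64(mem, offset))


mem = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])
print(hex(load_le64(mem)))   # 0x807060504030201
print(hex(movbe64(mem)))     # 0x102030405060708
```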
So other than movsx and movzx dst, r16, mostly this answer is debunking / ruling out possible p0156 instructions from Agner Fog's table.