I am trying to analyze my OpenCL kernel as compiled for an RDNA3 AMD GPU.
I use the Radeon GPU Analyzer for that.
When I load my OpenCL kernel in the analyzer, it displays the generated instructions as gfx1102 (RDNA3) assembly.
So far, so good.
I have difficulty interpreting the instruction names, though. I can look them up in the ISA documentation, but often, the full instruction name is not listed.
In my kernel's inner loop, I do multiply-adds on 16-bit floating-point values.
I see this translated into:
v_fmac_f16_e32 v?, v?, v?
That seems appropriate: as I understand it, the 'v' stands for vector, fmac for fused multiply-accumulate, and f16 for the 16-bit float operands.
But the ISA documentation does not describe the _e32 suffix.
What is the meaning of the _e32 suffix in RDNA3 assembly?