I am currently investigating some strange behaviour of the imul instruction, because the official Intel manual seems to differ slightly from reality.
The first thing I noticed is that the Intel manual does not consider this example to be a correct instruction:
imul rax, 2
yet both GCC/GAS (with .intel_syntax noprefix) and NASM accept this instruction without problem. Using objdump -d showed me this:
48 6b c0 02 imul $0x2,%rax,%rax
meaning it gets translated into a different instruction that is in fact documented in the manual.
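For reference, this is how I read those four bytes. It is only a minimal sketch in C; decode_modrm, rex_w and check_imul_imm8_bytes are helper names I made up for illustration, not part of any library or tool:

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a ModRM byte: mod (2 bits), reg (3 bits), rm (3 bits). */
struct modrm_fields {
    unsigned mod;
    unsigned reg;
    unsigned rm;
};

/* Split a ModRM byte into its three bit fields. */
struct modrm_fields decode_modrm(uint8_t byte)
{
    struct modrm_fields f;
    f.mod = (byte >> 6) & 0x3u;
    f.reg = (byte >> 3) & 0x7u;
    f.rm  = byte & 0x7u;
    return f;
}

/* Return 1 if the byte is a REX prefix (0100WRXB) with the W bit set, else 0. */
int rex_w(uint8_t byte)
{
    return (byte & 0xF0u) == 0x40u && (byte & 0x08u) != 0;
}

/* Walk through the four bytes objdump printed: 48 6b c0 02. */
void check_imul_imm8_bytes(void)
{
    const uint8_t code[4] = { 0x48, 0x6B, 0xC0, 0x02 };
    struct modrm_fields f = decode_modrm(code[2]);

    assert(rex_w(code[0]));   /* 48: REX.W, 64-bit operand size         */
    assert(code[1] == 0x6B);  /* 6b: opcode of IMUL r64, r/m64, imm8    */
    assert(f.mod == 3);       /* c0: mod = 11b, register-direct operand */
    assert(f.reg == 0);       /*     reg = 000b, destination rax        */
    assert(f.rm == 0);        /*     rm  = 000b, source rax             */
    assert(code[3] == 2);     /* 02: the 8-bit immediate                */
}
```

So the two-operand spelling ends up as the three-operand imm8 form with rax used as both destination and source.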
I already find this weird and would like to know why this even exists. The only places I could find this documented were the NASM instruction set and, weirdly enough, the Description of the imul instruction in the Intel manual. The latter reads:
- Two-operand form — With this form the destination operand (the first operand) is multiplied by the source operand (second operand). The destination operand is a general purpose register and the source operand is an immediate value, a general-purpose register, or a memory location. The intermediate product (twice the size of the input operand) is truncated and stored in the destination operand location.
That is inconsistent with the Opcode table of that same instruction.
The NASM instruction set also mentions imul reg64, sbytedword and imul reg64, imm instructions, neither of which I understand. imm would imply that 64-bit immediates could be used as well, would it not? And the meaning of sbytedword is unclear to me.
Now to the 32-bit immediates: the NASM instruction set mentions imul reg64, imm32 while both the Intel manual and the NASM set mention imul r64, r/m64, imm32. However, normally when an immediate with fewer bits than the destination operand is used, the Intel manual specifically mentions sign-extension in the description column of the Opcode table. In this case it is not mentioned, so I wondered what would happen if I used a negative 32-bit immediate (in other words, one requiring all 32 bits).
This is the assembly code I tested this with:
global imm_test
section .text
imm_test:
mov rax, rdi
imul rax, 0xFFDFFFFF
ret
Then I called the imm_test function from C:
#include <stdio.h>
int imm_test(int n);
int main() {
printf("%d\n", imm_test(1));
return 0;
}
If that 32-bit immediate were sign-extended, the value I would expect to be printed is -2097153, which, when using NASM to assemble and GCC to compile and link, is exactly what is printed.
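One caveat about this test: imm_test is declared to return int, so only the low 32 bits of rax reach printf, and those bits are 0xFFDFFFFF whether the immediate was sign-extended or zero-extended. The following is a rough sketch of that arithmetic in C, not a replacement for the test above; product64, low_half, high_half and the two macros are hypothetical names chosen for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* The imm32 field 0xFFDFFFFF widened to 64 bits in the two possible ways. */
#define IMM_SIGN_EXTENDED UINT64_C(0xFFFFFFFFFFDFFFFF)
#define IMM_ZERO_EXTENDED UINT64_C(0x00000000FFDFFFFF)

/* 64-bit product that wraps modulo 2^64, like the truncating imul forms. */
uint64_t product64(uint64_t n, uint64_t imm)
{
    return n * imm;
}

/* Low 32 bits: all that survives when the result is returned as int. */
uint32_t low_half(uint64_t v)
{
    return (uint32_t)(v & 0xFFFFFFFFu);
}

/* High 32 bits: the part that differs between the two extensions. */
uint32_t high_half(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Compare both interpretations for the argument used in the test (n = 1). */
void check_int_return_hides_extension(void)
{
    uint64_t s = product64(1, IMM_SIGN_EXTENDED);
    uint64_t z = product64(1, IMM_ZERO_EXTENDED);

    assert(low_half(s) == low_half(z));   /* identical as seen through int */
    assert(high_half(s) == 0xFFFFFFFFu);  /* sign-extended: upper half set */
    assert(high_half(z) == 0);            /* zero-extended: upper half 0   */
}
```

Declaring the function with a 64-bit parameter and return type (for example long on x86-64 Linux) and printing with %ld would make the two cases distinguishable in the printed value.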
And yet NASM gives me this warning:
test.asm:7: warning: signed dword immediate exceeds bounds [-w+number-overflow]
test.asm:7: warning: dword data exceeds bounds [-w+number-overflow]
However, looking at the disassembly again, the instruction is encoded exactly the way I would expect it to be:
48 69 c0 ff ff df ff imul $0xffffffffffdfffff,%rax,%rax
It's a 32-bit immediate sign-extended to 64-bit.
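To make the numbers concrete, here is a small C sketch of the sign-extension and of one possible reading of the warning. It assumes (I have not confirmed this against the NASM or GAS sources) that the assembler evaluates the literal 0xFFDFFFFF as the positive number 4292870143 before checking it against the signed 32-bit range. All function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reinterpret 32 raw bits as a signed two's-complement number, portably. */
int32_t bits_to_int32(uint32_t bits)
{
    if (bits <= (uint32_t)INT32_MAX)
        return (int32_t)bits;
    /* bits - 2^31 fits in int32_t; adding INT32_MIN yields bits - 2^32. */
    return (int32_t)(bits - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}

/* What the CPU does with the imm32 field of a 64-bit imul. */
int64_t sign_extend_imm32(uint32_t field)
{
    return (int64_t)bits_to_int32(field);
}

/* Assumed assembler-side check: does a 64-bit value fit a signed imm32? */
bool fits_signed_imm32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* The numbers from this question. */
void check_imm32_example(void)
{
    /* Bytes ff ff df ff (little endian) are the field 0xFFDFFFFF. */
    assert(sign_extend_imm32(0xFFDFFFFFu) == -2097153);
    /* Read as a positive 64-bit literal, 4292870143 is out of range. */
    assert(!fits_signed_imm32(INT64_C(0xFFDFFFFF)));
    /* The value the CPU actually multiplies by is in range. */
    assert(fits_signed_imm32(-2097153));
}
```

Under that assumption, writing the immediate as -2097153 (or -0x200001) should produce the same seven bytes without tripping the range check.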
When I change the syntax of the assembly code to GAS's .intel_syntax noprefix like so:
.intel_syntax noprefix
.global imm_test
.text
imm_test:
mov rax, rdi
imul rax, 0xFFDFFFFF
ret
and try to assemble this with the GNU assembler, I don't just get a warning, I get an error:
test.S: Assembler messages:
test.S:8: Error: operand type mismatch for `imul'
Changing the imul instruction to the properly documented imul rax, rax, 0xFFDFFFFF version does not change anything.
So I'm wondering: why is the documentation for imul so inconsistent, and why are 32-bit immediates officially supported (and work correctly), yet produce errors or warnings?