I am interested in understanding the difference between an instruction like add and one such as mul.

When writing x86 assembly with NASM, there is an add mnemonic which takes an immediate value as an "argument". There is no equivalent form for the multiply instruction mul. The only options for mul are a value read from a memory location or a value held in another register.

What is the reason for the difference between the two instructions?

I believe it could be due to two reasons:

  • A software reason: NASM implements the mnemonics for the two instructions differently. Technically, NASM translates text .asm files into binary .o files; it could implement one mnemonic per machine instruction, or it could leave some machine instructions unimplemented. It could also implement multiple forms of a mnemonic for each instruction (and arguably does, given that each instruction has several forms for the different combinations of valid operands). So possibly there is a mul instruction which takes an immediate value as an operand, but it isn't implemented in NASM.
  • A hardware reason (I consider this more likely): There is a physical add instruction which takes an immediate value as an operand. There is no such instruction for multiply.

I assume my second theory is more likely? (Or possibly there is another reason.)

My only theory about why the hardware implementations of the two machine instructions would differ is that multiply is a more complex instruction which possibly cannot be completed in a single clock cycle. But why would that mandate that the hardware implementation of each instruction be different? That I cannot explain. In my view, any immediate value could be "registered" in a non-programmable register.

FreelanceConsultant
  • [There is no `mul` taking an immediate operand](https://www.felixcloutier.com/x86/mul) – tkausl May 22 '22 at 18:09
  • What immediate do you want? Multiplication for lots of small immediates can be handled by `lea` or `shl`, or other instructions & instruction pairs. – Erik Eidt May 22 '22 at 18:11
  • I did read it. Your title is asking `Is there no instruction for multiply by an immediate value?` And your question states `I believe it could be due to two reasons:`, so I'm not sure why you're now saying __I know__. – tkausl May 22 '22 at 18:11
  • @tkausl Will edit the title – FreelanceConsultant May 22 '22 at 18:12
  • `imul r, r/m, imm` exists. It's much rarer to need a widening multiply. Basically a duplicate of [Is it possible to multiply by an immediate with mul in x86 Assembly?](https://stackoverflow.com/q/20499141) if that's what you're asking. – Peter Cordes May 22 '22 at 18:14
  • @ErikEidt Ah, does this mean that `mul immediate` would be an exact duplicate of `lea`? I am not sure what the relevance of `shl` is, unless this is specifically for multiply by 2? In other words, `lea blaa blaa` would be equivalent to a multiply instruction with an immediate? – FreelanceConsultant May 22 '22 at 18:15
  • @FreelanceConsultant Not exactly, but it can be used to implement some kinds of multiplication by immediates. – fuz May 22 '22 at 18:17
  • The relevance of LEA is tricks like [How to multiply a register by 37 using only 2 consecutive leal instructions in x86?](https://stackoverflow.com/q/46480579). Or for a single LEA, multiply by 2, 3, 4, 5, or 8, or 9. (e.g. `lea eax, [rdi + rdi*8]`) – Peter Cordes May 22 '22 at 18:17
  • @PeterCordes I guess this opens up a whole new bag of questions - why is there more than one multiply instruction? – FreelanceConsultant May 22 '22 at 18:17
  • 8086 legacy of mul/imul (different high half depending on signedness) being so slow in original 8086 that a few instructions before/after isn't a big deal. Then adding newer instructions for the common case of non-widening multiply in 186 (immediate) and 386 (`imul reg, r/m`). [problem in understanding mul & imul instructions of Assembly language](https://stackoverflow.com/q/1948058) covers a bit of that. – Peter Cordes May 22 '22 at 18:18
  • @FreelanceConsultant Different multiplication instructions do different things. Other instructions perform multiplications but aren't named multiplication instructions because they are meant for other things. – fuz May 22 '22 at 18:19
  • That's interesting, thank you both I will do some more searching – FreelanceConsultant May 22 '22 at 18:22
  • BTW, your guess about NASM translating source lines to multiple machine instructions or not could be checked with a disassembler. `nasm -felf64 foo.asm && objdump -drwC -Mintel foo.o`. (`ndisasm -b64` only works well on a flat binary). Unlike MIPS assemblers, NASM doesn't do pseudo-instructions, and x86 doesn't have any spare registers it could use for mov-immediate anyway. It only accepts source lines that can become a single machine instruction. (It does disentangle `lea eax, [rcx*9]` to `lea eax, [rcx + rcx*8]` for you, though.) – Peter Cordes May 22 '22 at 18:25
  • Anyway, I think phuclv's answer on [problem in understanding mul & imul instructions of Assembly language](https://stackoverflow.com/a/19783509) is probably the best duplicate that covers the whole situation of what's available. – Peter Cordes May 22 '22 at 18:38
  • I was asking specifically what immediate you want to use (because it matters)? Or maybe you want all of them within some range? – Erik Eidt May 23 '22 at 00:19
  • @ErikEidt Not sure I understand the question? Does this help: `mul 10` ? – FreelanceConsultant May 23 '22 at 08:19
  • Given a value in `rdi` use `lea rax, [rdi+rdi*4]` to multiply by 5, then `add rax, rax` doubles for 10. – Erik Eidt May 23 '22 at 14:24
  • @ErikEidt Is that likely to be faster than `lea rax, [rdi*10]` or `lea rax, [rdi+rdi*9]`? (Are both of those valid?) – FreelanceConsultant May 23 '22 at 15:18
  • Neither of those are valid: you'll have to wait for the hardware designers to implement that, so could take a long time, whereas the form I'm showing is already implemented, so will be faster. – Erik Eidt May 23 '22 at 15:21
  • @ErikEidt And it isn't possible to do this? `lea rax, [0 + rdi*10]` ? – FreelanceConsultant May 23 '22 at 15:23
  • Try it out and see. – Erik Eidt May 23 '22 at 15:24
  • @ErikEidt Next time I have access to the VM - I will do that – FreelanceConsultant May 23 '22 at 16:10
  • @ErikEidt: "wait for the hardware designers to implement that" is a weird way to explain the problem. That would take a redesign of x86-64 machine code! Indexed addressing modes have a 2-bit shift count, allowing multipliers of 1, 2, 4, or 8. It wouldn't make much sense to include an arbitrary multiplier in addressing modes; that would increase address-generation latency, and we already have imul-immediate as one separate instruction. Re: multiplying by 10, yes, `x*2*5` is good, as in [NASM Assembly convert input to integer?](https://stackoverflow.com/a/49548057) – Peter Cordes Jun 15 '22 at 01:23
  • @PeterCordes, yeah, perhaps my comment was a bit sardonic as the OP continues to ask things that are not currently (and never will be?) supported, but could hypothetically be done. My bad. – Erik Eidt Jun 15 '22 at 01:26
  • @ErikEidt: Ah right, I see the sarcasm now. But Intel syntax especially makes it non-obvious that it's actually a 2-bit shift count, not an integer multiplier. Of course, they could have just tried their `*10` idea in an online assembler like https://godbolt.org/ or https://defuse.ca/online-x86-assembler.htm – Peter Cordes Jun 15 '22 at 01:30
  • I'm pretty sure that I did do this – FreelanceConsultant Jun 15 '22 at 10:22
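
The multiply-by-10 suggestion from the comments can be checked arithmetically. The following is only a rough C sketch of what the instructions compute, not assembly, and the function names (`lea_base_index`, `times10_lea_add`, `times10_imul`) are made up for illustration. It assumes 64-bit unsigned wraparound as a stand-in for the low 64 bits of a register result.

```c
#include <stdint.h>

/* Models the address arithmetic of `lea dst, [base + index*scale]`.
   The scale is encoded as a 2-bit shift count, so only 1, 2, 4 or 8
   are possible; a scale of 10 cannot be encoded. */
static uint64_t lea_base_index(uint64_t base, uint64_t index, unsigned shift)
{
    return base + (index << (shift & 3u));
}

/* x * 10 computed as (x * 5) * 2, mirroring the two-instruction sequence. */
static uint64_t times10_lea_add(uint64_t x)
{
    uint64_t r = lea_base_index(x, x, 2); /* lea rax, [rdi + rdi*4] : x*5  */
    r += r;                               /* add rax, rax           : x*10 */
    return r;
}

/* Models the non-widening `imul rax, rdi, 10`: low 64 bits of the product. */
static uint64_t times10_imul(uint64_t x)
{
    return x * 10u;
}
```

Calling `times10_lea_add(7)` and `times10_imul(7)` should both give 70, and the two functions agree for every input because both keep only the low 64 bits of the result.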