What does LEA do internally?

Question

This question says that the LEA instruction can be used to do arithmetic.

As I understand it, instead of:

inc eax
mov ecx, 5
mul ecx
mov ebx, eax

I can simply write

lea ebx, [(eax + 1) * 5]

The LEA version is a single instruction rather than four. However, I don't understand how LEA does the arithmetic without inc and mul.

Is the LEA version faster in any way, or does LEA simply run the original instructions in the background?

EDIT: This question is not about the purpose of LEA, but rather how it works internally.

`lea` does not use `mul` internally, because all multiplications in it can be done with a simple shift. Your "simple" way is technically `lea ebx, [eax + eax*4 + 5]`. — Jongware, Aug 28 '16 at 11:28
It does the exact same thing as `&array[i].field` does in a C program. In other words, generate a pointer to an array element whose element type is a struct, common when you need to pass it as an argument to a function for example. The x86 instruction set was formulated after researching the machine code produced by language compilers, first processor that considered that most code is written in a higher level language rather than assembly. Also produced the ebp register and quirky instructions like enter and leave. The instruction is implemented by the AGU, separate circuit from the ALU. — Hans Passant, Aug 28 '16 at 14:13
See: http://stackoverflow.com/questions/1699748/what-is-the-difference-between-mov-and-lea and http://stackoverflow.com/questions/6323027/lea-or-add-instruction. — Cody Gray - on strike, Aug 28 '16 at 14:29
Yes, LEA is much faster than 4 instructions. See [Agner Fog's insn tables](http://agner.org/optimize/), and other perf links in the [x86 tag wiki](http://stackoverflow.com/tags/x86/info). — Peter Cordes, Aug 28 '16 at 20:43
Possible duplicate of [What's the purpose of the LEA instruction?](http://stackoverflow.com/questions/1658294/whats-the-purpose-of-the-lea-instruction) — Johan, Aug 28 '16 at 23:57
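
The `&array[i].field` comparison made in the comments above can be sketched in C. This is only an illustration: the `struct pair` type and both helper functions are hypothetical names, not anything from the question. It shows that the address of a field in an array element is a base, plus an index times the element size, plus a fixed offset, which is the kind of sum LEA produces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte element: two 32-bit fields. */
struct pair {
    int32_t key;    /* offset 0 */
    int32_t value;  /* offset 4 */
};

/* The sketch assumes there is no padding, so the element size is 8. */
static_assert(sizeof(struct pair) == 8, "expected an 8-byte element");

/* Address of array[i].value as the compiler computes it. */
static uintptr_t field_addr(const struct pair *array, size_t i)
{
    return (uintptr_t)&array[i].value;
}

/* The same address spelled out as base + index*scale + displacement. */
static uintptr_t field_addr_manual(const struct pair *array, size_t i)
{
    return (uintptr_t)array
         + i * sizeof(struct pair)          /* index * 8 */
         + offsetof(struct pair, value);    /* + 4 */
}
```

With an 8-byte element and the field at offset 4, the manual version has the shape base + index*8 + 4, which fits a single addressing-mode expression.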

NPE · Accepted Answer · 2016-08-28T11:33:55.613

6

I don't understand how LEA does the arithmetic without INC and MUL.

The key is that LEA can only do certain types of arithmetic: the sort that's useful in computing memory addresses for things like array access.

Think of the CPU as having specialized circuitry for this sort of address computation, and LEA as the instruction for using that circuitry.

You can read up on the x86 addressing modes in Wikipedia. It shows the types of computations you can perform with LEA.

In a nutshell, it's a register multiplied by 1/2/4/8, plus a register, plus a constant. The only reason you can compute (eax + 1) * 5 with LEA is that it's equivalent to eax * 4 + eax + 5. If you were instead trying to compute (eax + 1) * 6, a single LEA would not be of any help, while the generic MUL and ADD would still work just fine.
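
As a rough sketch of the form described above, the C below models base + index*scale + constant with 32-bit wraparound. The names `lea_model` and `times5_plus5` are made up for illustration. This models the arithmetic only and says nothing about how any particular CPU implements it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the effective-address form
   base + index*scale + disp, wrapping at 32 bits.
   scale is limited to 1, 2, 4 or 8, so the multiply is only a shift. */
static uint32_t lea_model(uint32_t base, uint32_t index,
                          uint32_t scale, int32_t disp)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint32_t)disp;
}

/* (eax + 1) * 5 rewritten as eax + eax*4 + 5,
   i.e. lea ebx, [eax + eax*4 + 5] */
static uint32_t times5_plus5(uint32_t eax)
{
    return lea_model(eax, eax, 4, 5);
}
```

For example, `times5_plus5(3)` gives 20, the same as (3 + 1) * 5. A factor of 6 has no such rewrite, because 6*eax would need a scale of 5 or 6 on the index register.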

edited Aug 28 '16 at 11:33

answered Aug 28 '16 at 11:26

NPE


2

Modern Intel CPUs actually do not have specialized circuitry for these address computations. The ALU executes all of the arithmetic required for LEA. You have to go back to either old Intel chips (Pentium III and earlier), or AMD K8/K10, in order for LEA to actually run on the AGU (address generation unit). – Cody Gray - on strike Aug 28 '16 at 14:37
2

@CodyGray: in-order Atom (pre-Silvermont) runs LEA on the actual AGU, earlier in the pipeline than the ALUs. So it needs its inputs ready earlier, but produces outputs sooner, as far as latency is concerned. But note that modern CPUs *do* have dedicated AGUs to handle load and store-address uops; they just don't use them for LEA. I think that's what you meant to say, but your first sentence came out wrong. – Peter Cordes Aug 28 '16 at 20:42
