What is the difference between "lea eax, [ebx + eax]" and "add eax, ebx" in x86-32 assembly?

Question

GCC generated some assembly code for me, and it contains this instruction:

lea eax, [ebx+eax]

(Intel syntax.) Just curious: how does that differ from the following?

add eax, ebx

eax and ebx contain return values from functions :)

mov eax, DWORD PTR [ebp+8]
mov DWORD PTR [esp], eax
call CALC1
mov ebx, eax
mov eax, DWORD PTR [ebp+8]
mov DWORD PTR [esp], eax
call CALC2
lea eax, [ebx+eax]
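
For context, a C function along the following lines would plausibly compile to the sequence above. This is only a sketch: the bodies of CALC1 and CALC2 are not shown in the question, so the ones here are placeholders, and the names combine and combine_demo are made up for illustration.

```c
#include <assert.h>

/* Placeholder bodies: the real CALC1 and CALC2 are not shown in the question. */
int CALC1(int x) { return x * 2; }
int CALC2(int x) { return x + 1; }

/* Both calls receive the same argument (loaded from [ebp+8]). The first
   result has to survive the second call, which is why the compiler keeps
   it in a callee-saved register such as ebx. */
int combine(int x) {
    int first = CALC1(x);
    int second = CALC2(x);
    return first + second;  /* the step done by lea eax, [ebx+eax] */
}

/* Small usage check with the placeholder bodies above. */
void combine_demo(void) {
    assert(combine(5) == 16);  /* 5*2 + (5+1) */
}
```

Whether a compiler picks lea or add for the final sum depends on the compiler version and options, so the exact output may differ.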

Possible duplicate of [LEA or ADD instruction?](http://stackoverflow.com/questions/6323027/lea-or-add-instruction) — Ciro Santilli OurBigBook.com, Nov 18 '15 at 11:12
@CiroSantilli包子露宪六四事件法轮功: Unfortunately the answers to that question aren't very accurate. The highest voted / accepted answer claims that `lea` runs on the AGU. In-order Atom works that way, (and needs its inputs ready earlier but produces outputs earlier in the pipeline, too, or something like that), but out-of-order CPUs including Silvermont aren't like that. I guess I should just answer that question. — Peter Cordes, Apr 16 '18 at 05:50
@PeterCordes Thanks for the info. I recommend that you answer that question if none of the answers is correct, comment on wrong answers (looks like you did already), and comment on question in the hope that OP will change the accept (unlikely). I don't think it should affect the duplicate direction in this case: that one just has way more upvotes. This is also a great opportunity for you to get rep due to the "create justice" effect :-) — Ciro Santilli OurBigBook.com, Apr 16 '18 at 09:02

NPE · Answer 1 · 2010-11-30T17:38:18.247

11

One difference that immediately springs to mind is that lea doesn't affect the flags, whereas add does.

Without seeing the rest of the assembly code, it is impossible to say whether this is of any relevance here. It could simply be an artifact of GCC's code generator (i.e. it could be producing code for a more general case, or just using lea as a more flexible add).
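
To make the flag difference concrete, here is a minimal C sketch that models the two instructions. It is a toy model, not real hardware: the Flags struct and the names model_add, model_lea and flags_demo are made up for illustration, and only the carry and zero flags are tracked (a real x86 add also updates SF, OF, AF and PF).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified flag state: carry and zero only. */
typedef struct {
    bool cf;  /* carry flag */
    bool zf;  /* zero flag */
} Flags;

/* Models `add dst, src`: returns the 32-bit sum and updates the flags. */
uint32_t model_add(uint32_t dst, uint32_t src, Flags *f) {
    uint32_t sum = dst + src;  /* wraps modulo 2^32 */
    f->cf = sum < dst;         /* unsigned wrap-around sets CF */
    f->zf = (sum == 0);
    return sum;
}

/* Models `lea dst, [base + index]`: same sum, flags left untouched. */
uint32_t model_lea(uint32_t base, uint32_t index, const Flags *f) {
    (void)f;  /* taken as const only to show that nothing is written */
    return base + index;
}

void flags_demo(void) {
    Flags f = { false, false };
    uint32_t r = model_lea(0xFFFFFFFFu, 1u, &f);
    assert(r == 0u && !f.cf && !f.zf);  /* sum wrapped, flags unchanged */
    r = model_add(0xFFFFFFFFu, 1u, &f);
    assert(r == 0u && f.cf && f.zf);    /* same sum, CF and ZF now set */
}
```

The practical consequence is that a lea can sit between a flag-setting instruction and the conditional jump that consumes those flags, where an add could not.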

edited Nov 30 '10 at 17:38

answered Nov 30 '10 at 17:24

NPE

Sedat Kapanoglu · Accepted Answer · 2010-11-30T17:46:16.597

6

You can put the result into a register other than EAX, such as lea edx, [eax + ebx]. add cannot do that, because its destination is always one of its source operands.

lea can also take a third term in the address expression, such as lea eax, [ebp + esi + 12], which makes it a handier alternative to the add instruction.

You can also fold in a multiplication of one register by a scale factor of 2, 4, or 8, such as lea eax, [ebp + eax * 8].

Not to mention that it describes the intent better :)
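
As a rough illustration, each C expression below has the shape of one of the lea forms mentioned above. The function names are hypothetical, and the lea shown in each comment is only what an optimizing compiler might emit; the actual instruction choice and registers are up to the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Shape of lea edx, [eax + ebx]: the sum goes to a third register,
   so neither input has to be overwritten. */
uint32_t sum_two(uint32_t a, uint32_t b) {
    return a + b;
}

/* Shape of lea eax, [ebp + esi + 12]: base + index + constant displacement. */
uint32_t sum_with_offset(uint32_t base, uint32_t index) {
    return base + index + 12u;
}

/* Shape of lea eax, [ebp + eax*8]: the index is scaled by 8
   (the hardware allows scale factors of 1, 2, 4 or 8). */
uint32_t sum_scaled(uint32_t base, uint32_t index) {
    return base + index * 8u;
}

void lea_forms_demo(void) {
    assert(sum_two(3u, 4u) == 7u);
    assert(sum_with_offset(100u, 20u) == 132u);
    assert(sum_scaled(100u, 5u) == 140u);
}
```

Doing the same work with add alone would typically need a mov to preserve an input, plus a shift for the scaled form.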

edited Nov 30 '10 at 17:46

answered Nov 30 '10 at 17:41

Sedat Kapanoglu

Stephen Canon · Answer 3 · 2015-11-18T17:07:06.570

5

In terms of the numeric result, there is no difference.

However, there is more to an instruction than the actual result that is stored in the destination register:

1. As aix (NPE) pointed out, lea does not set the flags based on the result of the addition. This is occasionally useful for instruction scheduling purposes.
2. There are also timing differences on some micro-architectures (early Atom cores); specifically, there are stalls involved in forwarding results between the arithmetic and address-generation units, and using either add or lea depending on context can eliminate these (very small) stalls.

edited Nov 18 '15 at 17:07

answered Nov 30 '10 at 17:42

Stephen Canon

In some older Intel processors, lea can be MUCH slower than add. Not sure about the current crop, but suspect that's probably no longer the case. – Brian Knoblauch Dec 01 '10 at 12:57
True; in AMD's Athlon and later, it might be beneficial - it's in fact a way to eke out another parallel op since `LEA` is handled by the addressing engine, not the ALUs. – FrankH. Dec 01 '10 at 14:58
@Brian: Yes, both are single-cycle operations on all current Intel processors that I am aware of (though on some processors it is possible to issue *more* `add` instructions at once). – Stephen Canon Dec 01 '10 at 17:05
@FrankH: `LEA` runs on the ALUs in Intel (Haswell: p1/p5 for simple, p1 only for 3-component or rip-relative. p0 only for PII/PIII) and AMD CPUs (EX0/1 in Bulldozer-family). I even checked P4, and it runs `LEA` on the same ALU ports as `ADD`. Via Nano2/3000 runs `LEA` on its store-address port, though, vs. `ADD` on I1/I2 integer ALUs. [Agner Fog's table](http://agner.org/optimize/) says LEA has "extra latency to other ports", which sounds like what Stephen said about Atom (which also runs LEA on AGU1, rather than ALU0/1). Silvermont is like other OOO chips, though; LEA on ALUs. – Peter Cordes Nov 18 '15 at 11:33
@PeterCordes: right, what I wrote in the answer is only true of early Atom cores (Bonnell/Saltwell). Silvermont is much more modern. – Stephen Canon Nov 18 '15 at 13:34
Yup, of course. I wasn't criticizing your answer, just FrankH's response (and answer on http://stackoverflow.com/questions/6323027/lea-or-add-instruction). Atom and Silvermont are very different architectures. Fun fact, though: K8 and K10 run (complex) LEA in their AGUs. Your point 2 helped make some sense of those notes in Agner Fog's table about taking longer to forward results between AGU and ALU. (I haven't even read the Atom section in his microarch pdf, though, and it's probably mentioned there.) – Peter Cordes Nov 18 '15 at 14:04
