I am new to assembly language. I am currently going through this Assembly Guide. I have a doubt about the LEA instruction. My understanding is that LEA loads the destination operand with the effective address of the source operand.

The examples below are from the same link.

lea edi, [ebx+4*esi] — the quantity EBX+4*ESI is placed in EDI.
lea eax, [var] — the value in var is placed in EAX.
lea eax, [val] — the value val is placed in EAX.

In the 2nd and 3rd examples above, the comment says the value is loaded into EAX. This is my confusion. Kindly let me know whether the LEA instruction can be used to load the effective address as well as the value into the destination operand.

[Image: Introduction to assembly language]

The above image is from Introduction to assembly language on YouTube's Open SecurityTraining channel.

Thank You.

Michael Petch
Kanan Jarrus
  • 2nd and 3rd are wrong. The address is placed into `eax`. When in doubt, consult the official reference (google for Intel Manual 2) – Margaret Bloom Feb 17 '18 at 15:24
  • The 2nd part of the question, added in an update, is an exact duplicate of https://stackoverflow.com/questions/46597055/address-computation-instruction-leaq/46597375#46597375: yes, LEA can be used to shift/add any values, even if the value isn't an address. The first part: resolving confusion caused by a wording error in an example in a guide / tutorial, is a valid question. Confusion caused by wrong guides / tutorials makes an otherwise-too-trivial question valid, IMO. – Peter Cordes Feb 17 '18 at 16:31

2 Answers

LEA is an ALU shift+add instruction that uses addressing-mode syntax and machine-code encoding to take advantage of hardware support for decoding this kind of operation. (It doesn't care if the inputs are addresses or not, so the 2nd part of your question is answered there: lea eax, [ecx + eax*2] implements x*2 + y efficiently.)

Under no circumstances does lea ever load or store from memory. See Intel's instruction-manual entry for it.

Fun fact: the only exception it can take is if the ModR/M byte encodes a register source instead of memory. e.g. lea eax, eax would fault with #UD (UnDefined instruction) if your assembler didn't refuse to assemble it, or if you manually encoded it with a db pseudo-instruction. This is not something to worry about in practice if you're writing in asm, not in hex machine code. But there are no data-dependent exceptions it can take; it doesn't care at all what values it's operating on.


In the 3rd one, I think they're talking about something like val equ 4 or val = 4, so "the value of val" is an assemble-time constant, not stored in memory.

Yes, you can use LEA for that (or any 32-bit constant), but mov eax, val is shorter and more efficient. Using an LEA with an absolute disp32 addressing mode is pointless.

Fun fact: MASM ignores [] around assemble-time constants: mov eax, [val] is a mov eax, imm32, the same as mov eax, val. Ross Ridge wrote a nice answer on Confusing brackets in MASM32.


lea eax, [var] — the value in var is placed in EAX.

The comment is wrong. The address of var is placed in EAX. In normal asm terminology, the value in a symbol name means the value stored in memory at that address.

mov eax, OFFSET var is more efficient and shorter than lea eax, [var]. Some people like to write lea because of the "semantic" meaning for human readers: taking the address. But if you're writing in assembly, human readability should come after efficiency, and only win as a tie-breaker, e.g. choosing esi for a source pointer when the choice makes no other difference. (Format / indent your code nicely, and comment it well.)

lea eax, [var + edi] would make sense; you can't do that with mov. But you can do mov eax, OFFSET var + 1234, because assemblers + linkers have to be able to support filling in 32-bit symbol+offset values for addressing modes like [var + 1234].

In 64-bit mode, lea rax, [rel var] makes sense in position-independent code: you can't get a RIP-relative address with mov.

Peter Cordes
  • Thank you very much for the info on MASM ignoring the brackets. I have added an image. My doubt is again with the LEA instruction, since LEA does not load or store from memory. The instructions before the LEA in the example load addresses into eax and ecx, and then the LEA performs the addition and multiplication and stores the result in eax, which is still confusing me. – Kanan Jarrus Feb 17 '18 at 16:16
  • @vanquish: LEA is a shift-and-add instruction. It's how the compiler (correctly) chooses to implement `2*x + y`. The operands aren't addresses in this case; read the link in my first paragraph. It's really an unrelated question to your initial question about a misleading guide / tutorial, and is a duplicate of that link. – Peter Cordes Feb 17 '18 at 16:21
  • I like to add/object that `LEA` has its unique use in referencing local variables. In that case it cannot be replaced by `MOV reg, offset var`. So, as a rule of thumb, `offset var` is for compile/assemble time addressing and `LEA reg, var` is for RunTime addressing. – zx485 Feb 17 '18 at 16:43
  • @zx485: I'm used to NASM syntax, where you'd write that as `lea rax, [rsp+16]` or `[rbp-8]` or something, maybe using an assemble-time symbolic constant for the `16`. So yeah, that makes sense if you're using MASM that way to hide the difference between static and local variables. – Peter Cordes Feb 17 '18 at 16:56
  • Yes, you're right in that I prefer MASM syntax. But I don't "hide" the difference between locals and globals, I use `MOV reg, offset var` for globals and `LEA reg, var` aka `LEA reg, (D|Q)WORD PTR [EBP + k]` for locals. – zx485 Feb 17 '18 at 17:01
  • @zx485: It does hide the difference syntactically, but I guess if you're used to `var` maybe expanding to `[ebp+k]` then it doesn't look that weird. If you're hand-writing a function in asm, it makes sense to look for optimizations like reusing the same stack slot for spilling different things at different times, if you need to spill at all. And tying up `ebp` as a frame pointer in 32-bit mode is pretty costly, leaving only 6 GP registers. (But without a fixed EBP, the offset to the same spot in the stack will change with push/pop). With push/pop + ESP addressing, you get stack-sync uops :/ – Peter Cordes Feb 17 '18 at 17:08

LEA is literally "Load Effective Address." It computes an address first - called "effective" because it may be composed from several parts and the result is what effectively ends up in the destination - and loads that address somewhere.
While [var] uses an addressing mode, lea merely computes the address and stores it in the destined location; it doesn't use it to load something from main memory. "The value in var" is wrong; "the value var" is more on point, if "var" refers to the address of a label, not a stored value.

Since the source explicitly says that "[...] the contents of the memory location are not loaded, only the effective address is computed and placed into the register," I think this is just a misleading typo, not misinformation.

cadaniluk
  • *`lea` merely computes the address and stores it in the destined location* So are the instructions below the same, if var was declared as `var dw 4`? `lea eax, var` `mov eax, var` – Kanan Jarrus Feb 17 '18 at 15:47
  • `lea eax, var` is the same as `mov eax, offset var`. For memory operands like `var`, brackets are not necessary, so `lea eax, [var]` is the same as `lea eax, var`, IIRC. What you can do with `lea`, for example, is `lea eax, [ebx+esi*4]` to load `ebx+esi*4` into `eax`. To do that otherwise, you'd need something like `mov eax, esi; shl eax, 2; add eax, ebx`. – cadaniluk Feb 17 '18 at 15:53
  • Thank you very much for this example, `lea eax, [ebx+esi*4]`. So `ebx` and `esi` could also contain an address from memory, right? – Kanan Jarrus Feb 17 '18 at 16:21
  • Yes, they may contain an address or whatever. The point is that the address `ebx+esi*4` is calculated, but instead of accessing RAM afterwards, it is simply stored in `eax`. – cadaniluk Feb 17 '18 at 16:22