4

Recently I've been trying to learn reverse-engineering. As such, I've been delving into a lot of assembly code. I'm mystified by the following:

movq    %rax,0xf8(%rbp)
movq    0xf8(%rbp),%rax

I've seen this several times. Is it not redundant? Why does the compiler do this? The binary I'm looking at was compiled with gcc.

Jarsen

1 Answer

9

The binary was probably compiled without optimization (no -O flag). What you're seeing is a direct, naive translation of the intermediate representation. Snippets like this usually come from a value being stored in a local variable, in this case the stack slot 0xf8(%rbp). The value is then used immediately afterwards, so the compiler loads it from that slot back into a register, %rax. An optimizer will spot that storing from %rax only to reload into the very same register is redundant, and will remove the reload (and the store too, if nothing else reads the slot). If all other optimization stages fail, at the very least a peephole pass will spot these two instructions being consecutive.

If optimization really was turned on, then this is indeed odd, but it might be explained if you post a larger (but not excessively large) sequence. There are still plenty of cases where something sub-optimal gets generated, but nothing as blatant as that.
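To make the peephole idea concrete, here is a minimal sketch in Python of such a pass run over AT&T-syntax text. It is not how gcc implements its peephole optimizations; the function names `parse_movq` and `drop_redundant_reloads` are made up for illustration, and the parser only understands simple `movq SRC,DST` lines whose operands contain no commas.

```python
import re

# Matches a simple AT&T-syntax "movq SRC,DST" line.
# Assumption: operands contain no commas (no scaled-index addressing).
MOVQ = re.compile(r"^\s*movq\s+(\S+?)\s*,\s*(\S+)\s*$")


def parse_movq(line):
    """Return (src, dst) for a simple movq line, or None for anything else."""
    m = MOVQ.match(line)
    return (m.group(1), m.group(2)) if m else None


def drop_redundant_reloads(lines):
    """Toy peephole pass: drop a reload that directly follows the matching store.

    Only the reload is removed. The store is kept, because this sketch does
    not know whether the memory slot is read again later.
    """
    out = []
    for line in lines:
        cur = parse_movq(line)
        prev = parse_movq(out[-1]) if out else None
        # prev is "movq A,B" and cur is "movq B,A" where A is a register:
        # A already holds the value, so the second instruction does nothing.
        if cur and prev and cur == (prev[1], prev[0]) and cur[1].startswith("%"):
            continue
        out.append(line)
    return out


if __name__ == "__main__":
    listing = [
        "movq    %rax,0xf8(%rbp)",
        "movq    0xf8(%rbp),%rax",
    ]
    for insn in drop_redundant_reloads(listing):
        print(insn)
    # prints: movq    %rax,0xf8(%rbp)
```

A real compiler has to be more careful than this: if the slot is volatile or refers to a memory-mapped device, the reload is not redundant and must stay.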

John Ripley
  • 4
    That assembly listing should be very easy to generate even with optimization enabled, just by declaring the local variable with the **volatile** qualifier. This is not something that you should find in real-life code, though... – LocoDelAssembly Feb 22 '11 at 06:46
  • 2
    Although in this case this is probably the correct answer, one must note that redundant ops are also used to realign the code address to help the processing pipeline; this is pretty common just before 'intensive' loops. – Necrolis Feb 22 '11 at 10:45
  • 2
    It's pretty rare (and not the cause in your case), but such a sequence can be required when dealing with memory-mapped devices. – Brian Knoblauch Feb 22 '11 at 20:25
  • 1
    @iManBiglari: `NOP` is only a single byte; it's faster for instruction fetch-and-decode to handle one instruction that spans the entire realignment than to decode one byte at a time (AMD has a listing somewhere of best-case no-op instructions to use, from 2 to 15 bytes in length). – Necrolis Jan 14 '15 at 16:00