I was reading the book Programming from the Ground Up by Jonathan Bartlett to learn i386 assembly on Linux.

My goal was to read the source code of some projects written in asm. Then I ran into LODSL. From the manual I learned that it loads data from the address %esi points to, and afterwards increments that address by the operand size.

So why can't people just use movl to do that? Is there any speed improvement, or some other issue I haven't considered?

Evan Carroll
jyf1987
  • _"so why cant people just use movl to do that?"_ They could. But if there's a way to do the same thing with just one instruction (and it's not going to have any significant negative impact on performance), then why not use that instead? – Michael Oct 21 '16 at 08:31
  • x86 is considered a CISC instruction set; there are a whole lot of instructions that could be removed (e.g. SUB isn't needed). Still, it's good to have them. It would be interesting to compare the microcode for it – Tommylee2k Oct 21 '16 at 08:35
  • Not only could people use the mov/add sequence, they should do so, as the sequence is faster than lodsl. That said, the difference between the two is that `add` sets the flags whereas `lods` doesn't. `lods` also behaves differently depending on the direction flag, whereas mov/add always does the same thing. – fuz Oct 21 '16 at 09:28
  • @Tommylee2k: LODSD isn't actually microcoded, but it decodes to 3 uops (or 2 on Haswell and later). Only instructions that decode to more than 4 uops have to turn on the microcode sequencer instead of just being decoded directly, on Intel CPUs. ([And yes, this matters for performance, potentially a lot](http://stackoverflow.com/questions/26907523/branch-alignment-for-loops-involving-micro-coded-instructions-on-intel-snb-famil)) – Peter Cordes Oct 21 '16 at 10:00

1 Answer

> So why can't people just use movl to do that?

Code size, and the fact that ADD modifies flags (although you can avoid that by using LEA for the pointer increment).
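To make the comparison concrete, here is a rough C model of the architectural effect of the three alternatives. It is only a sketch: the `struct cpu` type, its fields, and the `model_*` function names are hypothetical, and `flags_written` is a stand-in for "this instruction rewrote the arithmetic flags". The byte counts in the comments are for 32-bit mode.

```c
#include <stdint.h>

/* Hypothetical, simplified machine state: only what these instructions touch. */
struct cpu {
    const uint32_t *esi;   /* source pointer (%esi) */
    uint32_t eax;          /* destination (%eax) */
    int df;                /* direction flag: 0 = ascending, 1 = descending */
    int flags_written;     /* set when an instruction rewrites the arithmetic flags */
};

/* lodsl: encoded as the single byte 0xAD. */
void model_lodsl(struct cpu *c)
{
    c->eax = *c->esi;
    c->esi += c->df ? -1 : 1;   /* one dword (4 bytes), direction chosen by DF */
}

/* movl (%esi), %eax ; addl $4, %esi: 2 + 3 bytes (8B 06, 83 C6 04). */
void model_mov_add(struct cpu *c)
{
    c->eax = *c->esi;
    c->esi += 1;                /* always forward, regardless of DF */
    c->flags_written = 1;       /* ADD rewrites the arithmetic flags */
}

/* movl (%esi), %eax ; leal 4(%esi), %esi: 2 + 3 bytes (8B 06, 8D 76 04). */
void model_mov_lea(struct cpu *c)
{
    c->eax = *c->esi;
    c->esi += 1;                /* LEA does the addition without touching flags */
}
```

With DF clear, all three leave the same values in `eax` and `esi`. They differ in encoded size (1 byte versus 5), in whether the flags survive, and in what happens when DF is set.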

One of the major reasons for the existence of most complex single-byte instructions is that 8086 was almost completely bottlenecked on code-fetch. Besides the fact that memory was precious in general, code size ~= code speed on the first generation of x86 CPUs. That's definitely not the case on modern CPUs, with fast instruction caches and power-hungry decoders, and even caches for decoded instructions.

Having one-byte instructions for exchange-register-with-AX is a huge waste of 8 precious opcodes for modern x86, but was apparently useful for 8086 since MOVSX didn't exist until 386 (so you needed CBW), and other stuff required AX. (And XCHG wasn't 3x worse throughput than MOV like it is now). Fun fact: 0x90 NOP comes from this encoding of xchg eax, eax.

> Is there any speed improvement?

Yes, code-size always matters.

Also, on Intel P6-family and Sandybridge-family CPUs before Haswell, LODSD (aka lodsl in AT&T syntax) is 3 uops. On Haswell, LODSD/Q is only 2 uops (LODSB/W is still 3 uops). See Agner Fog's instruction tables and microarch pdf, and other links in the x86 tag wiki, like Intel's optimization manual.

So until Haswell, it's probably best to use separate MOV and ADD instructions unless code-size is really important (e.g. in a bootloader, where speed is nearly irrelevant).

Peter Cordes
  • Thanks for the explanation. Personally I prefer MIPS-style asm, but I've now got my answer from you: not setting flags might be very useful when iterating over an array. As for speed, it seems @FUZxxl has a different view in the comments on my question. – jyf1987 Oct 23 '16 at 05:31
  • @jyf1987: Normally there's no need to avoid setting flags, but it's useful in some situations. Most x86 instructions do set flags, and it doesn't make them slower. Read Agner Fog's Optimizing Assembly guide to learn about writing efficient asm. (I think FUZxxl was thinking of LODSD on CPUs before Haswell, where it's always less efficient than ADD+MOV.) I wouldn't generally recommend using LODSD, but the question was "what's the difference", not "should I use it". – Peter Cordes Oct 23 '16 at 05:49