I'm trying to work through section 3.6.7 Self Modifying Code Exercises in Art of Assembly by hand and cannot help thinking that either I'm doing something very wrong with the high/low order byte interleaving (I am aware of how the author does this for his imaginary x86 variants) or that there is an error in the text.

To wit, the program in the text is

   sub  ax, ax
   mov  [100], ax

a: mov  ax, [100]
   cmp  ax, 0
   je   b
   halt

b: mov  ax, 00c6
   mov  [100], ax
   mov  ax, 0710
   mov  [102], ax
   mov  ax, a6a0
   mov  [104], ax
   mov  ax, 1000
   mov  [106], ax
   mov  ax, 8007
   mov  [108], ax
   mov  ax, 00e6
   mov  [10a], ax
   mov  ax, 0e10
   mov  [10c], ax
   mov  ax, 4
   mov  [10e], ax
   jmp  100

which supposedly "writes the following code to location 100 and then executes it:"

   mov  ax, [1000]
   put
   add  ax, ax
   add  ax, [1000]
   put
   sub  ax, ax
   mov  [1000], ax
   jmp  0004

However, this doesn't make sense to me for a number of reasons: first, it writes 00 to byte 100, which is an illegal instruction that is supposedly jumped to; second, the high/low order bytes in addresses seem to be switched; and so on. On the other hand, if I simply swap the high/low order bytes of the constants in the mov ax, c statements in the b loop, then the result seems to be correct. So:

Am I wrong (if so, why?) or should the high/low order bytes in the b loop constants be swapped?

Evan Carroll
S Huntsman

1 Answer

It doesn't store 00 at address 100, because the x86 processors are "little-endian". When you store 00c6 as a word, the low-order byte c6 is stored first, at address 100, and the high-order byte 00 follows at address 101.
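To make the byte order concrete, here is a minimal sketch in Python (not AoA's assembly, which targets a hypothetical CPU) that simulates the eight word stores from block b. The names `WORDS`, `BASE`, `store_words`, and `dump` are made up for this illustration, and it assumes the stores behave like ordinary 16-bit little-endian writes, as described above. The big-endian dump is included only for contrast, to show the layout the question was expecting.

```python
import struct

# The eight word constants block b stores, in order (taken from the question).
WORDS = [0x00C6, 0x0710, 0xA6A0, 0x1000, 0x8007, 0x00E6, 0x0E10, 0x0004]
BASE = 0x100  # first destination address, [100]


def store_words(words, base, little_endian=True):
    """Simulate storing 16-bit words at base, base+2, ...; return {address: byte}."""
    fmt = "<H" if little_endian else ">H"  # struct byte-order prefix
    memory = {}
    for i, word in enumerate(words):
        addr = base + 2 * i
        first, second = struct.pack(fmt, word)  # the two bytes, in memory order
        memory[addr] = first
        memory[addr + 1] = second
    return memory


def dump(memory):
    """Hex dump of the simulated bytes in ascending address order."""
    return " ".join(f"{memory[a]:02x}" for a in sorted(memory))


little = store_words(WORDS, BASE)
big = store_words(WORDS, BASE, little_endian=False)

print("little-endian:", dump(little))
print("big-endian:   ", dump(big))
```

With little-endian stores, the byte at address 100 is c6 rather than 00, and the 1000 address operand ends up in memory as 00 10, which is why the addresses look "switched" when reading the constants as written.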

On the other hand, self-modifying code has been out of fashion for a long time. On modern processors, you cannot write to the code segment and you cannot execute data (at least not without jumping through a huge number of hoops). Another problem is that the instruction cache has probably loaded the unmodified code long before you change it.

You either risk executing the old code, or will have to flush all the caches with a tremendous impact on performance. In short, self-modifying code is just not worth it. I haven't used it since the PC/XT in the early 1980s.

Why don't you skip a chapter in the book?

Bo Persson
  • Can you point out to me where AoA's "x86" processors detail their little-endianness? I want to distinguish between the notional 886/8286/8486/8686 in AoA and the real-world 80x86 processor family. – S Huntsman Sep 13 '12 at 20:53
  • Sorry, haven't read that book. I learned the x86 family from Intel's documentation (all the way from the 8088 :-). – Bo Persson Sep 13 '12 at 20:56
  • Modern x86 has coherent instruction caches. I think you meant "other modern architectures" or something. Code that self-modifies in a loop is garbage on x86, but write or modify once and then execute many times is the whole point of JIT. It's also how the `.plt` section works for dynamic linking on Linux (where a `jmp` target is rewritten by lazy dynamic linking). The Linux kernel also has stuff like `bool _static_cpu_has(u16 bit)` [Only pass if-statement once](https://stackoverflow.com/a/50165421) which patches the asm to a `jmp` or `nop` instead of cmp/jcc. – Peter Cordes May 15 '18 at 18:30
  • See [Observing stale instruction fetching on x86 with self-modifying code](https://stackoverflow.com/q/17395557) for more about how modern x86 snoops stores near EIP/RIP to keep the pipeline coherent. "The 80486 and earlier processors require a jump between the modifying and the modified code in order to flush the code cache." (but P5 and later detect SMC even without a `jmp`). Not that I'm encouraging people to write more SMC, just that it is a thing and doesn't require extra barriers / flushing on x86. It's a win for modify once / execute *many*. – Peter Cordes May 15 '18 at 18:33