In The Shellcoder's Handbook: Discovering and Exploiting Security Holes, I found a comparison between this C code example:

    int number;
    if (number < 0)
    {
        ...more code...
    }

and its compiled assembly code (IA-32 architecture):

    number dw 0
    mov eax,number
    or eax,eax
    jge label
    <no>
    label :<yes>

What is the purpose of the `or eax,eax` instruction? Shouldn't it be `cmp eax,0`?

Michael Petch
  • I'm guessing that it's just an alternative way of setting the status flags for a comparison of the contents of the register to 0 (the zero flag will be set if `eax` contains 0, and cleared otherwise). I don't have the tables, but it may also be cheaper in CPU cycles than the `cmp` instruction, like doing `xor eax, eax` to set a register to 0. – AntonH Jan 04 '18 at 21:34
  • `or eax, eax` will OR `eax` with itself and then set the arithmetic status flags of the CPU accordingly. Read the documentation for `OR` and see what it does. Then check the documentation for what `jge` actually does. Specifically, if `eax` is zero, the zero flag will be set. The combination of `or eax, eax` with `jge label` will check if `eax` is greater than or equal to zero and jump if it is. Doing `or eax, eax` followed by an instruction that acts on the flags is a very common pattern. – lurker Jan 04 '18 at 21:34
  • So I looked at `jge` more closely: it jumps if SF = OF. Since an OR operation clears OF, the jump should be taken only if `number` is non-negative (SF = 0), which skips the code inside the `if` statement. Would this be correct? – Francesco C Jan 04 '18 at 22:24
  • @AntonH Probably not cheaper in cycles, but shorter in bytes, since it doesn't have to encode an immediate. – EOF Jan 04 '18 at 22:48
  • https://stackoverflow.com/a/38032818/4271923 (`or` is less performant option than `test` in this case) – Ped7g Jan 04 '18 at 23:09
  • @AntonH `or` **may** be slower than `cmp` because it creates a false dependency on the result of the `or`, whereas the CPU knows that both `cmp` and `test` discard the result of the operation (except for the flags), so they don't create a dependency on the result itself. – Ped7g Jan 04 '18 at 23:11
  • The only reason to use `or eax,eax` would be if you need to avoid bytes with the high bit set so your shellcode can be plain ASCII text. [`or r/m32, r32` has opcode `0x0B`](https://github.com/HJLebbink/asm-dude/wiki/OR), but [`test r/m32, r32` has opcode `0x85`](https://github.com/HJLebbink/asm-dude/wiki/TEST). Using `OR` instead of `TEST` seems to be some kind of legacy habit among some programmers. And BTW, `cmp eax, 0` won't work in shellcode for most buffer overflows, because it contains a zero byte. (end of implicit-length string) – Peter Cordes Jan 05 '18 at 00:25
  • Also, that's some bogus code. `dw 0` is only a 16-bit `word`, so a `dword` load into `eax` will load 2 bytes beyond the end of `number`. It's also weird to put `number` right next to the instructions. – Peter Cordes Jan 05 '18 at 00:32
  • I think `or eax,eax` falls into the same category as `xor eax,eax`: a (much) smaller encoding than a `cmp` or `mov`, which saves program space, a habit going way back to when that mattered a lot more. I would have to check the docs, but you could probably use `and eax,eax` as well, at least on some instruction sets (rather than a compare with zero). – old_timer Jan 05 '18 at 02:04
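
The size and zero-byte points raised in the comments can be checked with a small C sketch. The byte sequences below are the standard IA-32 encodings as far as I know (`0B C0` for `or eax,eax`, though an assembler may emit the equivalent `09 C0` form; `85 C0` for `test eax,eax`; `83 F8 00` for `cmp eax,0` in its short imm8 form). The array names and the `has_zero_byte` helper are made up for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Machine-code bytes for three ways of testing eax against zero. */
static const unsigned char OR_EAX_EAX[]   = { 0x0B, 0xC0 };       /* or   eax, eax */
static const unsigned char TEST_EAX_EAX[] = { 0x85, 0xC0 };       /* test eax, eax */
static const unsigned char CMP_EAX_0[]    = { 0x83, 0xF8, 0x00 }; /* cmp  eax, 0   */

/* Hypothetical helper: true if the encoding contains a 0x00 byte,
 * which would terminate a C string and truncate the payload. */
bool has_zero_byte(const unsigned char *code, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00) {
            return true;
        }
    }
    return false;
}
```

With these encodings, `or` and `test` are both two bytes and free of zero bytes, while the `cmp` form is one byte longer and carries the zero immediate.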

1 Answer

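ORing a register with itself leaves its value unchanged but sets the flags according to that value: ZF and SF reflect the contents of `eax`, while OF and CF are cleared. That makes `or eax,eax` followed by `jge` behave like `cmp eax,0` followed by `jge`. The x86 instruction set often has several equivalent ways of performing the same operation.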

SoronelHaetir
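
To make the flag reasoning concrete, here is a minimal C sketch (not part of the original answer) that models only ZF, SF and OF and applies the `jge` rule SF == OF. The `flags_t` type and all function names are invented for this illustration; the flag rules follow the documented behaviour of `OR`, `CMP` and `JGE`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the three flags that matter for jge. */
typedef struct { bool zf, sf, of; } flags_t;

/* Flags after `or a, b`: computed from the result, OF is always cleared. */
static flags_t flags_after_or(int32_t a, int32_t b)
{
    uint32_t r = (uint32_t)a | (uint32_t)b;
    flags_t f = { r == 0, (r >> 31) != 0, false };
    return f;
}

/* Flags after `cmp a, b`: computed from a - b, OF set on signed overflow. */
static flags_t flags_after_cmp(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t r = ua - ub;
    flags_t f;
    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    /* Overflow: operands differ in sign and the result's sign differs from a's. */
    f.of = (((ua ^ ub) & (ua ^ r)) >> 31) != 0;
    return f;
}

/* jge jumps when SF == OF. */
static bool jge_taken(flags_t f) { return f.sf == f.of; }

/* True when the jump skips the body of `if (number < 0)`. */
bool skips_if_body_or(int32_t number)
{
    return jge_taken(flags_after_or(number, number));
}

bool skips_if_body_cmp(int32_t number)
{
    return jge_taken(flags_after_cmp(number, 0));
}

/* Both instruction sequences must agree for the given value. */
void check_equivalence(int32_t number)
{
    assert(skips_if_body_or(number) == skips_if_body_cmp(number));
}
```

For example, `skips_if_body_or(-1)` is false (the `if` body runs), while `skips_if_body_or(0)` and `skips_if_body_or(5)` are true, matching `number < 0` being false.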