What is the purpose of XORing a register with itself?

Question

xor eax, eax will always set eax to zero, right? So, why does MSVC++ sometimes put it in my executable's code? Is it more efficient than mov eax, 0?

012B1002  in          al,dx 
012B1003  push        ecx  
    int i = 5;
012B1004  mov         dword ptr [i],5 
    return 0;
012B100B  xor         eax,eax

Also, what does it mean to do in al, dx?

It's very unlikely that the MSVC++ compiler actually emits an "in" instruction. You're probably disassembling at a wrong address / wrong alignment. — newgre, Sep 08 '09 at 22:01
Yes, the real instruction starts a few bytes earlier. There is no C equivalent of the "in" instruction, and reading from a 16-bit I/O port and overwriting the result a few instructions later is a very unlikely generated instruction sequence. — Gunther Piez, Sep 08 '09 at 22:06
A very very similar question: http://stackoverflow.com/questions/1135679/does-using-xor-reg-reg-give-advantage-over-mov-reg-0/1135820 — sharptooth, Sep 11 '09 at 06:56
An interesting tips & tricks document from the past that recently emerged is "86fun.doc" from the MS WinWord 1.1 Source (http://www.computerhistory.org/_static/atchm/microsoft-word-for-windows-1-1a-source-code/). The file is located in 'OpusEtAl\cashmere\doc' and describes "best/fast practices" of assembler programming, also mentioning the xor bx,bx practice. — ChristianWimmer, Mar 27 '14 at 06:46
This question has multiple duplicates: http://stackoverflow.com/questions/33666617/which-is-best-way-to-set-a-register-to-zero-in-x86-assembly-xor-mov-or-and (which has a detailed answer with some microarchitectural background) and http://stackoverflow.com/questions/1135679/does-using-xor-reg-reg-give-advantage-over-mov-reg-0/1135820 at least. Also http://stackoverflow.com/questions/17981447/microarchitectural-zeroing-of-a-register-via-the-register-renamer-performance-v addresses the microarchitectural reasons (and compares SnB's xor at reg-rename with IvB's mov-elimination (at reg rename)). — Peter Cordes, Dec 12 '15 at 00:19

Gunther Piez · Accepted Answer · 2012-03-19T10:52:56.990

192

Yes, it is more efficient.

The encoding is shorter than that of mov eax, 0 (only 2 bytes instead of 5), and the processor recognizes the special case and treats it as a mov eax, 0 without a false read dependency on eax, so the execution time is the same.
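
To make the size difference concrete, here is a small Python sketch that only compares the byte encodings. The byte values are the standard 32-bit x86 encodings as far as I know (MSVC typically emits 33 C0 for the xor; 31 C0 is an equivalent encoding). The constant and function names are made up for illustration.

```python
# Standard 32-bit x86 encodings; the names are illustrative only.
MOV_EAX_0 = bytes([0xB8, 0x00, 0x00, 0x00, 0x00])  # mov eax, 0   (opcode + imm32)
XOR_EAX_EAX = bytes([0x33, 0xC0])                  # xor eax, eax (opcode + ModRM)


def describe(name, code):
    """Return a one-line summary: mnemonic, length in bytes, hex dump."""
    hex_dump = " ".join(f"{b:02x}" for b in code)
    return f"{name:<13}{len(code)} bytes  {hex_dump}"


print(describe("mov eax, 0", MOV_EAX_0))
print(describe("xor eax, eax", XOR_EAX_EAX))
print("bytes saved:", len(MOV_EAX_0) - len(XOR_EAX_EAX))
```

This says nothing about execution speed; it only shows the code-size part of the argument.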

edited Mar 19 '12 at 10:52

answered Sep 08 '09 at 21:59

Gunther Piez

55

"processor recognizes the special case and treats it as a "mov eax,0" without a false read dependency on eax, so the execution time is the same" The processor actually does even better: it just executes a register rename internally, and doesn't even do anything at all with `eax`. – kquinn Sep 08 '09 at 22:16
2

Actually, in the big picture it's faster. There are fewer bytes that have to be fetched from RAM. – Loren Pechtel Dec 07 '09 at 00:21
13

Doing `xor eax, eax` also avoids generating null bytes in the machine code ;) – Yuda Prawira Apr 30 '11 at 21:22
6

On modern architectures xor will be faster, because the register is set to zero at the rename stage without using any execution unit: http://stackoverflow.com/a/18027854/995714 – phuclv Mar 06 '14 at 15:01

score 34 · Answer 2 · edited Sep 18 '17 at 10:07

It is also used to avoid zero bytes in the assembled code, as in shellcode for exploiting buffer overflows and the like. Why avoid the 0? Well, a 0 byte marks the end of a string in C/C++, so the shellcode would be truncated if the means of exploitation is a string-processing function or the like.
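
The truncation effect can be sketched in a few lines of Python. This is only a model: `c_string_copy` is a hypothetical stand-in for a strcpy-style routine, and the two byte sequences are just the standard encodings of the zeroing instructions followed by `ret`.

```python
def c_string_copy(payload: bytes) -> bytes:
    """Mimic a strcpy-style copy: stop at (and exclude) the first 0 byte."""
    end = payload.find(b"\x00")
    return payload if end == -1 else payload[:end]


RET = bytes([0xC3])                                    # ret
with_mov = bytes([0xB8, 0x00, 0x00, 0x00, 0x00]) + RET  # mov eax, 0 ; ret
with_xor = bytes([0x33, 0xC0]) + RET                    # xor eax, eax ; ret

# Report how many bytes of each sequence survive the string copy.
print(len(c_string_copy(with_mov)), "of", len(with_mov), "bytes survive")
print(len(c_string_copy(with_xor)), "of", len(with_xor), "bytes survive")
```

The mov version is cut off right after its opcode byte, while the xor version is copied whole because it contains no zero byte.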

Btw, I'm referring to the original question: "Any reason to do a “xor eax, eax”?" not what the MSVC++ compiler does.

Since there's some debate in the comments about how this is pertinent in the real world, see this article and this section on Wikipedia.

edited Sep 18 '17 at 10:07

GDP2

answered Sep 08 '09 at 22:18

kripto_ash

12

This sounds like nonsense to me. There are bound to be zero bytes somewhere in your code, so I don't see how one more would make much difference. Anyway, who cares if you can trick a program into reading code as data. The real problem is executing data as code. – Stephen C Sep 08 '09 at 22:36
30

Who cares? Hackers do, and apparently most of the computer security related industry. Please educate yourself before voting down on something. You can find more references here [The Art of Exploitation - Chapter 0x2a0][1] as well as sample shell code that doesn't contain 0s. [1] [http://books.google.es/books?id=P8ijosP6ti4C&lpg=PP1&dq=the%20art%20of%20exploitation&pg=PA84#v=onepage&q=&f=false] – kripto_ash Sep 08 '09 at 23:13
2

This brings back memories from the TRS-80. Some of us would embed assembly routines inside BASIC strings. There were a few characters that absolutely could not appear in the source code without breaking it and so any such routine had to be carefully optimized to avoid using those characters. – Loren Pechtel Dec 07 '09 at 00:23
4

I don't know why this gets downvoted so many times, wtf. Down voters, please educate yourself about this _MOST BASIC TECHNIQUE/KNOWLEDGE_ in shellcodes before downvoting. – kizzx2 Aug 08 '11 at 02:58
5

@kizzx2 probably because no one here has explained how a string was being parsed in the `.text` segment. I also can't see how terminating a string somehow allows someone to move the `.data` segment to mirror the `.text` segment to modify anything in the first place. Please be more specific than *"MOST BASICIST TECHNIQUE"* – Hawken Oct 15 '12 at 22:07
1

@Hawken kripto_ash has mentioned that it is answering the "Any reason to do xor" part, and avoiding null is a valid reason. Now one could downvote on the reason that it is "off topic" but no one seems to have said it clearly and it feels like people downvote because they don't believe null bytes matter at all in a security context, so my rant. – kizzx2 Oct 16 '12 at 02:24
2

@kizzx2 From what I found on shellcode though, avoiding null bytes is important for *writing* shell code not preventing code from being exploited. Can you please *explain* how having null bytes in the instruction segment makes said code easier to exploit? – Hawken Oct 17 '12 at 11:46
1

@Hawken "Btw im referring to the original question: "Any reason to do a “xor eax, eax”?" not what the MSVC++ compiler does." – kizzx2 Oct 17 '12 at 14:45
7

@kizzx2 could you please give an **explanation** as to how having null bytes in your program's instruction segment makes it more easily exploited. Having null bytes only affects string parsing, as far as I know, nothing parses the instruction segment of a program. Please **explain**, not quote some irrelevance about using msvc++ or not – Hawken Oct 18 '12 at 00:21
1

@Hawken AFAIK it doesn't in the usual case, but IMO that's a strawman. This answer adds another reason why a compiler might want to avoid null bytes, not necessarily the most "logical" answer. Here one could argue that the answer portrays MSVC as a tool to make writing shell code easier for exploit developers. – kizzx2 Oct 18 '12 at 02:12

score 20 · Answer 3 · edited Mar 19 '12 at 11:10

xor eax, eax is a faster way of setting eax to zero. This is happening because you're returning zero.

The in instruction does I/O port input. Here, in al, dx reads a byte of data from the port specified in dx and stores it in al. It's not clear why it appears here. Here's a reference that seems to explain it in detail.
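
One plausible explanation, raised in the comments, is that the disassembly started in the middle of an instruction. The Python sketch below is a toy decoder that knows only four opcodes, and it assumes the function begins two bytes before the listing with the usual `push ebp` / `mov ebp, esp` prologue (bytes 55 8B EC). Those two leading bytes are not shown in the question, so treat them as an assumption.

```python
# Toy decoder covering only the opcodes needed here; not a real disassembler.
ONE_BYTE = {0x55: "push ebp", 0x51: "push ecx", 0xEC: "in al, dx"}
TWO_BYTE = {(0x8B, 0xEC): "mov ebp, esp"}


def decode(code: bytes, start: int) -> list:
    """Decode from offset `start`, stopping at the first unknown byte."""
    out, i = [], start
    while i < len(code):
        pair = tuple(code[i:i + 2])
        if pair in TWO_BYTE:
            out.append(TWO_BYTE[pair])
            i += 2
        elif code[i] in ONE_BYTE:
            out.append(ONE_BYTE[code[i]])
            i += 1
        else:
            break
    return out


# ASSUMPTION: the bytes at 012B1000..012B1003 are a standard prologue.
prologue = bytes([0x55, 0x8B, 0xEC, 0x51])

print(decode(prologue, 0))  # decoding from the assumed function start
print(decode(prologue, 2))  # decoding from offset 2, i.e. address 012B1002
```

Under that assumption, starting at offset 2 lands on the second byte of `mov ebp, esp`. That byte, EC, happens to be the opcode for `in al, dx`, which would reproduce the listing in the question.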

edited Mar 19 '12 at 11:10

Konrad Rudolph

answered Sep 08 '09 at 21:56

i_am_jorf

10

"The in instruction is doing stuff with I/O ports". But in this case, it is probably an "artifact" caused by the debugger starting disassembly in the middle of an instruction. – Stephen C Sep 08 '09 at 22:30
6

I agree. But still, that's what it does. – i_am_jorf Sep 09 '09 at 00:37
1

@Abel Why did you rename all the mnemonics and register names? That’s unconventional to say the least. As you can see in OP’s code, most modern assemblers and disassemblers use all-lowercase spelling. – Konrad Rudolph Mar 19 '12 at 11:09
1

@Konrad I stand corrected. My asm books, including the processor references of Intel (all > 5yrs old), (EDIT: and apparently [Wikipedia](http://en.wikipedia.org/wiki/Assembly_language)), use uppercase only, wasn't aware this convention was changed. – Abel Mar 19 '12 at 11:12
@abelmar also uppercase is not allowed in AT&T syntax if I remember correctly – Hawken Apr 01 '12 at 02:44

score 0 · Answer 4 · answered Apr 26 '13 at 09:06

Another reason to use XOR reg, reg or XORPS reg, reg is to break dependency chains. This allows the CPU to optimize the parallel execution of the assembly commands more efficiently (even if it adds some more instruction throughput pressure).

answered Apr 26 '13 at 09:06

Quonux

2

That gives it an advantage over `AND reg, 0`, but not over `MOV reg, 0`. Dep-chain breaking is a special case for `xor`, but always the case for `mov`. It doesn't get mentioned, leading to occasional confusion from people thinking that `mov` has a false dependency on the old value being overwritten. But of course it doesn't. – Peter Cordes Dec 12 '15 at 00:07
Any reference for this? I don't know the list of dep breakers off the top of my head. – Quonux Dec 12 '15 at 16:52
Everything that overwrites the destination without depending on it breaks the dep chain. e.g. every 3-operand `op dest, src1, src2` instruction (e.g. `VPSHUFB dest, src1, src2` or `lea eax, [rbx + 2*rdx]`) breaks the dep chain on the old value of `dest`. It's only notable when there's a false dependency on the old value: like `mov ax, bx`, which (on AMD/Silvermont/P4, but not P6/SnB) has a false dep on the old value of `eax`, even if you never read `eax`. On Intel, the big notable one is that [`popcnt/lzcnt/tzcnt` have a false dep on their output](http://stackoverflow.com/a/25089720/224132) – Peter Cordes Dec 12 '15 at 17:38
Of course, `mov ax, bx / mov [mem], eax` has a dependency on the previous value of `eax`, but it's *not* a *false* dependency. You're actually using those bits, so it's a *true* dependency. – Peter Cordes Dec 12 '15 at 17:43
@PeterCordes am I right in thinking that `mov eax, 0` was always set to 0 by the allocate stage (therefore though larger was faster than `xor eax, eax`), whereas `xor eax, eax` wasn't until Sandy Bridge where the renamer recognised the false dependency, zeroes the allocated register itself and retires the instruction. – Lewis Kelsey Apr 12 '20 at 11:03
@LewisKelsey: No, `xor eax,eax` was dep-breaking on P6 family in later Pentium-M and therefore equally fast in the back end, but being 2 bytes instead of 5 made it faster in the front-end (which was a big deal before SnB-family's uop cache). And xor-zeroing avoided partial-register slowdowns on *all* P6-family CPUs. According to Agner Fog's optimization guide, it was occasionally worth using `mov eax,0` ; `xor eax,eax` on early P6-family like PPro / PIII to break a dep chain *and* put the register into an AL=EAX internal state. – Peter Cordes Apr 12 '20 at 14:22
1

@LewisKelsey: See my answer on the linked duplicate ([What is the best way to set a register to zero in x86 assembly: xor, mov or and?](https://stackoverflow.com/q/33666617)) for the full details, including the early P6-family stuff. There are a few AMD CPUs like IIRC Bulldozer family where `mov reg,imm` can run on AGU ports as well as ALU where mov could have a back-end throughput advantage over xor zeroing for some surrounding code. But compilers always just use xor-zeroing when tuning for anything, and IMO that's the correct decision. – Peter Cordes Apr 12 '20 at 14:22

score -1 · Answer 5 · answered Sep 08 '09 at 22:01

The XOR operation is indeed very fast. If the result is to set a register to zero, the compiler will often do it the fastest way it knows. A bit operation like XOR might take only one CPU cycle, whereas a copy (from one register to another) can take a small handful.

Often compiler writers will even have different behaviors given different target CPU architectures.

score -1 · Answer 6 · answered Dec 11 '15 at 23:46

From the OP:

> any reason to do "xor eax,eax"

    return 0;
012B100B  xor         eax,eax
          ret         <-- OP doesn't show this

XOR EAX,EAX simply zeroes out the EAX register. It executes faster than MOV EAX,$0 and doesn't need to fetch immediate data of 0 to load into eax.

It's very obvious this is the "return 0" that MSVC is optimizing. EAX is the register used to return a value from a function in MSVC.

answered Dec 11 '15 at 23:46

James Foote

1

This answer doesn't add anything that isn't in the other answers. And yes, all the x86 ABIs use `eax` / `rax` as the register for return values. Also, immediate data doesn't have to be fetched, other than as a pre-requisite for instruction decoding. xor is shorter than mov, and even leaves more spare space in the uop cache line it's in, but neither of those effects are well described as "not having to fetch". – Peter Cordes Dec 12 '15 at 00:10

score -6 · Answer 7 · answered Aug 05 '14 at 09:14

xor is often used to encrypt code, for example:

      mov eax,[ecx+ValueHere]
      xor eax,[ecx+ValueHere]
      mov [ebx+ValueHere],esi
      xor esi,[esp+ValueHere]
      pop edi
      mov [ebx+ValueHere],esi

The XOR instruction combines two values using logical exclusive OR (remember, OR uses inclusive OR). To understand XOR better, consider these two binary values:

      1001010110
      0101001101

If you XOR them, the result is 1100011011. When two bits on top of each other are equal, the resulting bit is 0. Otherwise the resulting bit is 1. You can use calc.exe to calculate XOR.
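
The two values above can also be checked without calc.exe. A minimal Python sketch (variable names are arbitrary) that prints the exclusive OR, the inclusive OR for comparison, and a value XORed with itself, which is the case the question is about:

```python
a = 0b1001010110
b = 0b0101001101

xor_result = a ^ b  # exclusive OR: 1 only where the two bits differ
or_result = a | b   # inclusive OR: 1 where either bit is set
self_xor = a ^ a    # every bit equals itself, so no bit can differ

print(format(xor_result, "010b"))
print(format(or_result, "010b"))
print(self_xor)
```

Comparing the first two printed lines shows which operation yields 1100011011.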

Sure but that's not what the question is about, it's about "xor with itself" — harold, Aug 05 '14 at 09:16
