85

Can someone explain to me how XOR swapping of two variables with no temp variable works?

void xorSwap (int *x, int *y)
{
    if (x != y) {
        *x ^= *y;
        *y ^= *x;
        *x ^= *y;
    }
}

I understand WHAT it does, but can someone walk me through the logic of how it works?

unwind
  • 391,730
  • 64
  • 469
  • 606
mmcdole
  • 91,488
  • 60
  • 186
  • 222
  • 9
    I think the xor variable swap sucks on out-of-order execution cores. Each subsequent xor has a read-after-write dependency, and needs to wait for the answer to complete. for x86, you're better off just coding as normal. The compiler should emit something decent. – Calyth Jan 09 '09 at 23:41

12 Answers12

146

You can see how it works by doing the substitution:

x1 = x0 xor y0
y2 = x1 xor y0
x2 = x1 xor y2

Substituting,

x1 = x0 xor y0
y2 = (x0 xor y0) xor y0
x2 = (x0 xor y0) xor ((x0 xor y0) xor y0)

Because xor is fully associative and commutative:

y2 = x0 xor (y0 xor y0)
x2 = (x0 xor x0) xor (y0 xor y0) xor y0

Since x xor x == 0 for any x,

y2 = x0 xor 0
x2 = 0 xor 0 xor y0

And since x xor 0 == x for any x,

y2 = x0
x2 = y0

And the swap is done.

Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
  • I have no ideea if you gonna see this comment 11 years later but this is the best explanation ever thank you! – Cantaff0rd Jan 28 '19 at 14:39
  • closer to 12 years later: how does this work with strings (as in string reversal)? Is it because you're not operating on ASCII values but rather on the binary representation of the memory addresses holding various parts of the string? – bluppfisk Sep 02 '19 at 21:55
  • 3
    I can barely resist the urge to change `y2` to `y1`. It triggers me that you have `x0` and `x1` but then use `y0` and `y2`. :-] – Frerich Raabe Dec 13 '19 at 12:52
101

Other people have explained it, now I want to explain why it was a good idea, BUT now isn't.

Back in the day when we had simple single cycle or multi-cycle CPUs, it was cheaper to use this trick to avoid costly memory dereferences or spilling registers to the stack. However, we now have CPUs with massive pipelines instead. The P4's pipeline ranged from having 20 to 31 (or so) stages in their pipelines, where any dependence between reading and writing to a register could cause the whole thing to stall. The xor swap has some very heavy dependencies between A and B that don't actually matter at all but stall the pipeline in practice. A stalled pipeline causes a slow code path, and if this swap's in your inner loop, you're going to be moving very slowly.

In general practice, your compiler can figure out what you really want to do when you do a swap with a temp variable and can compile it to a single XCHG instruction. Using the xor swap makes it much harder for the compiler to guess your intent and therefore much less likely to optimize it correctly. Not to mention code maintenance, etc.

Maifee Ul Asad
  • 3,992
  • 6
  • 38
  • 86
Patrick
  • 90,362
  • 11
  • 51
  • 61
  • Yep - like all memory-saving tricks, this isn't so useful in these days of cheap memory. – Bruce Alderman Nov 21 '08 at 21:43
  • 1
    By the same token, however, embedded system cpus still benefit quite a lot. – Paul Nathan Feb 09 '09 at 15:38
  • 1
    @Paul, it'd depend on your tool chain. I'd test it first to be certain that your embedded compiler isn't already performing that optimization. – Patrick Feb 21 '09 at 07:09
  • 3
    (It's also worth noting that from a size perspective, three XORs is likely larger than one XCHG, depending on the architecture. You may save more space by not using the xor trick.) – Patrick Feb 21 '09 at 07:11
59

I like to think of it graphically rather than numerically.

Let's say you start with x = 11 and y = 5 In binary (and I'm going to use a hypothetical 4 bit machine), here's x and y

       x: |1|0|1|1|   -> 8 + 2 + 1
       y: |0|1|0|1|   -> 4 + 1

Now to me, XOR is an invert operation and doing it twice is a mirror:

     x^y: |1|1|1|0|
 (x^y)^y: |1|0|1|1|   <- ooh!  Check it out - x came back
 (x^y)^x: |0|1|0|1|   <- ooh!  y came back too!
plinth
  • 48,267
  • 11
  • 78
  • 120
  • Very clear. Following each XOR operation on every bit makes it much easier to understand what's going on. I think it's more difficult to understand XOR because unlike & and | operations, it's a lot more difficult to do in your head. XOR arithmetic just leads to confusion. Don't be afraid to visualise the problem.The compiler is there to do the math, not you. – Martyn Shutt Jan 20 '15 at 11:39
36

Here's one that should be slightly easier to grok:

int x = 10, y = 7;

y = x + y; //x = 10, y = 17
x = y - x; //x = 7, y = 17
y = y - x; //x = 7, y = 10

Now, one can understand the XOR trick a little more easily by understanding that ^ can be thought of as + or -. Just as:

x + y - ((x + y) - x) == x 

, so:

x ^ y ^ ((x ^ y) ^ x) == x
Matt J
  • 43,589
  • 7
  • 49
  • 57
  • @Matt J, thanks for the subtraction example. It did help me grok it. – mmcdole Oct 30 '08 at 14:17
  • 3
    Might be worth emphasising that you can't use the addition or subtraction methods because of overflows with large numbers. – MarkJ Feb 11 '09 at 21:16
  • Is that the case? In the small examples I worked out, things worked out OK regardless (assuming the result of an underflow or overflow is (result % 2^n)). I might code something up to test it out. – Matt J Feb 12 '09 at 02:32
  • I think that, assuming the most parsimonious hardware implementation of the ADD and SUB instructions, this works properly even in the presence of overflow or underflow. I've just tested it. Am I missing something? – Matt J Feb 12 '09 at 02:59
  • I suppose if you don't have exceptions for overflow & underflow it would work, sure. – MarkJ Feb 13 '09 at 21:01
15

The reason WHY it works is because XOR doesn't lose information. You could do the same thing with ordinary addition and subtraction if you could ignore overflow. For example, if the variable pair A,B originally contains the values 1,2, you could swap them like this:

 // A,B  = 1,2
A = A+B // 3,2
B = A-B // 3,1
A = A-B // 2,1

BTW there's an old trick for encoding a 2-way linked list in a single "pointer". Suppose you have a list of memory blocks at addresses A, B, and C. The first word in each block is , respectively:

 // first word of each block is sum of addresses of prior and next block
 0 + &B   // first word of block A
&A + &C   // first word of block B
&B + 0    // first word of block C

If you have access to block A, it gives you the address of B. To get to C, you take the "pointer" in B and subtract A, and so on. It works just as well backwards. To run along the list, you need to keep pointers to two consecutive blocks. Of course you would use XOR in place of addition/subtration, so you wouldn't have to worry about overflow.

You could extend this to a "linked web" if you wanted to have some fun.

Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135
14

Most people would swap two variables x and y using a temporary variable, like this:

tmp = x
x = y
y = tmp

Here’s a neat programming trick to swap two values without needing a temp:

x = x xor y
y = x xor y
x = x xor y

More details in Swap two variables using XOR

On line 1 we combine x and y (using XOR) to get this “hybrid” and we store it back in x. XOR is a great way to save information, because you can remove it by doing an XOR again.

On line 2. We XOR the hybrid with y, which cancels out all the y information, leaving us only with x. We save this result back into y, so now they have swapped.

On the last line, x still has the hybrid value. We XOR it yet again with y (now with x’s original value) to remove all traces of x out of the hybrid. This leaves us with y, and the swap is complete!


The computer actually has an implicit “temp” variable that stores intermediate results before writing them back to a register. For example, if you add 3 to a register (in machine-language pseudocode):

ADD 3 A // add 3 to register A

The ALU (Arithmetic Logic Unit) is actually what executes the instruction 3+A. It takes the inputs (3,A) and creates a result (3 + A), which the CPU then stores back into A’s original register. So, we used the ALU as temporary scratch space before we had the final answer.

We take the ALU’s implicit temporary data for granted, but it’s always there. In a similar way, the ALU can return the intermediate result of the XOR in the case of x = x xor y, at which point the CPU stores it into x’s original register.

Because we aren’t used to thinking about the poor, neglected ALU, the XOR swap seems magical because it doesn’t have an explicit temporary variable. Some machines have a 1-step exchange XCHG instruction to swap two registers.

VonC
  • 1,262,500
  • 529
  • 4,410
  • 5,250
  • 4
    I understand that, I'm asking how it works. How does using an exclusive or on a value allow you to swap it without a temp variable – mmcdole Oct 30 '08 at 06:41
  • Upvoted because this is the clearest and most detailed answer, but want to note that the swap with a temp variable is a lot more readable and by virtue of that carries more value in code – eyelidlessness Oct 30 '08 at 06:59
  • 1
    I liked the original answer (with explanation), but the bit about the ALU seems misguided. Even on the single-cycle (non-pipelined) processors you allude to, the ability to do "x = (op involving x)" in 1 instruction has more to do with the fact that the register file has input *and* output ports. – Matt J Oct 30 '08 at 07:38
7

@VonC has it right, it's a neat mathematical trick. Imagine 4 bit words and see if this helps.

word1 ^= word2;
word2 ^= word1;
word1 ^= word2;


word1    word2
0101     1111
after 1st xor
1010     1111
after 2nd xor
1010     0101
after 3rd xor
1111     0101
Community
  • 1
  • 1
kenny
  • 21,522
  • 8
  • 49
  • 87
5

Basically there are 3 steps in the XOR approach:

a’ = a XOR b (1)
b’ = a’ XOR b (2)
a” = a’ XOR b’ (3)

To understand why this works first note that:

  1. XOR will produce a 1 only if exactly one of it’s operands is 1, and the other is zero;
  2. XOR is commutative so a XOR b = b XOR a;
  3. XOR is associative so (a XOR b) XOR c = a XOR (b XOR c); and
  4. a XOR a = 0 (this should be obvious from the definition in 1 above)

After Step (1), the binary representation of a will have 1-bits only in the bit positions where a and b have opposing bits. That is either (ak=1, bk=0) or (ak=0, bk=1). Now when we do the substitution in Step (2) we get:

b’ = (a XOR b) XOR b
= a XOR (b XOR b) because XOR is associative
= a XOR 0 because of [4] above
= a due to definition of XOR (see 1 above)

Now we can substitute into Step (3):

a” = (a XOR b) XOR a
= (b XOR a) XOR a because XOR is commutative
= b XOR (a XOR a) because XOR is associative
= b XOR 0 because of [4] above
= b due to definition of XOR (see 1 above)

More detailed information here: Necessary and Sufficient

3

As a side note I reinvented this wheel independently several years ago in the form of swapping integers by doing:

a = a + b
b = a - b ( = a + b - b once expanded)
a = a - b ( = a + b - a once expanded).

(This is mentioned above in a difficult to read way),

The exact same reasoning applies to xor swaps: a ^ b ^ b = a and a ^ b ^ a = a. Since xor is commutative, x ^ x = 0 and x ^ 0 = x, this is quite easy to see since

= a ^ b ^ b
= a ^ 0
= a

and

= a ^ b ^ a 
= a ^ a ^ b 
= 0 ^ b 
= b

Hope this helps. This explanation has already been given... but not very clearly imo.

jheriko
  • 3,043
  • 1
  • 21
  • 28
  • 1
    Waay late here, but signed integer overflow is undefined behavior in C and (older versions of) C++. Potentially invoking UB just to "save some space" when swapping variables is a really bad idea. – Andrew Henle Nov 21 '20 at 17:50
3

I just want to add a mathematical explanation to make the answer more complete. In group theory, XOR is an abelian group, also called a commutative group. It means it satisfies five requirements: Closure, Associativity, Identity element, Inverse element, Commutativity.

XOR swap formula:

a = a XOR b
b = a XOR b
a = a XOR b 

Expand the formula, substitute a, b with previous formula:

a = a XOR b
b = a XOR b = (a XOR b) XOR b
a = a XOR b = (a XOR b) XOR (a XOR b) XOR b

Commutativity means "a XOR b" equal to "b XOR a":

a = a XOR b
b = a XOR b = (a XOR b) XOR b
a = a XOR b = (a XOR b) XOR (a XOR b) XOR b 
            = (b XOR a) XOR (a XOR b) XOR b

Associativity means "(a XOR b) XOR c" equal to "a XOR (b XOR c)":

a = a XOR b
b = a XOR b = (a XOR b) XOR b 
            = a XOR (b XOR b)
a = a XOR b = (a XOR b) XOR (a XOR b) XOR b 
            = (b XOR a) XOR (a XOR b) XOR b 
            = b XOR (a XOR a) XOR (b XOR b)

The inverse element in XOR is itself, it means that any value XOR with itself gives zero:

a = a XOR b
b = a XOR b = (a XOR b) XOR b 
            = a XOR (b XOR b) 
            = a XOR 0
a = a XOR b = (a XOR b) XOR (a XOR b) XOR b 
            = (b XOR a) XOR (a XOR b) XOR b 
            = b XOR (a XOR a) XOR (b XOR b) 
            = b XOR 0 XOR 0

The identity element in XOR is zero, it means that any value XOR with zero is left unchanged:

a = a XOR b
b = a XOR b = (a XOR b) XOR b 
            = a XOR (b XOR b) 
            = a XOR 0 
            = a
a = a XOR b = (a XOR b) XOR (a XOR b) XOR b 
            = (b XOR a) XOR (a XOR b) XOR b 
            = b XOR (a XOR a) XOR (b XOR b) 
            = b XOR 0 XOR 0 
            = b XOR 0
            = b

And you can get further information in group theory.

0

Others have posted explanations but I think it would be better understood if its accompanied with a good example.

XOR Truth Table

If we consider the above truth table and take the values A = 1100 and B = 0101 we are able to swap the values as such:

A = 1100
B = 0101


A ^= B;     => A = 1100 XOR 0101
(A = 1001)

B ^= A;     => B = 0101 XOR 1001
(B = 1100)

A ^= B;     => A = 1001 XOR 1100
(A = 0101)


A = 0101
B = 1100
LifelessG
  • 21
  • 4
0

I know the question has been asked for quite some time and there are many answers to it. However, I have my own "linguistic" trick for understanding bitwise magic. Perhaps for someone it will be useful.

XOR will only produce 1 if exactly one of its operands is 1 and the other is zero. Therefore XOR is the difference between two numbers.

  1. With *x ^= *y; we store this difference in x.
  2. If we know the second number and the difference between x and y, then we can find out the first number with *y ^= *x;
  3. If we know the difference between x and y, and we know the first number, we can find out the second number with *x ^= *y;

I'm having trouble translating into English, so I'll add a graphical representation:

x = First number;
y = Second number;

After *x ^= *y
x = Difference between first and second number
y = Second number

After *y ^= *x
x = Difference between first and second number
y = First Number

After *x ^= *y
x = Second number
y = First number

I use this so I don't have to try to imagine a lot of numbers in my head. That is why there is no numerical representation in my answer.