Yes, you can solve it with XORs. This answer expands on Paulo Almeida's great comment.
The algorithm works as follows:
Since we know that the array contains every element in the range [1 .. n], we start by XORing every element in the array together and then XOR the result with every element in the range [1 .. n]. Because x ^ x = 0, every number that appears only once cancels out (it was XORed twice in total), whereas each duplicated number was XORed three times in total and survives. The result is therefore the XOR of the two duplicated numbers. This is stored in xor_dups.
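As a minimal sketch of this first step alone, assuming the array holds every value in [1 .. n] plus two extra copies (the helper name xor_of_duplicates is hypothetical and not part of the implementation further down):

```c
#include <assert.h>
#include <stddef.h>

/* XOR every array element, then every value in 1..n (n = len - 2).
 * Values that occur once are XORed twice and cancel out;
 * what remains is dup1 ^ dup2. */
int xor_of_duplicates(const int arr[], size_t len) {
    assert(len > 3);
    size_t n = len - 2;
    int acc = 0;
    for (size_t i = 0; i < len; i++)
        acc ^= arr[i];
    for (size_t v = 1; v <= n; v++)
        acc ^= (int)v;
    return acc;
}
```

For the array {4, 2, 4, 5, 2, 3, 1} (n = 5, duplicates 2 and 4) this returns 6, which is 2 ^ 4.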
Next, find a bit in xor_dups that is a 1. At least one such bit exists, because the two duplicate numbers are distinct and so their XOR is nonzero. Again, due to XOR's properties, a bit set to 1 in xor_dups means that this bit differs between the binary representations of the two duplicate numbers. Any bit that is a 1 can be picked for the next step; my implementation chooses the least significant one. This is stored in diff_bit.
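The least significant set bit can be isolated with a single expression. The sketch below assumes a positive input and two's complement integers (true on essentially every current platform); the name lowest_set_bit is hypothetical:

```c
#include <assert.h>

/* Isolate the least significant set bit of a positive value.
 * In two's complement, -x equals ~x + 1: every bit above the lowest
 * set bit is inverted while that bit itself is kept, so x & -x
 * leaves only that one bit. */
int lowest_set_bit(int x) {
    assert(x > 0);
    return x & -x;
}
```

For example, 6 is 110 in binary, so lowest_set_bit(6) is 2 (binary 010).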
Now, split the array elements into two groups: one group contains the numbers that have a 0-bit at the position of the 1-bit that we picked from xor_dups, and the other group contains the numbers that have a 1-bit there instead. Since this bit is different in the two numbers we're looking for, they can't both be in the same group. Furthermore, both occurrences of each number go to the same group.
So now we're almost done. Consider the group for the elements with the 0-bit. XOR them all together, then XOR the result with all the elements in the range [1..n] that have a 0-bit at that position. The result is the duplicate number of that group: there is only one repeated number inside each group, so every non-repeated number cancels out (each was XORed twice), while the repeated number was XORed three times and remains.
Rinse, repeat: for the group with the 1-bit, XOR them all together, then XOR the result with all the elements in the range [1..n] that have a 1-bit on that position, and the result is the other duplicate number.
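The per-group step can be sketched as one helper that is called once per group. This is only an illustration under the same assumptions as above; the name duplicate_in_group and the want_set flag are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the duplicate that falls into one group.
 * want_set != 0 selects the values that have diff_bit set,
 * want_set == 0 selects the values that have it clear. */
int duplicate_in_group(const int arr[], size_t len, int diff_bit, int want_set) {
    assert(len > 3);
    size_t n = len - 2;
    int acc = 0;
    /* XOR the array elements that belong to the chosen group. */
    for (size_t i = 0; i < len; i++)
        if (((arr[i] & diff_bit) != 0) == (want_set != 0))
            acc ^= arr[i];
    /* XOR the values of 1..n that belong to the same group. */
    for (size_t v = 1; v <= n; v++)
        if ((((int)v & diff_bit) != 0) == (want_set != 0))
            acc ^= (int)v;
    return acc;
}
```

With the array {4, 2, 4, 5, 2, 3, 1} and diff_bit equal to 2, the group with the bit set yields 2 and the group with the bit clear yields 4.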
Here's an implementation in C:
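    #include <assert.h>
    #include <stddef.h>  /* size_t */

    void find_two_repeating(int arr[], size_t arr_len, int *a, int *b) {
        assert(arr_len > 3);
        size_t n = arr_len - 2;
        size_t i;  /* same type as arr_len and n, avoids signed/unsigned comparisons */

        /* XOR of the whole array and of 1..n: leaves the XOR of the two duplicates. */
        int xor_dups = 0;
        for (i = 0; i < arr_len; i++)
            xor_dups ^= arr[i];
        for (i = 1; i <= n; i++)
            xor_dups ^= (int)i;

        /* Least significant bit in which the two duplicates differ. */
        int diff_bit = xor_dups & -xor_dups;

        /* Split array elements and 1..n by diff_bit; each group keeps one duplicate. */
        *a = 0;
        *b = 0;
        for (i = 0; i < arr_len; i++) {
            if (arr[i] & diff_bit)
                *a ^= arr[i];
            else
                *b ^= arr[i];
        }
        for (i = 1; i <= n; i++) {
            if ((int)i & diff_bit)
                *a ^= (int)i;
            else
                *b ^= (int)i;
        }
    }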
arr_len is the total length of the array arr (the value of n+2), and the repeated entries are stored in *a and *b (these are so-called output parameters).