56

I had been studying the algorithm for finding lonely integers in an array, and here is the implementation:

int arr[] = {10, 20, 30, 5, 20, 10, 30};
int LonelyInteger = 0;
for(int i=0; i< 7; i++)
{
    LonelyInteger = LonelyInteger ^ arr[i];
}

The result is 5.

My question is - supposedly the integers (getting generated by the XOR operation) are too large due to this operation:

LonelyInteger ^ arr[i]

Which leads to a potentially large integer which cannot be represented by the datatype say int in this case. My questions are:

  1. Is it even possible that XOR will generate such a large integer value that cannot be stored in the int type?
  2. If it is not possible that this can happen then is there a proof for this?
phuclv
  • 37,963
  • 15
  • 156
  • 475
Expert Novice
  • 1,943
  • 4
  • 22
  • 47
  • 35
    XOR is a bitwise operation. It does not generate integers, it generates bit patterns. The result is logically either a trap representation, or a non-trap representation of a value. It's not possible to get a "too large" result because the "too large" result would not be representable. – davmac Feb 04 '15 at 12:12
  • 1
    @YvesDaoust, why do you think it could _not_ possibly lead to a large integer? Must the result of XOR always be a _small_ integer? Perhaps you misread 'large' as 'larger'? – davmac Feb 04 '15 at 12:14
  • 1
    @davmac I think it would be reasonable to assume that OP meant *larger* specifically. One might assume that he's expanding on the previous sentence which is a question about *too large* integer. The result can only be too large if it is *larger* than the operands which are *not too large* by definition (in the context of overflow). – eerorika Feb 04 '15 at 12:27
  • 1
    @user2079303 I completely disagree that it is reasonable to make such an assumption, especially given the context and the repeated "large integer" in the following question (labelled 1), which expounds to "such a large integer that cannot be stored in the `int` type?". That such an integer must be larger than the operands is your own intuition/knowledge, it is not being stated by the OP. In any case XOR can certainly result in a _larger_ value than either operand, so why the "how ???" question? – davmac Feb 04 '15 at 12:40
  • 11
    `x^y` can be bigger than `max(x, y)`, so in that respect you can get a "large integer" for some definition of large. – harold Feb 04 '15 at 12:41
  • 1
    This question is rather unexpected, given that the OP knows that XOR can generate "large" numbers, without knowing why nor how. –  Feb 04 '15 at 12:45
  • I have one more question, if the array is `int arr[] = {10, 20, 30 ,5, 6, 20, 10,30 };`. Will this logic work? Because it did not work for me – Rasmi Ranjan Nayak Feb 04 '15 at 14:24
  • 2
    @RasmiRanjanNayak: No, this algorithm will not "find" lonely integers. It's only proposition is `LonelyInteger != 0 → there is a lonely value` (notice it's not a `↔`, as can be seen from the example `{1,2,3}`) – Bergi Feb 04 '15 at 15:59
  • 3
    @Bergi: If there's only one lonely integer, it will result in that value. If there's multiples, then you're right that the only information is if it's !=0 – Mooing Duck Feb 04 '15 at 18:17
  • 5
    @harold `x^y <= x+y < 2*max(x,y)` , so you cannot set any bit higher than was already set in either operand. Example: 128^127 = 255, which is still an 8-bit quantity. – smci Feb 05 '15 at 08:21

10 Answers10

121

XOR will never go out of bounds because it combines bits and doesn't create new bits where no bits were set before.

The result 5 is correct. Look at the binary representation of your value and the XOR result

10    00001010
20    00010100
30    00011110
 5    00000101
20    00010100
10    00001010
30    00011110
--------------
      00000101 => 5

An easy help for calculating a result of many XORed values is: The result will have a bit set where an odd number of bits are combined, no bit set for even number of bits.

If it is not possible that this can happen then is there a proof for this?

XOR is equivalent to addition without carry on the individual bits. When you add bits without carry, no overflow can happen and so the int value can't go out of bounds.

schnaader
  • 49,103
  • 10
  • 104
  • 136
  • 2
    But do we have sufficient guarantees that `int` is binary-like? That the max int happens at a limit of `2^n-1` for some `n`? – Yakk - Adam Nevraumont Feb 06 '15 at 16:17
  • This is about bitwise XOR, so of course it's binary like. If it wasn't we'd have to define a non-bitwise xor (not binary). – assembly_wizard Feb 10 '15 at 22:02
  • @ThatWeirdo binarity of XOR was never disputed here, but the binarity of `int`s. What if they happen to be stored in 9-trit register of a ternary computer, and the maximum value is not one less than a power of two? – John Dvorak Feb 11 '15 at 00:33
  • @JanDvorak : Where did you see a ternary computer? Or, is your question just theoretical? – Bhaskar Feb 11 '15 at 10:05
  • @L16H7 I was mostly thinking about [3niti](http://ternary.me/wiki/index.php?n=Main.HomePage), but [there have been a few (two or three notable) hardware implementations](http://en.wikipedia.org/wiki/Ternary_computer) as well. – John Dvorak Feb 11 '15 at 10:44
  • 1
    @JanDvorak : What's the definition (or truth table) of XOR operator for ternary system? I think even C and C++ don't know. If ternary computer is designed, I think a new programming language have to be created from scratch. – Bhaskar Feb 12 '15 at 10:24
  • can unsigned xor unsigned --> negative? – Dee Jun 15 '21 at 16:13
38

The result can never be "too large" in the sense of its representation requiring more bits than int provides, since the operation is defined to combine bit values of its operands, not produce any new bits. Perhaps a better question might be, can the result be something other than a valid value representation of an int?

For unsigned integers, no. All bit patterns, and hence the result of all bitwise operations, are valid value representations.

For signed integers, it depends on the implementation-defined representation of negative values. Every implementation you're likely to encounter uses 2's-complement, in which again every bit pattern is valid; so again, the result of any bitwise operation will be a valid representation.

However, the standard also allows other representations, in which there may be one or more invalid bit patterns. In that case, it's possible for a bitwise operation, with two valid operands, to produce that pattern, and hence produce an invalid result.

Mike Seymour
  • 249,747
  • 28
  • 448
  • 644
  • Will this invalid result cause undefined behavior? – Ruslan Feb 04 '15 at 18:59
  • 1
    @Ruslan: For C++, I'd say yes; the standard doesn't say much about exactly which signed values are valid, and what happens in the case of invalid ones, so the behaviour doesn't seem to be defined. I can't speak for C. – Mike Seymour Feb 05 '15 at 11:42
28

(This post applies to C, not C++)

The bitwise operators cannot cause a trap representation due to setting invalid padding bits, see C11 6.2.6.2/1 footnote:

...no arithmetic operation on valid values can generate a trap representation...

(The meaning of "arithmetic operation" is unclear but the index links to 6.5.11 which is the definition of XOR).

However, in C they can cause a negative zero to be generated. In 2's complement there is no negative zero. But say you were on a system with 1's complement then you could generate negative zero via ^ and this might cause a trap representation. 6.2.6.2/3 explicitly says that this is possible:

If the implementation supports negative zeros, they shall be generated only by:

— the &, |, ^, ~, <<, and >> operators with operands that produce such a value;

Finally 6.2.6.2/2 implies (I'm pretty sure anyway) that it's not possible to have any combination of value bits that would represent an integer exceeding INT_MAX


To summarise, the possible results of ^ on two ints are:

  • Another valid int value (perhaps with different but non-trapping padding bits to other versions of the same value)
  • A negative zero, which may or may not cause a trap
Community
  • 1
  • 1
M.M
  • 138,810
  • 21
  • 208
  • 365
  • In C++ it doesn't actually say what happens when negative zero is generated, so far as I can see – M.M Feb 04 '15 at 12:22
  • 1
    nit: The C standard allows all bits zero but the sign bit to trap for 2's complement (that is, `INT_MIN` may be `-INT_MAX` and `INT_MAX ^ -1` undefined). While I know of examples of non-2's complement machines, I don't know if such a 2's complement implementation exists, though (maybe it's just in the standard because it usually isn't a burden for the programmer). – mafso Feb 04 '15 at 14:06
  • 2
    @mafso: I believe some two's-complement implementations define INT_MIN to be -32767 so as to avoid any obligation to handle some corner cases with things like printf, division, etc. For example, on a machine which only has a signed right-shift operation, evaluation of n/4 when n is negative might compute -(-n)>>2. If n=-32767, that would yield -8191, but if n=-32768 it would yield 8192. If INT_MIN is -32767, computation of (-32768)/4 would be Undefined Behavior, so having it yield 8192 would be perfectly legitimate. – supercat Feb 04 '15 at 18:37
  • 2
    @mafso: Additionally, while I don't know of any hardware that regards -INT_MAX-1 as either a trap or a NaN, there are certainly times when a NaN would be useful (being able to test after the fact if overflow occurred at any stage in a computation could likely be more efficient than having to have code check for and trap overflows at each stage, especially on systems that support out-of-order execution). I'm not sure such hardware could get past the chicken-and-egg problem, but I wouldn't mind having such a thing available. – supercat Feb 04 '15 at 18:44
  • 1
    The next words after your quote are "other than as part of an exceptional condition such as an overflow, and this cannot occur with unsigned types." - generating a value with the sign bit 1 and all other bits 0 (i.e. `-INT_MAX-1` on a twos-complement system), on a system which defines this as a trap representation, could definitely be regarded as an overflow. – Random832 Feb 05 '15 at 18:33
  • @Random832 "overflow" normally implies a value being out of range, and the bitwise operations don't work with values, although perhaps "overflow" could also describe this situation as you are suggesting. Maybe we need another question about what exactly "overflow" means. – M.M Feb 05 '15 at 22:29
21

Strictly speaking, you can't XOR two integers. You can XOR two integer-sized bags of bits, and you can treat those bags of bits as integers at other times. You can even treat them as integers at all other times.

But at the moment you perform the XOR operation, you're treating them as something quite different from integers, or even numbers, per se: they're just two sequences of bits, where corresponding bits get compared. The concept of overflow doesn't apply to that, and so if you then decide to treat the result as an integer, it cannot overflow either.

The Spooniest
  • 2,863
  • 14
  • 14
  • 5
    I can't really endorse this comment: the C Standard directly says that you can XOR two integers . – M.M Feb 05 '15 at 12:13
  • C lets you do something that looks like XOR'ing two integers, using the bitwise XOR operation as specified in section 6.5.12, but that operation still treats the operands like bags of bits. Some "logical operators" do exist that operate on the values of integers (specifically, AND, OR, and NOT; C renders these as &&, ||, and !). But there is no logical operator for XOR. – The Spooniest Feb 05 '15 at 15:52
  • I can treat integers as infinitely large bags of bits, XOR those and always end up with a valid integer... In fact, I can also do that for any binary fraction. Non-terminating fractions do cause 0.9999... problems, though. – John Dvorak Feb 11 '15 at 00:38
  • @JanDvorak: If you use some kind of variable-length encoding for integers, then yes, you can do this. You can also do it for fractions, or, in fact, for anything else, as long as you have some way to map them to bags of bits and vice versa. But the XOR operation still treats them as bags of bits, not as the variable-length numbers or fractions or whatever other structure you may have initially used. – The Spooniest Feb 11 '15 at 13:43
  • @TheSpooniest I'm not using a variable-length encoding. I'm using an infinite-length encoding, which there is a natural one for integers. As for a definition without using binary, there's the mex operation = minimal excluded value: f(x,y) is the least non-negative integer that is different from every f(x',y) for x' – John Dvorak Feb 11 '15 at 14:12
  • @JanDvorak: I assume that you're talking about writing down integers the way that we do in real life, but that's not infinite-length unless you write out every integer with an infinite number of leading zeroes: in normal usage it is variable-length. Either way, the XOR operation still treats the two integers as bags of bits. The mex operator that you mention is interesting, but it doesn't apply to integers in general: it only works properly on the Grundy numbers. – The Spooniest Feb 11 '15 at 14:27
11

Is it even possible that XOR will generate such a large integer value that cannot be stored in the int type?

If the operands are int, then no.

If it is not possible that this can happen then is there a proof for this?

Well, it's trivial from the definition. This is hardly a mathematically rigorous proof, but you could consider that a bit in the output of XOR will only be 1 if one of the operands has 1 in that position. Since an out of range bit cannot be 1 in the operands, there is not output bit with value 1 that is out of range.

eerorika
  • 232,697
  • 12
  • 197
  • 326
  • `Only if the operands are negative.` How? – Expert Novice Feb 04 '15 at 12:13
  • @ExpertNovice Because, (assuming 2's complement), the most significant bit of negative numbers is `1`. And it is `0` for positive numbers. If both operands are negative, then it follows that both of their most significant bit is `1`. `1 XOR 1` is `0`. Therefore the most significant bit of XOR of two negative numbers is `0` i.e. the integer is positive. Positive numbers are larger than negative. – eerorika Feb 04 '15 at 12:16
  • I take that back about only negative operands having larger result. XOR of positive numbers can be larger than operands as well. But not larger than fits in the number of bits of operands. I got the implication backwards. The result is always larger if both operands are negative, but they're not the only case of larger result. My bad. – eerorika Feb 04 '15 at 12:49
  • right, `1 ^ 4` is `5` which is larger than either — but the *highest bit set* doesn't increase. And the ability to represent `4` implies the ability to represent `5`, `6`, and `7` as well (C99 6.2.6.2) – hobbs Feb 05 '15 at 19:00
11

XOR, AND, OR, NOT and any other bitwise operators produce bitwise results, with the bits in the result combined from the bits at exactly the same position in the inputs. So n-bit inputs produce n-bit without any higher bit, so how can it goes off bounds?

phuclv
  • 37,963
  • 15
  • 156
  • 475
  • 5
    C and C++ don't require that every possible sequence of n bits represent a valid value. On modern machines, they typically will, but some wacky architectures might not. – cHao Feb 04 '15 at 17:22
  • @cHao that only appears with trap representation. On 1 or 2's complement or sign-magnitude all bit patterns are valid – phuclv Feb 04 '15 at 17:59
  • @LưuVĩnhPhúc: I think a two's-complement machine may legitimately define INT_MIN to be -INT_MAX, freeing itself of any obligation to ensure that arithmetic involving -INT_MIN-1 behaves sensibly. – supercat Feb 04 '15 at 18:47
10

No it cannot. Unlike others answers mine would be mathematical proof.

XOR is shortcut for exclusive or or exclusive disjunction () and can be defined as:

A ⊕ B = (A ∪ B)\(A ∩ B)

Your proposition is that

∃x: x ∉ A ∧ x ∉ B ∧ x ∈ (A ⊕ B)

So from first equation

x ∈ (A ∪ B)\(A ∩ B)

What can be expressed as

x ∈ (A ∪ B) ∧ x ∉ (A ∩ B)

The second part can be expressed as:

x ∉ A ∧ x ∉ B

The first part can be expressed as:

x ∈ A ∨ x ∈ B

What collide with our assumption that x ∉ A ∧ x ∉ B so proposition is false for any set A and B.

Q.E.D.

Hauleth
  • 22,873
  • 4
  • 61
  • 112
7

In a GENERAL CASE the described algorithm cannot really find a lonely integer in an array. What it really finds is XOR of all elements that occur odd number times there.

So, if there is just one 'lonely' element there, say an element 'a', and all the other elements occur EVEN number times in the array, then it works 'as required' -> it finds this lonely element 'a'.

Why?

The algorithm carries out XOR of all the elements in the array (a ^ b ^ c ^ d ^ ...)

The XOR operation has the following properties:

1) a ^ a = 0 (non-equivalence)

2) a ^ 0 = a (neutrality of 0)

3) a ^ b = b ^ a (commutative property)

4) (a ^ b) ^ c = a ^ (b ^ c) (associative property)

Let's assume, for instance, an array with elements: {a, b, c, a, c, b, a, c}

(element 'a' - 3 times, element 'b' - twice, element 'c' - 3 times)

Then, according to the above mentioned XOR properties, the algorithm result

R = (((((((a ^ b) ^ c) ^ a) ^ c) ^ b) ^ a) ^ c)

can be rearranged as follows:

R = (a ^ b) ^ (c ^ a) ^ (c ^ b) ^ (a ^ c) =

= (a ^ a) ^ (b ^ b) ^ (c ^ c) ^ (a ^ c) =

= 0 ^ 0 ^ 0 ^ (a ^ c) = (a ^ c)

i.e.,

a) ...all elements that occur an EVEN number times result in zero

b) ...all elements that occur an ODD number times are XOR-ed and create the final result

XOR is a bit-wise operation, so it never can overflow, of course.

Eric Best
  • 171
  • 5
3

Suppose

int xor  = x^y;
Max value of int is x = 999999999;
Max value of Xor will come if y=0;
and Max Xor is 999999999;

which is in limit. :)

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Arun Tyagi
  • 2,206
  • 5
  • 24
  • 37
  • Use the backticks for inline code. For example **Max value of \`int\` is \`x = 999999999;\`** becomes **Max value of `int` is `x = 999999999;`** – Pluto Feb 04 '15 at 16:47
  • 5
    This doesn't prove anything. – indiv Feb 05 '15 at 17:31
  • 2
    999,999,999 xor 73,741,824 is 1,073,741,823. However, the max value of int is unlikely to be 999,999,999. – Random832 Feb 05 '15 at 22:21
2

Is it even possible that XOR will generate such a large integer value that cannot be stored in the int type?

Data-Type3 = Data-Type1 operator Data-Type2

If it is not possible that this can happen then is there a proof for this?

We've Data-Type3 in case of Integers is the one out of Data-Type1 and Data-Type2 that has a bigger size, even in case of addition or multiplication.

SIZE(Data-Type3) = MAX(SIZE(Data-Type1), SIZE(Data-Type2))

So if Data-Type1 = Data-Type2 then that's the return-type too.

Short + Short   = Short
Short + Integer = Integer
Short + Long    = Long

Integer + Short   = Integer
Integer + Integer = Integer
Integer + Long    = Long

Long + Short   = Long
Long + Integer = Long
Long + Long    = Long

What can happen is an overflow, which can occur when an operation have a carry. In 2's complement, it's when the carry into the high order column doesn't equal the carry out of the high order column. read more

But XOR operation cannot overflow, because XOR operation does not generate a carry, since XOR is a bit-wise operation like NOT.

Khaled.K
  • 5,828
  • 1
  • 33
  • 51