Understanding binary addition in python using bit manipulation

Question

I am looking at solutions for this question:

Given two integers a and b, return the sum of the two integers without using the operators + and -. (Input Limits: -1000 <= a, b <= 1000)

In all these solutions, I am struggling to understand why the solutions do ~(a ^ mask) when a exceeds 32-bit number max 0x7fffffff when evaluating a + b [see code below].

def getSum(self, a: int, b: int) -> int:
    
    # 32bit mask
    mask = 0xFFFFFFFF # 8Fs = all 1s for 32 bits

    while True: 
        # Handle addition and carry 
        a, b = (a ^ b) & mask, ((a & b) << 1) & mask
        if b == 0:
            break

    max_int = 0x7FFFFFFF

    print("A:", a)
    print("Bin A:", bin(a))
    print("Bin M:", bin(mask))
    print("  A^M:", bin(a ^ mask))
    print("~ A^M:", bin(~(a ^ mask)))
    print("  ~ A:", bin(~a))

    return a if a < max_int else ~(a ^ mask)

I don't get why we need to mask a again when returning answer?

When exiting the loop it was already masked: a = (a ^ b) & mask. So why can't we just do ~a if the 32nd bit is set to 1 for a?

I looked at The meaning of Bit-wise NOT in Python, to understand ~ operation, but did not get it.

Output for a = -12, b = -8. Correctly returns -20:

A: 4294967276
Bin A: 0b11111111111111111111111111101100
Bin M: 0b11111111111111111111111111111111
  A^M: 0b10011
~ A^M: -0b10100
  ~ A: -0b11111111111111111111111111101101

@TimRoberts do you say `a` is an unsigned value because we do mask at each iteration which drops the sign bit? Since it will be one of the infinite 1s? — Parth, Feb 18 '23 at 20:44
I know there are some other options, but I want to use this question as a vehicle to understand bitwise operations and how the adder works a little better. — Parth, Feb 19 '23 at 00:45
`a` is technically not unsigned. All integers in Python are signed, but it is a positive value because you never do anything to make it negative. If you are working in a language with 32-bit integers, then setting bit 31 makes it negative. That doesn't happen in Python. I can add as many bits as I want, and it will stay a positive number. We can't SEE the sign bit, we can't SET the sign bit. — Tim Roberts, Feb 19 '23 at 04:54
So how does python know the difference between -2 and 2? @TimRoberts A 2s complement can overlap with a positive number, right? Without some sign bit, how does any know that a number is a 2s complement and needs to be made negative and not just a positive number? We can have overflows in both negative and positive direction right? — Parth, Feb 19 '23 at 04:56
"All integers in Python are signed, but it is a positive value because you never do anything to make it negative. " I do not understand this part :/ — Parth, Feb 19 '23 at 04:58
I don't actually know the internal representation of Python's integers. I assume they are stored as a variable-length byte array with the positive value and a separate sign bit, so "2" and "-2" would have the same mantissa with a different sign bit. And no, you cannot have integer overflows in Python. The value just gets bigger and bigger until you run out of memory, but adding two positive numbers will always have a positive result. That's not true with fixed-length registers. — Tim Roberts, Feb 19 '23 at 05:08
"I do not understand this part" -- ANDs, ORs, and XORs do not change the sign bit. Unless you multiply (or divide) by -1 or do subtraction from a smaller value, the value will remain positive. Arbitrary precision integers are tricky. — Tim Roberts, Feb 19 '23 at 05:10
@TimRoberts *"ANDs, ORs, and XORs do not change the sign bit"* - What do you mean? Even this code uses AND to change negative numbers to non-negative ones. — Kelly Bundy, Feb 19 '23 at 08:48
As I see it, `~(a ^ mask)` gives us a digit that has the value of `a`, followed by all MSBs after the 32nd bit set to 1. So I still don't get how Python interprets that (`0b1111111111101100` with infinite leading 1s) as `-0b10100`? — Parth, Feb 19 '23 at 15:51
Look at your example. `a` is `11111111111111111111111111101100`. That's a positive integer that happens to require 32 bits. `a ^ mask` inverts those 32 bits, giving us `10011`. The NOT operator inverts an "infinite" number of bits, giving us `11111...11101100`. That's the original value with 1s extended forever. THAT is now a negative number, which can also be represented as `-10100`. — Tim Roberts, Feb 19 '23 at 18:57
Okay 2 questions: 1. Why is `11111...11101100` a negative number and same as `-10100`. It does not look likes 2s complement of 20 so what representation is this? 2. What if `11111111111111111111111111101100` was actually a positive number that overflowed (because we had big positive a and b)? Then converting it to negative would not be correct right? How can we be sure that if a number overflowed and took 32 bits then it is a negative number? — Parth, Feb 19 '23 at 19:41
I think question 2 is answered by the limits on `a` and `b`. Both of them are between [-1000, 1000]. So there should be no overflow from adding 2 positive numbers. The only overflow will come from the leading 1s of a negative number. Question 1 still stands. I am confused about this representation of -20. — Parth, Feb 20 '23 at 04:40
No, you're confused by the PRESENTATION. -20 is represented as 1111111...101100. It can be printed in many different ways: decimal, hex, octal, binary (-20, -14, -24, -10100). — Tim Roberts, Feb 20 '23 at 04:54

relent95 · Answer 1 · 2023-02-19T04:03:53.977

2

You forgot to specify constraints of the original problem, that the a and b are in [-1000, 1000]. That code is a port of a C or Java implementation, which assumes a and b are 4 byte signed integers. And I'm not sure you understand the (a ^ b) and (a & b) << 1 code correctly. The former adds i-th bit of the a and b ignoring a carry bit for every bit. The latter gets all that ignored carry bits.

To the main point, the last operation ~(a ^ mask) is for dealing with a negative integer. The a ^ mask inverts the lower 32 bits of the a and preserves the other upper bits. So, the bitwise NOT of the result, ~(a ^ mask) preserves the lower 32 bits of the a and inverts the other upper bits. Because the upper bits(other than the lower 32 bits) of the a are all zero, the ~(a ^ mask) is just setting all the upper bits to one. It's equivalent to a | ~mask, which is much more readable.

See the following example.

print(~(0xffffffff ^ 0xffffffff)) # This will output -1.
print(~(0xfffffffe ^ 0xffffffff)) # This will output -2.
print(~0xffffffff) # This will output -4294967296(-0x100000000).

edited Feb 19 '23 at 04:03

answered Feb 19 '23 at 02:59

relent95

3,703
1
14
17

But a was already 32 bits long (we do a & mask). So `^` is just flipping the bits right? Also as a whole is it calculating 2's compliment? – Parth Feb 19 '23 at 03:24
also, why is `a | ~mask` more readable? – Parth Feb 19 '23 at 03:26
Yes, you are right for the first question, since the upper bits(other than the lower 32 bits) of ```a ^ mask``` are all zero. And for the second question, by "as a whole", you mean the ```~(a ^ mask)```? Then no it's not calculating a two's complement. It converts a positive Python integer representing a negative integer as a two's complement(32bit signed integer) into a negative Python integer. And finally, the ```a | ~mask``` is more readable if you think all the variables represented as lists of bits. – relent95 Feb 19 '23 at 04:29
So do you mean `a ^ mask` gives a 2s complement? And then `~(a^ mask)`, converts 2s complement to signed number? How/Why exactly does that work? Qualitatively I kind of get that ~ will flip all remaining 0s after the first 32 bits to 1 (and give us back the first 32 bits since we XORed them before). But does that also flip the sign bit so Python now knows this is a negative number? – Parth Feb 19 '23 at 04:51
If possible could you show me an example of writing out 1s (say for 8bits) of how this work? Or is there any place I can read this with examples that write out numbers in 1s and 0s and show the working? I still do not get why `a | ~mask` is more readable (someone else mentioned this in a comment on leet code too) – Parth Feb 19 '23 at 04:54
`-0b10100` and `0b1111111111101100` are the SAME THING. If you had 32-bit registers, they would have the exact same bit pattern. Both represent the value -24. You are confusing yourself by viewing that negative number in binary. That representation would never be found in a computer register. We use `~(a ^ mask)` construct to convert a 32-bit pattern to its two's complement integer value. – Tim Roberts Feb 19 '23 at 05:01
As I see it, `~(a ^ mask)` gives us a digit that has the value of `a`, followed by all MSBs after the 32nd bit set to 1. So I still don't get how Python interprets that (`0b1111111111101100` with infinite leading 1s) as `-0b10100`? – Parth Feb 19 '23 at 15:52
Python(CPython) manipulates the sign and absolute value of a multi-precision integer. The ```~``` operation changes the sign. See [this](https://github.com/python/cpython/blob/v3.11.2/Objects/longobject.c#L4659) for detail. – relent95 Feb 20 '23 at 00:59
@relent95 I took a look at the code but didn't fully get it. Regardless I understand your example better now. I see that it preserves the 2s complement which is why `0xfffffffe` becomes -2. But can you tell me why originally `-20` gets printed as `0b11111111111111111111111111101100`? Should it not have been `0b111111111111111111111111111111100` [correct 2s complement for -20]? – Parth Feb 20 '23 at 04:46
Try ```bin(-20 & mask)```, which will be '0b11111111111111111111111111101100'. And it's nonsense to say two's complement of a negative integer. Also, two's complement of the 32 bit integer 20 is 0b11111111111111111111111111101100. It seems you are misunderstanding a two's complement. See [this](https://en.wikipedia.org/wiki/Two%27s_complement#Theory). – relent95 Feb 20 '23 at 05:13
Oh yes! Thanks for pointing that out, I was calculating 2s complement incorrectly. So basically the `~` tells the compiler that this is not an unsigned positive number but a signed negative number and its 2s complement is stored in `a`? – Parth Feb 20 '23 at 15:57
I have added a new section above in my question explaining my understanding a bit more. Let me know if it sounds correct to you. @relent95 – Parth Feb 20 '23 at 16:03
@Parth, yes, now you seem to understand correctly. A minor one to point out is that Python interpreter is not a compiler. Also instead of adding your understanding to the question, you can answer your own question. – relent95 Feb 21 '23 at 00:43

score 0 · Answer 2 · answered Feb 21 '23 at 15:25

Here's a helpful link - Differentiating a 2's complement negative number and the corresponding positive number.

Because of the bounds, we will never have a positive number overflow. (numbers are only between [-1000, 1000]). So we know that when the 32nd bit was set it was because it was a negative number.

Note that if our range allowed positive number addition to overflow too, then there would be no way to distinguish between positive number setting 32nd bit and negative number setting it.

So we do ~(a ^ mask) to retain bits of a till the first 32 bits and add infinite leading 1s to it so Python knows this is a negative number.

If you look at Bin A, it's already the 2s complement representation of -20. But right now Python treats it as a positive number. So we need to tell Python to treat it as a negative number using ~(a ^ mask).

Understanding binary addition in python using bit manipulation

2 Answers2