Need help understanding binary conversion in Python

Question

Or I guess binary in general. I'm obviously quite new to coding, so I'll appreciate any help here.

I just started learning about converting numbers into binary, specifically two's complement. The course presented the following code for converting:

num = 19

if num < 0:
    isNeg = True
    num = abs(num)
else:
    isNeg = False
result = ''
if num == 0:
    result = '0'
while num > 0:
    result = str(num % 2) + result
    num = num // 2
if isNeg:
    result = '-' + result

This raised a couple of questions with me and after doing some research (mostly here on Stack Overflow), I found myself more confused than I was before. Hoping somebody can break things down a bit more for me. Here are some of those questions:

I thought it was outright wrong that the code suggested just appending a - to the front of a binary number to show its negative counterpart. It looks like bin() does the same thing, but don't you have to flip the bits and add a 1 or something? Is there a reason for this other than making it easy to comprehend/read?
Was reading here and one of the answers in particular said that Python doesn't really work in two's complement, but something else that mimics it. The disconnect here for me is that Python shows me one thing but is storing the numbers a different way. Again, is this just for ease of use? Is bin() using two's complement or Python's method?
Follow-up to that one, how does the 'sign-magnitude' format mentioned in the above answer differ from two's complement?
The Professor doesn't talk at all about 8-bit, 16-bit, 64-bit, etc., which I saw a lot of while reading up on this. Where does this distinction come from, and does Python use one? Or are those designations specific to the program that I might be coding?
A lot of these posts I've only reference how Python stores integers. Is that suggesting that it stores floats a different way, or are they just speaking broadly?

As I wrote this all up, I sort of realized that maybe I'm diving into the deep end before learning how to swim, but I'm curious like that and like to have a deeper understanding of stuff before moving on.

`bin()` does not return binary bits, it returns a string representation of a binary number. — user3483203, Mar 10 '18 at 00:14
Python does not use fixed-size integers, (i.e. 32-bit or 64-bit). And, as `chrisz` said, `bin` returns a string representing the integer in the *binary-numeral system*. This is distinct from the way the actual underlying bits that python uses to represent the integer. And yes, anyway, Python `int` objects do not use two's complement. — juanpa.arrivillaga, Mar 10 '18 at 00:24
See https://en.wikipedia.org/wiki/64-bit_computing for an idea on n-bit origin. I remember reading some old Intel CPU datasheets that had 4-bit. Usually available CPU registers bit width, or width of memory addresses when specified by a register, or data bus widths... — progmatico, Mar 10 '18 at 00:29
If you're wondering how ints are actually stored (in recent versions of CPython), see the source, specifically [`longintrep.h`](https://github.com/python/cpython/blob/master/Include/longintrepr.h), and maybe look at [`superhackyinternals`](https://github.com/abarnert/superhackyinternals) to play with the bits from the Python side. But the short version is that (IIRC) it's a list of as many 30-bit "digits" as possible, with the sign on the highest digit. — abarnert, Mar 10 '18 at 01:48
Meanwhile, Jython or IronPython might use something totally different. A wrapper around a native JVM or .NET bigint type might make sense. Or maybe it's using 30-bit "digits" but they're stored in a rope that's usually a contiguous array for huge numbers but not always. As long as it can provide the same interface and roughly the same performance guarantees as CPython's implementation, it's valid. — abarnert, Mar 10 '18 at 01:54
@abarnert: Neat. Thank you for that info on how to look this stuff. — dylosaur, Mar 10 '18 at 02:33
If you've read how `int` works and played around with `superhackyinternals` and you're curious how `float` is different, you might want to look at https://stupidpythonideas.blogspot.com/2015/01/ieee-floats-and-python.html and https://github.com/abarnert/floatextras next. (Assuming you're on a Python implementation that uses IEEE double for floats, but you almost certainly are.) — abarnert, Mar 11 '18 at 19:28

Blender · Accepted Answer · 2018-03-11T23:29:57.657

I thought it was outright wrong that the code suggested just appending a - to the front of a binary number to show its negative counterpart. It looks like bin() does the same thing, but don't you have to flip the bits and add a 1 or something? Is there a reason for this other than making it easy to comprehend/read?

You have to somehow designate the number being negative. You can add another symbol (-), add a sign bit at the very beginning, use ones'-complement, use two's-complement, or some other completely made-up scheme that works. Both the ones'- and two's-complement representation of a number require a fixed number of bits, which doesn't exist for Python integers:

>>> 2**1000
1071508607186267320948425049060001810561404811705533607443750
3883703510511249361224931983788156958581275946729175531468251
8714528569231404359845775746985748039345677748242309854210746
0506237114187795418215304647498358194126739876755916554394607
7062914571196477686542167660429831652624386837205668069376

The natural solution is to just prepend a minus sign. You can similarly write your own version of bin() that requires you to specify the number of bits and return the two's-complement representation of the number.

Was reading here and one of the answers in particular said that Python doesn't really work in two's complement, but something else that mimics it. The disconnect here for me is that Python shows me one thing but is storing the numbers a different way. Again, is this just for ease of use? Is bin() using two's complement or Python's method?

Python is a high-level language, so you don't really know (or care) how your particular Python runtime interally stores integers. Whether you use CPython, Jython, PyPy, IronPython, or something else, the language specification only defines how they should behave, not how they should be represented in memory. bin() just takes a number and prints it out using binary digits, the same way you'd convert 123 into base-2.

Follow-up to that one, how does the 'sign-magnitude' format mentioned in the above answer differ from two's complement?

Sign-magnitude usually encodes a number n as 0bXYYYYYY..., where X is the sign bit and YY... are the binary digits of the non-negative magnitude. Arithmetic with numbers encoded as two's-complement is more elegant due to the representation, while sign-magnitude encoding requires special handling for operations on numbers of opposite signs.

The Professor doesn't talk at all about 8-bit, 16-bit, 64-bit, etc., which I saw a lot of while reading up on this. Where does this distinction come from, and does Python use one? Or are those designations specific to the program that I might be coding?

No, Python doesn't define a maximum size for its integers because it's not that low-level. 2**1000000 computes fine, as will 2**10000000 if you have enough memory. n-bit numbers arise when your hardware makes it more beneficial to make your numbers a certain size. For example, processors have instructions that quickly work with 32-bit numbers but not with 87-bit numbers.

A lot of these posts I've only reference how Python stores integers. Is that suggesting that it stores floats a different way, or are they just speaking broadly?

It depends on what your Python runtime uses. Usually floating point numbers are like C doubles, but that's not required.

Thanks for all the info. Couple more questions: 1. So is sign-magnitude the same as one's complement? Or is it just a similar idea? 2. Can you elaborate a bit more, or do you have a link that speaks on what you said about floating point numbers? 3. Does a lot of this boil down to not knowing how the sausage is made, just eating it? I'm not going to know the internals here, so I should just focus on how it outputs? — dylosaur, Mar 10 '18 at 01:41
@dylosaur: It's similar, look at the Wikipedia page for [signed number representations](https://en.wikipedia.org/wiki/Signed_number_representations#Ones'_complement), specifically the end of that section. You can definitely read [how CPython implements](https://github.com/python/cpython/blob/ba85d69a3e3610bdd05f0dd372cf4ebca178c7fb/Include/longintrepr.h#L85-L88) all of these numerical data types if you're interested, but the way they're implemented (which can change at any time) doesn't really change how your Python script is executed. — Blender, Mar 10 '18 at 02:04
It's worth noting that Python doesn't just not define a _particular_ size for integers, it doesn't have _any_ maximum size at all. You can use `int` for numbers big enough for crypto, or even bigger than that; it'll take more memory, and get slower, but you'll never have to worry about wraparound, or `OverflowError` (unless you explicitly do something like (`int(float('inf'))`), etc. — abarnert, Mar 11 '18 at 19:24

score 1 · Answer 2 · answered Mar 10 '18 at 00:24

don't you have to flip the bits and add a 1 or something?

Yes, for two complement notation you invert all bits and add one to get the negative counterpart.

Is bin() using two's complement or Python's method?

Two's complement is a practical way to represent negative number in electronics that can have only 0 and 1. Internally the microprocessor uses two's complement for negative numbers and all modern microprocessors do. For more info, see your textbook on computer architecture.

how does the 'sign-magnitude' format mentioned in the above answer differ from two's complement?

You should look what this code does and why it is there:

while num > 0:
    result = str(num % 2) + result
    num = num // 2

Need help understanding binary conversion in Python

2 Answers2