Numpy `.astype` rounding up

Question

This is similar to "Numpy astype rounding to wrong value", but that question describes the opposite issue: the behavior complained about there (truncating) is what I actually want. In my real-world case I'm doing various calculations where some values can get very close to the next whole number before being converted to integers. I want the numbers to be truncated, and I expect that to be equivalent to the floor operation, since I end up using the results as indexes. However, the values seem to be rounded up when I call `.astype(np.int32)`. What's going on here:

In [2]: import numpy as np

...

In [49]: np.array([4319.9997], dtype=np.float32).astype(np.int32)
Out[49]: array([4319], dtype=int32)

In [50]: np.array([4319.9998], dtype=np.float32).astype(np.int32)
Out[50]: array([4320], dtype=int32)

I understand 32-bit floating versus 64-bit floating precision, but I don't understand the internal operations of what astype is doing here.

Look at the `float32` value (without the `int` step). As I step the last digit I get `array(4319.999, dtype=float32)`, then `.9995` and `4320.` — hpaulj, Sep 01 '23 at 01:07
A web search tells me "A 32-bit float has about 7 digits of precision and a 64-bit double has about 16 digits of precision." Your 7/8 is in the 8th digit. — hpaulj, Sep 01 '23 at 01:16

djhoese · Answer 1 · 2023-09-01T14:38:16.647

Repeating what was said in the comments. The 32-bit version of "4319.9997" is actually closer to "4319.9995". When numpy/Python/C tries to convert "4319.9998" from a 64-bit float to a 32-bit float the only two options are either "4319.9995" or "4320.0" and "4320.0" is closer so it rounds up. I can't say this is exactly how this is happening, but it makes some sense to me.
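To make the "only two options" point concrete, here is a small sketch of how I'd check it. It uses `np.nextafter` to find the float32 just below 4320.0 and computes the midpoint between the two neighbors in float64. The variable names are just mine for illustration.

```python
import numpy as np

upper = np.float32(4320.0)
# Largest float32 strictly below 4320.0 (step toward zero).
lower = np.nextafter(upper, np.float32(0.0))
# Midpoint of the two neighbors, computed in 64-bit Python floats.
midpoint = (float(lower) + float(upper)) / 2

print(float(lower), midpoint, float(upper))

for text in ("4319.9997", "4319.9998"):
    as_double = float(text)            # the literal is parsed as a double first
    as_single = np.float32(as_double)  # then rounded to the nearest float32
    side = "below" if as_double < midpoint else "above"
    print(text, "is", side, "the midpoint ->", float(as_single), "->", int(as_single))
```

Anything below the midpoint lands on the lower neighbor and anything above it lands on 4320.0, which is why the two literals end up as different integers.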

Original Answer:

I can't say I fully understand what I'm about to answer, but some of this makes sense. I think it comes down to how Python (or C or something else) converts the string literal to a 32-bit float.

I took the binary printing function from:

https://stackoverflow.com/a/16444778/433202

import struct
def binary(num):
    return ''.join('{:0>8b}'.format(c) for c in struct.pack('!f', num))

And used it to print out the numbers as 32-bit IEEE floats:

In [62]: binary(4319.9997)
Out[62]: '01000101100001101111111111111111'

In [63]: binary(4319.9998)
Out[63]: '01000101100001110000000000000000'

So that's 0 for the sign, 10001011 for the exponent portion, and 000011011111... for the fractional significand portion.
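As a rough sketch of that breakdown, the fields can be pulled apart with bit masks and turned back into a value. The helper names `float32_fields` and `float32_value` are hypothetical ones I made up here, and the reconstruction formula only covers normal numbers (not zeros, subnormals, infinities or NaN).

```python
import struct

def float32_fields(num):
    # Pack as a big-endian IEEE 754 single, then reread the 4 bytes as an unsigned int.
    (bits,) = struct.unpack('!I', struct.pack('!f', num))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 bits
    return sign, exponent, fraction

def float32_value(sign, exponent, fraction):
    # Normal numbers only: implicit leading 1 plus the stored fraction.
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)

for num in (4319.9997, 4319.9998):
    s, e, f = float32_fields(num)
    print(s, format(e, '08b'), format(f, '023b'), float32_value(s, e, f))
```

The two fraction fields differ by exactly one in the last place, so these are adjacent float32 values.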

So when those string literals I'm entering get converted to a 32-bit float, the value crosses some threshold and the fractional portion gets a 1 added to it, which rolls all the bits up into the whole number portion of the significand.

The big misconception/misunderstanding of this whole thing for me is that when I was told that casting a float to int would "truncate" the fractional portion of the number I assumed it was doing this in base-10, but it (C?) is actually doing it in base-2. This makes obvious sense, but I had never thought about it until this issue.
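Here is a minimal sketch of what that means in practice, assuming the values can be kept in float64 until the integer conversion (which may or may not be an option in a real pipeline). It also shows that truncation goes toward zero, so it only matches floor for non-negative values.

```python
import numpy as np

values = [4319.9997, 4319.9998]
as_f32 = np.array(values, dtype=np.float32)
as_f64 = np.array(values, dtype=np.float64)

# The cast truncates whatever value is actually stored in the array.
print(as_f32.astype(np.int32))  # second element was already 4320.0 as float32
print(as_f64.astype(np.int32))  # float64 still holds both values below 4320

# Truncation is toward zero, which differs from floor for negative inputs.
neg = np.array([-1.5])
print(neg.astype(np.int32), np.floor(neg).astype(np.int32))
```

So the "rounding up" happens when the float32 array is created, not in the `.astype(np.int32)` call itself.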

The part I still don't understand is why the conversion from the float string literal that I type "4319.9998" gets bumped to the next number (+1). Why not accept the precision issue and keep it as the same value as "4319.9997"? I made a 64-bit (double) version of the binary function and when I print out these two versions of the numbers:

In [91]: binary64(4319.9997)
Out[91]: '0100000010110000110111111111111111101100010101101101010111010000'

In [92]: binary64(4319.9998)
Out[92]: '0100000010110000110111111111111111110010111001001000111010001010'

If you count the bits and separate things out for the 64-bit floating point representation, both values have many 1s after the "whole number" portion of the significand (after shifting the binary point over by the exponent's number of digits), so I'm not sure why one would be rounded up and the other not.
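For reference, the 64-bit helper isn't shown above, so here is a sketch of what I assume `binary64` looks like (same as `binary()` but packing with `'!d'`). The hypothetical `split_for_float32` helper then splits the 52 fraction bits into the 23 that a float32 can keep and the 29 it has to drop.

```python
import struct

def binary64(num):
    # Same idea as binary() above, but packed as a big-endian 64-bit double.
    return ''.join('{:0>8b}'.format(c) for c in struct.pack('!d', num))

def split_for_float32(num):
    bits = binary64(num)
    fraction = bits[12:]                  # skip 1 sign bit and 11 exponent bits
    return fraction[:23], fraction[23:]   # bits float32 keeps, bits it drops

for num in (4319.9997, 4319.9998):
    kept, dropped = split_for_float32(num)
    print(kept, dropped)
```

Both values share the same 23 kept bits. With round-to-nearest, what matters is the dropped part: if it starts with 0 it is less than half a step and the value rounds down, and if it starts with 1 followed by other nonzero bits it is more than half a step and the value rounds up to the next float32.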

"Why not accept the precision issue and keep it as the same value as "4319.9997"?" - there's no link between "accept the precision issue" and "keep it as the same value as "4319.9997"". It's not like what NumPy is doing is *rejecting* the precision issue. NumPy is rounding each float to the closest float32, and the closest float32 happens to be different for these values. — user2357112, Sep 01 '23 at 14:28
Ah! Just checked and "4319.9997" is actually "4319.9995" (or very close to it) so indeed "4320.0" is closer to "4319.9998" than "4319.9995" would have been. That type of rounding makes sense to me. — djhoese, Sep 01 '23 at 14:35
