I would like to convert a NumPy (version 1.11.0) array from float64 to int64. The conversion should succeed when every value is a whole number and fail when any value has a fractional part.

My understanding was that I could use casting='safe', though clearly my understanding was wrong...

I would hope that the following would work:

np.array([1.0]).astype(int, casting='safe')

And that this would fail:

np.array([1.1]).astype(int, casting='safe')

However they both fail with this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-51-7261ddf00794> in <module>()
    1 import numpy as np
    2 print(np.__version__)
----> 3 np.array([1.0]).astype(int, casting='safe')

TypeError: Cannot cast array from dtype('float64') to dtype('int64') according to the rule 'safe'

I'm guessing I have a fundamental misunderstanding of what safe casting means, so perhaps it is not the best way to achieve this. Is there a better way to make the first example work and the second fail?

Jason
johnchase
    I believe `safe` is for casts in which no information is lost. It isn't an operation that cares about the value, only the type. In other words `float` -> `int` is not a safe transformation for the entire domain of `float`. The reverse should be safe though. – ebolyen Apr 08 '16 at 21:50

2 Answers

I don't know of a way to do this directly in NumPy. The fastest way I could find to convert a float dtype array of whole numbers into an int dtype array is:

import numpy as np

def float_to_int(array):
    int_array = array.astype(int, casting='unsafe', copy=True)
    if not np.equal(array, int_array).all():
        raise TypeError("Cannot safely convert float array to int dtype. "
                        "Array must only contain whole numbers.")
    return int_array

Testing for correctness:

In [3]: whole_numbers = np.arange(1000000, dtype=float)

In [4]: fractional = np.arange(0, 100000, 0.1, dtype=float)

In [5]: float_to_int(whole_numbers)
Out[5]: array([     0,      1,      2, ..., 999997, 999998, 999999])

In [6]: float_to_int(fractional)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-0a7807a592b7> in <module>()
----> 1 float_to_int(fractional)

<ipython-input-2-953668ae0922> in float_to_int(array)
      2         int_array = array.astype(int, casting='unsafe', copy=True)
      3         if not np.equal(array, int_array).all():
----> 4                 raise TypeError("Cannot safely convert float array to int dtype. "
      5                                 "Array must only contain whole numbers.")
      6         return int_array

TypeError: Cannot safely convert float array to int dtype. Array must only contain whole numbers.

Based on https://stackoverflow.com/a/35042794/3776794 and https://stackoverflow.com/a/7236784/3776794 I also tried an implementation using np.modf, but it is slower than the solution above:

def float_to_int_mod(array):
    mod_array, int_array = np.modf(array)
    if not np.equal(mod_array, 0).all():
        raise TypeError("Cannot safely convert float array to int dtype. "
                        "Array must only contain whole numbers.")
    return int_array.astype(int, casting='unsafe', copy=True)
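
For readers unfamiliar with np.modf, here is a small illustration of what it returns. The input values are made up for the example; the point is that both returned arrays are still float arrays, which is why the function above needs a final astype call.

```python
import numpy as np

# np.modf splits each element into its fractional and integral parts.
# Both parts carry the sign of the input and both are float arrays.
fractional_part, integral_part = np.modf(np.array([1.5, -2.25, 3.0]))

print(fractional_part)       # values 0.5, -0.25, 0.0
print(integral_part)         # values 1.0, -2.0, 3.0
print(integral_part.dtype)   # float64
```

An array holds only whole numbers exactly when every entry of the fractional part is zero.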

Runtime comparison:

In [8]: %timeit float_to_int(whole_numbers)
100 loops, best of 3: 2.75 ms per loop

In [9]: %timeit float_to_int_mod(whole_numbers)
100 loops, best of 3: 8.74 ms per loop

This is tested with Python 3 and numpy 1.11.0.
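
One thing neither function checks explicitly is NaN, infinity, or whole numbers too large for int64. As far as I know, the result of casting such values with casting='unsafe' is platform-dependent, so relying on the comparison after the cast may not be enough everywhere. Below is a minimal sketch of a stricter variant, assuming it is acceptable to do the checks before the cast. The name float_to_int_checked and the choice of ValueError are my own for illustration, and I have not benchmarked it against the versions above.

```python
import numpy as np

def float_to_int_checked(array, dtype=np.int64):
    """Cast floats to an integer dtype, rejecting values that would change.

    Hypothetical helper for illustration; not part of NumPy.
    """
    array = np.asarray(array, dtype=np.float64)
    info = np.iinfo(dtype)

    # NaN and +/-inf have no integer equivalent.
    if not np.isfinite(array).all():
        raise ValueError("Array contains NaN or infinity.")

    # A whole number is unchanged by truncation.
    if not np.equal(array, np.trunc(array)).all():
        raise ValueError("Array must only contain whole numbers.")

    # float(info.max) can round up to the next power of two (it does for
    # int64), so anything at or above float(info.max) + 1 is out of range.
    lower = float(info.min)
    upper = float(info.max) + 1.0
    if ((array < lower) | (array >= upper)).any():
        raise ValueError("Array contains values outside the integer range.")

    return array.astype(dtype)
```

For example, float_to_int_checked(np.array([1.0, 2.0])) returns an int64 array, while inputs containing 1.1, np.nan, or 1e30 raise ValueError.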

jairideout

You're almost there:

In [7]: a = np.array([1.1, 2.2, 3.3])

In [8]: a.astype(int) == a
Out[8]: array([False, False, False], dtype=bool)

In [9]: if (a.astype(int) != a).any():
   ...:     raise ValueError("Conversion failed")
   ...: 
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-3628e8ae2ce8> in <module>()
      1 if (a.astype(int) != a).any():
----> 2     raise ValueError("Conversion failed")
      3 

ValueError: Conversion failed

In [10]: b = np.array([1.0, 2.0, 3.0])

In [11]: b.astype(int) == b
Out[11]: array([ True,  True,  True], dtype=bool)

"Safe" vs "unsafe" casting refers only to dtypes, not values. So yes, a conversion from float64 to int64 is "unsafe" because some float64 values (1.1, for example) cannot be cast to an integer without loss.
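
A quick way to see that the rule depends only on the dtypes is np.can_cast, which answers the question without any data at all. This is just a small check, not a way to do the value-based conversion:

```python
import numpy as np

# The casting rule looks only at the dtypes, never at the stored values.
float_to_int_safe = np.can_cast(np.float64, np.int64, casting='safe')
int_to_float_safe = np.can_cast(np.int32, np.float64, casting='safe')
float_to_int_unsafe = np.can_cast(np.float64, np.int64, casting='unsafe')

print(float_to_int_safe)    # False
print(int_to_float_safe)    # True
print(float_to_int_unsafe)  # True
```

Since astype applies the same rule, astype(int, casting='safe') on a float64 array is rejected even when every element happens to be a whole number.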

ev-br