What is the explanation for this seemingly inconsistent floating-point behavior of pandas.to_numeric?

In [116]: pd.to_numeric([100.018], errors='ignore', downcast='float')
Out[116]: array([100.018], dtype=float32)

In [117]: pd.DataFrame([100.018]).apply(pd.to_numeric, errors='ignore', downcast='float')
Out[117]:
            0
0  100.017998

In [118]: pd.DataFrame([100.018], dtype=np.float64).apply(pd.to_numeric, errors='ignore', downcast='float').dtypes
Out[118]:
0    float32
dtype: object

It seems to me that the downcast does not behave as documented, since 100.018 can be cast to np.float32. The docs say:

If not None, and if the data has been successfully cast to a numerical dtype (or if the data was numeric to begin with), downcast that resulting data to the smallest numerical dtype possible according to the following rules:

  • 'integer' or 'signed': smallest signed int dtype (min.: np.int8)
  • 'unsigned': smallest unsigned int dtype (min.: np.uint8)
  • 'float': smallest float dtype (min.: np.float32)

In [119]: import pandas as pd

In [120]: pd.__version__
Out[120]: '0.23.4'
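One way to probe the two displays above is to look at the float32 value directly, without pandas. The following is a minimal sketch using only NumPy; the variable names are illustrative, and it assumes the difference comes from how each library formats the same float32 rather than from the downcast itself.

```python
import numpy as np

value = 100.018

# Round the Python float (a float64) to the nearest float32.
as_f32 = np.float32(value)

# Widen back to float64 to expose the value float32 actually stores.
widened = float(as_f32)

print(str(as_f32))             # NumPy's own rendering of the float32 scalar
print(format(widened, ".6f"))  # six decimals, like the DataFrame output: 100.017998
print(widened == value)        # False: 100.018 is not exactly representable in float32
print(np.float32(widened) == as_f32)  # True: both displays name the same float32
```

If this reading is right, Out[116] and Out[117] hold the same float32 value, and only the number of digits printed differs.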
Alexander McFarlane