
I am stumped. I have tried half a dozen different ways to convert the columns of my DataFrame from float64 to int64. The code below works for a DataFrame created here, but it fails on a DataFrame created by my application:

import pandas as pd
# Test frame; in the application, result is built elsewhere.
# result = pd.DataFrame([[1.0,2,3.0], [4,'',7], [None, None, None]])
result.info()
for col in result:
    result[col] = pd.to_numeric(result[col], errors='coerce', downcast='integer')

When result is constructed as shown here, the conversion works. When I try it on a frame from my application, the float64 columns remain float64. (I have tried apply, astype, and map solutions, and they all fail to change the column type.) Here is the .info() output from the frame I am trying to change:

<class 'pandas.core.frame.DataFrame'>
Index: 34 entries, AmzMisc to    Travel
Data columns (total 26 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   Totals   33 non-null     float64
 1   Monthly  33 non-null     float64
 2   2020-02  33 non-null     float64
 3   2020-03  33 non-null     float64
 4   2020-04  33 non-null     float64
 5   2020-05  33 non-null     float64
 6   2020-06  33 non-null     float64
 7   2020-07  33 non-null     float64
 8   2020-08  33 non-null     float64
 9   2020-09  33 non-null     float64
 10  2020-10  33 non-null     float64
 11  2020-11  33 non-null     float64
 12  2020-12  33 non-null     float64
 13  2021-01  33 non-null     float64
 14  2021-02  33 non-null     float64
 15  2021-03  33 non-null     float64
 16  2021-04  33 non-null     float64
 17  2021-05  33 non-null     float64
 18  2021-06  33 non-null     float64
 19  2021-07  33 non-null     float64
 20  2021-08  33 non-null     float64
 21  2021-09  33 non-null     float64
 22  2021-10  33 non-null     float64
 23  2021-11  33 non-null     float64
 24  2021-12  33 non-null     float64
 25  2022-01  33 non-null     float64
dtypes: float64(26)
memory usage: 7.2+ KB

I must be missing something obvious here, since no one else complains of this kind of failure, but I am at a loss.

Dan Oblinger
  • I see you have 33 non-null entries in a 34 row dataframe. Be aware that you can't convert a float64 column to int64 unless you get rid of the null or use a nullable data type: https://stackoverflow.com/a/51997100/530160 – Nick ODell Feb 03 '22 at 23:24
  • What happens if you try removing rows and/or columns from the problematic data? Can you get it down to a minimal DataFrame that exhibits the problem, which you can re-create in code? Please read https://stackoverflow.com/help/minimal-reproducible-example. Also: did you try reading the [documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_numeric.html), particularly the part starting with `downcasting will only occur if`? – Karl Knechtel Feb 03 '22 at 23:25
  • "(I have tried, apply, as type, map solutions, and they all fail to change column type?" Could you show the `astype` version? Because that is definitely the tool I would reach for. If that gives you an error, does https://stackoverflow.com/questions/41550746/error-using-astype-when-nan-exists-in-a-dataframe answer the question? – Karl Knechtel Feb 03 '22 at 23:27
  • @NickODell I thought I could do a conversion that would just ignore non-convertible cells? Is float64 nullable while int64 is not? (My end goal is to export this as a CSV without any decimals, so maybe I can convert to type object, but would that store ints?) – Dan Oblinger Feb 04 '22 at 00:20
  • @KarlKnechtel Thanks, Karl, for the help! Here is one of the astype versions: `for col in result: result[col] = result[col].astype(np.int64, errors='ignore')` – Dan Oblinger Feb 04 '22 at 00:21
  • @DanOblinger Yes, float64 is nullable. There is a floating point representation for NaN, which pandas uses to store values equal to None. The int64 type doesn't have an equivalent to NaN. – Nick ODell Feb 04 '22 at 00:24
  • Ugh! @KarlKnechtel this is the problem. But ideally I would keep those nulls in the dataset. Is there a way to upgrade to type object, but then convert all convertible floats into ints? – Dan Oblinger Feb 04 '22 at 00:26
  • Did you try using `.astype` *with the nullable data type* from the question @NickODell linked? – Karl Knechtel Feb 04 '22 at 00:37
  • @KarlKnechtel This worked. While the data was non-null, I needed to convert to int64, then to object, in order to have all ints but as a nullable type. THEN I could modify the table to inject nulls and keep the data as ints. As I had not considered the issue of nullability, I was at a loss to understand why the conversion would not "stick". I am happy to accept a solution that you post here. Alternatively, I will write one up, as I think others could trip over this issue. Thanks for this, I was at my wits' end! – Dan Oblinger Feb 04 '22 at 01:55
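
For anyone landing here later, the nullable-dtype route suggested in the comments might look roughly like the sketch below. The small frame is a made-up stand-in for the application's data (the values and the all-null "Empty" row are assumptions, not taken from the question), and it assumes every non-null value is a whole number. If the floats have fractional parts, `astype("Int64")` is expected to refuse the cast, so something like `.round()` would be needed first.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the application's frame:
# float64 columns plus one all-null row.
result = pd.DataFrame(
    {"Totals": [120.0, 45.0, np.nan], "Monthly": [10.0, 3.0, np.nan]},
    index=["AmzMisc", "Travel", "Empty"],
)

# The loop from the question: NaN has no integer representation,
# so the downcast is skipped and the columns stay float64.
downcast = result.copy()
for col in downcast:
    downcast[col] = pd.to_numeric(downcast[col], errors="coerce", downcast="integer")

# The nullable extension dtype "Int64" (capital I) keeps the nulls as pd.NA.
as_nullable = result.astype("Int64")

# On export, missing cells become empty fields and the integers have no decimals.
csv_text = as_nullable.to_csv()
print(downcast.dtypes)
print(as_nullable.dtypes)
print(csv_text)
```

Note the capital "I": lowercase `"int64"` is the plain NumPy dtype, which cannot hold missing values.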

0 Answers