1

I have a field ID, in MySQL Workbench it's Bigint(20) type, and if I query in the workbench, it looks like this:

ID
1111111111
2222222222
3333333333
...

If I use pandas.read_sql() then print out this column, it looks like this:

ID
1.111111e+09
2.222222e+09
3.333333e+09
...

and by checking df.dtypes, this columns is float64 type, if I try to convert it to str by using: df['ID'] = df['ID'].astype('str'), this column will become like this:

ID
1111111111.0
2222222222.0
3333333333.0

I'm confused... I thought the Bigint(20) after read_sql() will become int64, but why here it becomes float64, is there a way I can just converted it to int64 or str and it looks like 1111111111 (without the float .0?)

I also tried df['ID'] = df['ID'].astype('int64'), but it gave me error ValueError: Cannot convert non-finite values (NA or inf) to integer

Can someone help please... Thanks.

wawawa
  • 2,835
  • 6
  • 44
  • 105
  • Does the `ID` column happen to have `NULL` values? – Nikolaos Chatzis Jul 28 '21 at 16:09
  • ID has null values, per the `ValueError`. Float is used instead of `int` because `int` doesn't support null values. Consider, if you really want integers, using panda's `'Int64'` dtype extension. Cf https://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html. – ifly6 Jul 28 '21 at 16:12
  • what @ifly6 said, and also take a look at this https://stackoverflow.com/questions/37796916/pandas-read-sql-integer-became-float – Nikolaos Chatzis Jul 28 '21 at 16:16
  • Thanks, I didn't realise this is because of the NULL values, I used `SELECT Cast(ID as char) AS ID,` instead of `SELECT ID` in my query seems resolved the issue, is there any downside? – wawawa Jul 28 '21 at 16:47
  • oh I found out there's only one NULL value, so if I add a condition in the query to exclude it, seems like this column is converted to `int64` now. – wawawa Jul 28 '21 at 16:51

0 Answers0