
I have a mock-up dataframe below that closely resembles my original dataframe:

import pandas as pd

sof = pd.DataFrame({'id':['1580326032400705442181105','15803260000063243713608360','1580326343500677412104013','15803260343000000705432103406'],'class':['a','c','c','d']})

When I write this dataframe to my desktop using the 'to_csv' function, I see the ids automatically converted to scientific notation (example: 1.5803260324007E+24). I have a few questions about this:

  1. Why does Python convert this column (obviously of type 'object') to a numeric format?
  2. How do I preserve my format?

I have tried the following

sof.to_csv('path',float_format='%f',index = False) 

This doesn't seem to change anything.

sof['id'].astype(int).astype(str) 

Here I am trying to convert the supposed "float" to int and then to string.

It gives the following error: OverflowError: Python int too large to convert to C long
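For context on that error, here is a minimal sketch (not part of the original attempt) that compares each id with the largest value a 64-bit integer can hold. It assumes the overflow comes from the ids not fitting into a fixed-width integer type; the names `ids` and `int64_max` are only illustrative.

```python
import numpy as np

# The ids from the mock-up dataframe, kept as strings.
ids = [
    '1580326032400705442181105',
    '15803260000063243713608360',
    '1580326343500677412104013',
    '15803260343000000705432103406',
]

# Largest value representable as a signed 64-bit integer.
int64_max = np.iinfo(np.int64).max

for s in ids:
    # Python's own int has arbitrary precision, so int(s) itself succeeds;
    # the comparison shows whether the value would fit into int64.
    value = int(s)
    print(s, 'exceeds int64:', value > int64_max)
```

Every id here has 25 or more digits, while the int64 maximum has 19, so none of them can be stored in a 64-bit integer column.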

Can I get some guidance on how this can be achieved?

  • https://stackoverflow.com/questions/7604966/maximum-and-minimum-values-for-ints – Chris Nov 07 '22 at 15:12
  • Those are going to have to be treated as strings; they are too large to be represented as integers. The sample dataframe you posted already has them as strings, and when written to csv they won't change format at all. – Chris Nov 07 '22 at 15:12
  • If I write this dataframe to csv, I see scientific formatting and not strings – Pradeep Chintapalli Nov 07 '22 at 15:23
  • You must not be using the sample dataframe you shared as part of your question. – Chris Nov 07 '22 at 15:27
  • I am. I checked it again. Can this be a setting that needs to be changed in Excel? – Pradeep Chintapalli Nov 07 '22 at 15:30
  • Don't open it in Excel to validate your data; open it in Notepad++. Excel will automatically infer these as numbers and apply scientific notation. If you must open it in Excel, perhaps try `Data -> From Text/CSV` inside of Excel and specify data types as you import them – Chris Nov 07 '22 at 15:33
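One way to check the point made in the comments without involving Excel is to look at the raw CSV text directly. The sketch below is only an illustration: it writes the sample dataframe to an in-memory buffer instead of a file (`buffer`, `csv_text`, and `round_trip` are hypothetical names), then reads it back while asking pandas to keep the `id` column as strings.

```python
import io

import pandas as pd

sof = pd.DataFrame({
    'id': ['1580326032400705442181105', '15803260000063243713608360',
           '1580326343500677412104013', '15803260343000000705432103406'],
    'class': ['a', 'c', 'c', 'd'],
})

# Write to an in-memory text buffer so the raw CSV content can be inspected.
buffer = io.StringIO()
sof.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read the same text back, forcing the id column to stay a string.
round_trip = pd.read_csv(io.StringIO(csv_text), dtype={'id': str})
print(round_trip['id'].tolist() == sof['id'].tolist())
```

If the printed text shows the full digit strings, the file itself is fine and the scientific notation is being applied by whatever program displays it.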
