I'm processing a CSV file. The source file contains values such as '20190801'. Pandas detects the column as int or float, depending on the file. Before writing the output, I convert all columns to string, and the dtypes show every column as object. But the output still contains .0 at the end. Why is that?

e.g. 20190801.0

    for col in data.columns:
        data[col] = data[col].astype(str)
    print(data.dtypes)  # prints all column dtypes as object

    data.to_csv(neo_path, index=False)
Simson
Ratha
  • Numeric strings default to `float` when writing CSV. – Barmar Oct 16 '19 at 23:05
  • @Barmar, how do I avoid that? Because of that issue I was changing all column datatypes to string – Ratha Oct 16 '19 at 23:06
  • Set the `dtype` to `int64` or `str` – Barmar Oct 16 '19 at 23:07
  • @Barmar, I think I do that in my code above, changing all column datatypes to string with astype(str) – Ratha Oct 16 '19 at 23:07
  • I think there are options to `.to_csv()` that will specify the types in the file. – Barmar Oct 16 '19 at 23:09
  • Hey! Do you have nulls in your data? "Because NaN is a float, this forces an array of integers with any missing values to become floating point." https://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html – the_good_pony Oct 16 '19 at 23:10
  • @the_good_pony No nulls. All numeric-looking data is converted to float – Ratha Oct 16 '19 at 23:11
  • @Barmar Any example? I couldn't find one. Even though I change the type before writing to a file, do we need to specify it again? – Ratha Oct 16 '19 at 23:14
  • @Barmar Are you referring to the float_format option in to_csv()? – the_good_pony Oct 16 '19 at 23:17
  • No, that's not it. I helped someone a couple of weeks ago with something similar, but I can't find it now. I'm not really a pandas expert. – Barmar Oct 16 '19 at 23:21
  • Related: https://stackoverflow.com/questions/42543131/pandas-automatically-converting-my-string-column-to-float This is about reading excel files. I can't find something similar when writing CSV. – Barmar Oct 16 '19 at 23:23
  • @Barmar I added the answer – Ratha Oct 17 '19 at 00:10

1 Answer

I fixed it like this: I added the converters parameter to read_csv so that the problematic columns remain strings in my case.

    data = pd.read_csv(filepath, converters={"SiteCode": str, "Date": str, "Tank ID": str, "SIRA RECORD ID": str})
    ....
    data.to_csv(neo_path, index=False)
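
For anyone who wants to try the converters approach without the original files, here is a minimal, self-contained sketch. The sample data is made up, and the blank Date in the second row is only my assumption for forcing float inference in the demo; it is not a claim about what the real files contain.

```python
import io

import pandas as pd

# Hypothetical sample data. The blank Date in the second row is one way
# to make pandas infer float64 for that column.
csv_text = "SiteCode,Date\n101,20190801\n102,\n"

# Default type inference: Date is read as float64.
inferred = pd.read_csv(io.StringIO(csv_text))

# With converters, each raw field is passed through str and kept as text.
as_text = pd.read_csv(
    io.StringIO(csv_text),
    converters={"SiteCode": str, "Date": str},
)

# to_csv returns the CSV as a string when no path is given.
inferred_csv = inferred.to_csv(index=False)
text_csv = as_text.to_csv(index=False)

print(inferred.dtypes)
print(inferred_csv)
print(as_text.dtypes)
print(text_csv)
```

Passing `dtype=str` to `read_csv` is another option when every column should stay text, instead of listing columns one by one.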

With this change I could drop the loop from my question that converted every column to string:

    for col in data.columns:
        data[col] = data[col].astype(str)

That loop didn't work when writing the output to CSV; the values still appeared as floats.
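
A likely explanation, which nobody confirmed in the thread: if pandas has already read the column as float64, `astype(str)` formats each float as text, so the ".0" ends up inside the string itself before `to_csv` is ever called. A small sketch with a made-up column:

```python
import pandas as pd

# Hypothetical column that pandas has already inferred as float64.
data = pd.DataFrame({"Date": [20190801.0, 20190802.0]})

# astype(str) formats each float as text, so ".0" becomes part of the string.
as_str = data["Date"].astype(str)
print(as_str.tolist())  # ['20190801.0', '20190802.0']

# Casting to int64 first drops the fractional part. This raises an error
# if the column contains missing values, so it is not a general fix.
via_int = data["Date"].astype("int64").astype(str)
print(via_int.tolist())  # ['20190801', '20190802']
```

This is why keeping the column as text at read time tends to be the more reliable route.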

Ratha