I'm using pandas to apply some formatting changes to a CSV file and store the result in a target file. The source file contains some integers, but after the pandas operation they are converted to decimals; e.g., 3 in the source file becomes 3.0. I would like the integers to remain integers.

Any pointers on how to get this working? It would be really helpful, thank you!

import pandas as pd
  
# reading the csv file
df = pd.read_csv(source)
  
# updating the column value/data
df['Regular'] = df['Regular'].replace({',': '_,'})
  
# writing into the file
df.to_csv(target, index=False)
Vikas
  • The correct term for `3.0` is a float. If the column has any `NaN` values, the datatype is automatically converted from `int` to `float`. There [is a way to convert it to `int`](https://stackoverflow.com/q/21287624/14627505) – Vladimir Fokow Aug 31 '22 at 15:11
  • If that is your problem, then it is already solved here: [Exporting ints with missing values to csv in Pandas](https://stackoverflow.com/q/25789354/14627505) – Vladimir Fokow Aug 31 '22 at 15:19

1 Answer

You can specify the data type of each column when calling pandas `read_csv()`, e.g.:

df = pd.read_csv(source, dtype={'column_name_a': 'Int32', 'column_name_b': 'Int32'})

See the docs here: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
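
As a rough illustration of why the `dtype` argument helps, here is a minimal sketch. It assumes the `Regular` column holds integers with at least one missing value, which is the situation the comments describe; the sample data and the `Name` column are made up for the example, and `Int64` is used in place of `Int32` only to avoid any range concerns.

```python
import io
import pandas as pd

# Hypothetical sample data: "Regular" has a missing value in the second row.
csv_text = "Name,Regular\na,3\nb,\nc,5\n"

# Default read: the missing value forces the whole column to float64.
df_default = pd.read_csv(io.StringIO(csv_text))

# Nullable integer dtype: values stay integers and the gap becomes <NA>.
df_nullable = pd.read_csv(io.StringIO(csv_text), dtype={"Regular": "Int64"})

# With no path given, to_csv returns the CSV text as a string.
out_default = df_default.to_csv(index=False)
out_nullable = df_nullable.to_csv(index=False)

print(out_default)   # integers are written with a trailing .0
print(out_nullable)  # integers are written unchanged
```

Note the capital `I`: `'Int64'` and `'Int32'` are pandas' nullable integer types, whereas lowercase `'int64'` is the NumPy type and will raise an error if the column contains missing values.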

Barmar
Danielle M.