I have a DataFrame with a column user_pseudo_id that looks like this:

user_pseudo_id
2041513012.1676969234
2041513191.1677359234
2041513765.1677359510

The column has string type. When I save the DataFrame to CSV and load it again, I get different values in this column.

I get

user_pseudo_id
2041513012.1676967
2041513191.1677360
2041513765.1677360

For writing and loading I use the usual code:

df.to_csv(path_to_data, index=False)
df = pd.read_csv(path_to_data)

What is the problem and how can I fix it?

Petr Petrov

1 Answer

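It seems read_csv is automatically converting the column to floats, because the values look like numbers. A 64-bit float holds only about 15 to 17 significant digits, so the 20-digit IDs get rounded. You can explicitly define the type of a column using the dtype parameter, like so: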

df = pd.read_csv(path_to_data, dtype={'user_pseudo_id': str})

More info in the documentation
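
To see the difference side by side, here is a minimal sketch that reproduces the round trip in memory. It assumes the three sample IDs from the question and uses io.StringIO in place of a real file, so path_to_data is not needed; the names buffer, as_float and as_text are only illustrative.

```python
import io

import pandas as pd

# Sample data from the question, stored as strings.
df = pd.DataFrame({
    "user_pseudo_id": [
        "2041513012.1676969234",
        "2041513191.1677359234",
        "2041513765.1677359510",
    ]
})

# Write the CSV to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Default read: pandas infers a float column, so the IDs are rounded.
buffer.seek(0)
as_float = pd.read_csv(buffer)

# Read with an explicit dtype: the IDs stay as the original strings.
buffer.seek(0)
as_text = pd.read_csv(buffer, dtype={"user_pseudo_id": str})

print(as_float["user_pseudo_id"].dtype)
print(as_text["user_pseudo_id"].tolist())
```

If every column should be kept as text, passing dtype=str to read_csv applies the same treatment to the whole file.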

robertoia