I have a weird bug (?) when reading a CSV with the pandas read_csv function. Some of the numbers (in my case, in 11 rows out of 500) are read with many trailing zeros and a seemingly random digit at the end.

For example, for a value that is "0.052" in the csv, when I run pandas I get this:

values = pd.read_csv(filename, header=2)
values.column1[487]
0.052000000000000005

This happens only for some columns; others are read normally.

Any ideas of what is going on here?

Vaziri-Mahmoud
Boris
  • Could it be a problem with the data type? Have you tried specifying what type to read them in as, e.g. Float, int etc? – TaxpayersMoney Sep 22 '20 at 14:41
  • Also, how are you inspecting your CSV? Through a text viewer like Vim / less / notepad, or more advanced software like Excel? Could be that the value is actually what you see in Python, but viewer software rounds it down – Unknown artist Sep 22 '20 at 15:00

1 Answer

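It is probably related to the data type: the column is read as a 64-bit float, and the parsed value can differ from the decimal text in the last digit. Specifying the data type when reading may help. If you just want to change how the values are displayed, use: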

# The second argument is the number of digits shown after the decimal point
pd.set_option("display.precision", 3)

This displays the value as 0.052. Put the pd.set_option call before the output (preferably at the top of the script). NOTE: This only changes what is shown; pandas still calculates with 0.052000000000000005, which in most cases isn't relevant, but in your case it might be.
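To make the difference between the displayed and the stored value concrete, here is a minimal sketch. It goes slightly beyond the answer above: it assumes the extra digit comes from the parser's default float conversion, and uses the float_precision="round_trip" argument of read_csv to request an exact conversion. The in-memory CSV and the name csv_text are made up for illustration; whether the default read shows the stray digit depends on the pandas version.

```python
import io

import pandas as pd

# Hypothetical stand-in for the asker's file: one column, one value.
csv_text = "column1\n0.052\n"

# Default float conversion; depending on the pandas version,
# the last digit of the parsed value may be off by one.
default = pd.read_csv(io.StringIO(csv_text))

# Ask the C parser for round-trip conversion, so the parsed float
# matches what Python's own float("0.052") would give.
exact = pd.read_csv(io.StringIO(csv_text), float_precision="round_trip")

print(repr(default.column1[0]))
print(repr(exact.column1[0]))

# display.precision only changes how frames and Series are printed;
# the stored value is untouched.
with pd.option_context("display.precision", 3):
    shown = str(exact)
print(shown)

stored = float(exact.column1[0])
```

Note that display.precision affects how a DataFrame or Series is printed; a single element pulled out with values.column1[487] is a plain NumPy float and prints with its full representation regardless of that option.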

Hestaron