I import a CSV data file into a pandas DataFrame df with pd.read_csv. The text file contains a column with strings like these:
y
0.001
0.0003
0.0001
3e-05
1e-05
1e-06
If I print the DataFrame, pandas outputs the decimal representation of these values with 6 digits after the decimal point, and everything looks good.
When I try to select rows by value, like here:
df[df['y'] == value],
by typing the corresponding decimal representation of value, pandas correctly matches certain values (for example, rows 0, 2, 4) but does not match others (rows 1, 3, 5). This is presumably a floating-point issue: these decimal values have no exact representation in base two, so the value parsed from the file and the literal I type can differ in the last bits.
I was able to work around this problem in this way:
df[abs(df['y']/value-1) <= 0.0001]
but it seems somewhat awkward. NumPy already has a function, np.isclose, that is meant specifically for this purpose.
Is there a way to use np.isclose in a case like this? Or is there a more direct solution in pandas?
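To make the question concrete, here is a minimal sketch of the kind of usage I have in mind. It assumes np.isclose accepts a pandas Series and returns a boolean mask that can be used for row selection. The inline CSV text, the chosen value, and the tolerance settings are illustrative assumptions, not my real data.

```python
import io

import numpy as np
import pandas as pd

# Illustrative sample mirroring the column shown above (not the real file).
csv_text = "y\n0.001\n0.0003\n0.0001\n3e-05\n1e-05\n1e-06\n"
df = pd.read_csv(io.StringIO(csv_text))

value = 0.0003  # the value to look up (row 1 in the sample)

# np.isclose tests |y - value| <= atol + rtol * |value|.
# atol=0.0 makes the comparison purely relative, similar to
# abs(y / value - 1) <= 0.0001; the default atol of 1e-08 is not
# negligible next to entries as small as 1e-06.
mask = np.isclose(df["y"], value, rtol=1e-4, atol=0.0)

matched = df[mask]
print(matched)
```

Setting atol explicitly is a deliberate choice here: with very small values in the column, an absolute tolerance could make distinct entries compare as equal.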