
I import a csv data file into a pandas DataFrame df with pd.read_csv. The text file contains a column with strings like these:

y
0.001
0.0003
0.0001
3e-05
1e-05
1e-06

If I print the DataFrame, pandas outputs the decimal representation of these values with 6 digits after the decimal point, and everything looks good.

When I try to select rows by value, typing the corresponding decimal representation of value, like here:

df[df['y'] == value]

pandas correctly matches some values (for example rows 0, 2, 4) but does not match others (rows 1, 3, 5). This is of course because those rows' values do not have an exact representation in base two.
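
For reference, the setup can be reproduced roughly like this (using io.StringIO to stand in for the actual file, with the column values copied from above):

import io
import pandas as pd

# Rebuild the column by parsing the same text shown above
csv_text = "y\n0.001\n0.0003\n0.0001\n3e-05\n1e-05\n1e-06\n"
df = pd.read_csv(io.StringIO(csv_text))

# Exact equality may or may not match a given row, depending on how the
# parsed float compares bit-for-bit with the literal typed here
print(df[df['y'] == 0.0003])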

I was able to work around this problem in this way:

df[abs(df['y']/value-1) <= 0.0001]

but it seems somewhat awkward. What I'm wondering is: numpy already has a function, np.isclose, that exists specifically for this purpose.

Is there a way to use np.isclose in a case like this? Or is there a more direct solution in pandas?


2 Answers


Yes, you can use numpy's isclose:

import numpy as np

df[np.isclose(df['y'], value)]
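
As a self-contained sketch (the DataFrame below is rebuilt from the values in the question, not read from the original file): np.isclose uses a relative tolerance rtol=1e-05 and an absolute tolerance atol=1e-08 by default, and both can be adjusted if needed.

import numpy as np
import pandas as pd

df = pd.DataFrame({'y': [0.001, 0.0003, 0.0001, 3e-05, 1e-05, 1e-06]})
value = 3e-05

# Tighten the relative tolerance and drop the absolute tolerance,
# so only values within 0.0001% of `value` are selected
print(df[np.isclose(df['y'], value, rtol=1e-06, atol=0.0)])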

You can convert the values to int, since floating-point values might not compare as equal:

df.loc[df["sum"].astype(int) == int(value)]
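
As a sketch of how that would look, assuming the column holds integer-valued numbers (note that the fractional values in the question would all truncate to 0 under .astype(int), so this only applies to whole-number data):

import pandas as pd

# Hypothetical integer-valued data; the column name 'sum' is taken from the answer
df = pd.DataFrame({'sum': [10.0, 20.0, 30.000000000001]})
value = 30.0

# Truncate both sides to int before comparing
print(df.loc[df['sum'].astype(int) == int(value)])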
