How to Compare if the Values between 2 Columns have around the Same Number ~ Pandas

Question

I have a DataFrame in which I want to compare two columns and drop every row where the two values are approximately equal. For example:

   A       B  
1  3.21   3.15
2  6.98   2.07
3  5.41   8.95
4  0.32   0.30

I would want only rows 2/3 to remain in the df, because in rows 1/4 A and B are similar to each other.

I tried checking whether each value in column A falls within +/- 15% of the value in column B for the same row, and removing the row if so, but it didn't work. Is there a built-in pandas function for this?

Looks like you want to conditionally drop rows, is that correct? This previous post may help: https://stackoverflow.com/questions/13851535/how-to-delete-rows-from-a-pandas-dataframe-based-on-a-conditional-expression I think it would look something like `df.drop(df[((df.A - df.B).abs() / df.B.abs()) < .15].index)` - bartius, Nov 05 '21 at 04:32

score 4 · Accepted Answer · answered Nov 05 '21 at 04:32

You could do this by passing the rtol parameter to numpy.isclose, which treats A and B as close when |A - B| <= atol + rtol * |B|:

import numpy as np

# Keep only the rows where A is NOT within 15% of B
result = df[~np.isclose(df.A, df.B, atol=0, rtol=0.15)]
#       A     B
# 2  6.98  2.07
# 3  5.41  8.95
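
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question. The values and the 1-based index are copied from the question; the names `mask` and `result` are only illustrative. Keep in mind that the tolerance is measured relative to the second argument (B here), so swapping the columns can change which borderline rows count as close.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question, with its 1-based row labels
df = pd.DataFrame(
    {"A": [3.21, 6.98, 5.41, 0.32], "B": [3.15, 2.07, 8.95, 0.30]},
    index=[1, 2, 3, 4],
)

# True where |A - B| <= 0.15 * |B| (atol=0 disables the absolute tolerance)
mask = np.isclose(df["A"], df["B"], atol=0, rtol=0.15)

# Invert the mask to keep only the rows that are NOT close
result = df[~mask]
print(result)
```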

answered Nov 05 '21 at 04:32

hilberts_drinking_problem

score 2 · Answer 2 · answered Nov 05 '21 at 04:35

You could define lower and upper bounds on the permissible values:

lower = df["A"]*0.85
upper = df["A"]*1.15

and then filter using pandas.Series.between, which includes both endpoints by default:

df[~df["B"].between(lower, upper)]
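
The same filter can be run end to end on the question's sample data, as sketched below. One caveat that the answer does not cover: if A contains negative values, `A * 0.85` is larger than `A * 1.15`, so the bounds end up swapped. Ordering them with `np.minimum` / `np.maximum` is an addition made here as a precaution, not part of the original answer, and the names `lo`, `hi`, `similar`, and `result` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question, with its 1-based row labels
df = pd.DataFrame(
    {"A": [3.21, 6.98, 5.41, 0.32], "B": [3.15, 2.07, 8.95, 0.30]},
    index=[1, 2, 3, 4],
)

# +/- 15% band around A
lo = df["A"] * 0.85
hi = df["A"] * 1.15

# Order the bounds row by row so the band stays valid for negative A (assumed extra step)
lower = np.minimum(lo, hi)
upper = np.maximum(lo, hi)

# True where B falls inside the band around A (both endpoints included)
similar = df["B"].between(lower, upper)

# Keep only the rows where B is outside the band
result = df[~similar]
print(result)
```

Unlike the numpy.isclose approach, this band is measured relative to A rather than B, so the two answers can disagree on rows that sit right at the 15% boundary.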

answered Nov 05 '21 at 04:35

Riley
