How to correctly drop rows based on multiple conditions

Asked Nov 27 '19 at 22:34

Active Nov 27 '19 at 22:35

I have some erroneous data points in my dataset that I need to get rid of (see image, it's very obvious there). So I need to drop rows based on a dual condition: when column A is greater than or equal to 0.5 AND column B equals 0.

So I tried:

df = df.drop(df[df['A'] >= 0.5 & df['B'] == 0].index, inplace=True)

This results in an error:

cannot compare a dtyped [float64] array with a scalar of type [bool]
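
The error most likely comes from operator precedence: in Python, `&` binds more tightly than `>=` and `==`, so without parentheses the expression is read as `df['A'] >= (0.5 & df['B']) == 0`. A minimal sketch, assuming a small made-up frame (the sample values are hypothetical, not the asker's data):

```python
import pandas as pd

# Hypothetical sample data; the original dataset is not shown.
df = pd.DataFrame({"A": [0.2, 0.7, 0.9, 0.5], "B": [0.0, 0.0, 3.0, 1.0]})

# Without parentheses, `0.5 & df["B"]` is evaluated first and fails,
# because a float scalar cannot be bitwise-ANDed with a float column.
try:
    df[df["A"] >= 0.5 & df["B"] == 0]
    raised = False
except TypeError:
    raised = True

# Parenthesizing each comparison yields the intended boolean mask.
mask = (df["A"] >= 0.5) & (df["B"] == 0)
```

Note also that `drop(..., inplace=True)` returns `None`, so assigning its result back to `df` would replace the frame with `None`; use either the assignment or `inplace=True`, not both.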

I then tried to create a mask and drop rows this way:

mask = (df['A'] >= 0.5) & (df['B'] == 0)
df = df.drop(df[mask], axis = 1)

This for some reason results in all my data getting deleted save for the index column.
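
A likely explanation: `df[mask]` is itself a DataFrame, and iterating a DataFrame yields its column labels, so with `axis = 1` the call appears to treat every column name as a label to drop. A minimal sketch with hypothetical sample data, using an explicit list of labels to mimic that behaviour:

```python
import pandas as pd

# Hypothetical sample data; the original dataset is not shown.
df = pd.DataFrame({"A": [0.2, 0.7, 0.9, 0.5], "B": [0.0, 0.0, 3.0, 1.0]})
mask = (df["A"] >= 0.5) & (df["B"] == 0)

# Iterating the filtered frame gives column labels, not row labels.
labels = list(df[mask])

# Dropping those labels along axis=1 removes every column.
emptied = df.drop(labels, axis=1)

# Passing the index labels of the matching rows drops the rows instead.
cleaned = df.drop(df[mask].index)
```

Here `emptied` keeps only the index, while `cleaned` keeps every row that does not satisfy both conditions.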

How do I do this properly? Thanks in advance!

edited Nov 27 '19 at 22:35

Rob

asked Nov 27 '19 at 22:34

NotAName

`df = df[(df['A'] < 0.5) | (df['B'] != 0)]` or in your case: `df = df[~mask]` – Erfan Nov 27 '19 at 22:51
Thanks! This worked! Does "~" here mean inverting the selection? – NotAName Nov 27 '19 at 23:07
Yes exactly! The linked duplicate question has tons of valuable information. I suggest you take a read and also look at the `.query` method. – Erfan Nov 27 '19 at 23:17
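
For reference, a minimal sketch of the approaches mentioned in the comments, assuming a small made-up frame with columns `A` and `B` (the sample values are hypothetical). It shows the inverted mask, the equivalent explicit condition obtained via De Morgan's laws, and the same filter written with `.query`:

```python
import pandas as pd

# Hypothetical sample data; the original dataset is not shown.
df = pd.DataFrame({"A": [0.2, 0.7, 0.9, 0.5], "B": [0.0, 0.0, 3.0, 1.0]})

# Rows to discard: A >= 0.5 AND B == 0.
mask = (df["A"] >= 0.5) & (df["B"] == 0)

# `~` negates the boolean mask element-wise, keeping rows that do not match.
kept = df[~mask]

# Negating "a AND b" gives "(not a) OR (not b)".
kept_explicit = df[(df["A"] < 0.5) | (df["B"] != 0)]

# The same filter expressed as a query string.
kept_query = df.query("A < 0.5 or B != 0")
```

All three variants should select the same rows on data without missing values; `df[~mask]` is the shortest when the mask already exists.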
