Delete the rows matching specific strings in multiple columns with "And" condition

Question

I'm trying to drop the rows that match specific strings in specific columns. That is, a row should be deleted only if column A matches one specific string and column B also matches another specific string, and so on.

For example,

  Student Science English Maths
0       A    Good    Good  Good
1       B    Poor     Bad   Bad
2       C     Avg    Good   Avg
3       D    Poor    Good   Bad
4       E    Poor     Avg   Avg
5       D    Poor    Good  Good

In the above dataframe, I want to drop the rows where Science == "Poor" and also Maths == "Bad".

So the desired output would be

  Student Science English Maths
0       A    Good    Good  Good
2       C     Avg    Good   Avg
4       E    Poor     Avg   Avg
5       D    Poor    Good  Good

I tried

df = df[(~df.Science.str.match('Poor')) & (~df.Maths.str.match('Bad'))]

But it is dropping all the rows matching either of the conditions.

  Student Science English Maths
0       A    Good    Good  Good
2       C     Avg    Good   Avg

Try ``df = df[~((df.Science.str.match('Poor')) & (df.Maths.str.match('Bad')))]``. The negation should apply to the combined result of both booleans, not to each one individually. Alternatively, you could use a query: ``df.query("not (Science.str.match('Poor') and Maths.str.match('Bad'))", engine="python")``. It boils down to the same thing: negate the combination of the booleans; don't negate them individually. - sammywemmy, Jul 26 '20 at 08:17
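The suggestion in the comment above can be tried end to end with a minimal sketch like the following. The dataframe is rebuilt from the table in the question; the variable names `to_drop` and `result` are only illustrative.

```python
import pandas as pd

# Rebuild the example dataframe from the question.
df = pd.DataFrame({
    "Student": ["A", "B", "C", "D", "E", "D"],
    "Science": ["Good", "Poor", "Avg", "Poor", "Poor", "Poor"],
    "English": ["Good", "Bad", "Good", "Good", "Avg", "Good"],
    "Maths":   ["Good", "Bad", "Avg", "Bad", "Avg", "Good"],
})

# Describe the rows to drop first: both conditions must hold.
to_drop = df.Science.str.match("Poor") & df.Maths.str.match("Bad")

# Negate the combined mask once to get the rows to keep.
result = df[~to_drop]
print(result)
```

Building the "drop" mask separately and negating it once tends to be easier to read than negating each condition inline.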

score 0 · Answer 1 · answered Jul 26 '20 at 08:20

You can use loc

In [24]: df
Out[24]:
  Student Science English Maths
0       A    Good    Good  Good
1       B    Poor     Bad   Bad
2       C     Avg    Good   Avg
3       D    Poor    Good   Bad
4       E    Poor     Avg   Avg
5       D    Poor    Good  Good

In [25]: df.loc[~((df.Science == "Poor") & (df.Maths == "Bad"))]
Out[25]:
  Student Science English Maths
0       A    Good    Good  Good
2       C     Avg    Good   Avg
4       E    Poor     Avg   Avg
5       D    Poor    Good  Good
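Note that this answer compares with `==` instead of `str.match`. For the data in the question both give the same result, but they are not equivalent in general: `str.match` treats its argument as a regular expression anchored only at the start of the string. A small sketch of the difference, using a hypothetical value "Poorly" that does not appear in the question's data:

```python
import pandas as pd

# "Poorly" is a made-up value, used only to show the difference.
grades = pd.Series(["Poor", "Poorly", "Good"])

# str.match anchors at the start only, so "Poorly" also matches.
prefix_hits = grades.str.match("Poor")

# == requires the whole string to be equal.
exact_hits = grades == "Poor"

print(prefix_hits.tolist())
print(exact_hits.tolist())
```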

Sunny · Accepted Answer · 2020-07-26T08:45:17.003

Try this

df = df[(~df.Science.str.match('Poor')) | (~df.Maths.str.match('Bad'))]

  Student Science English Maths
0       A    Good    Good  Good
2       C     Avg    Good   Avg
4       E    Poor     Avg   Avg
5       D    Poor    Good  Good

You can also have a look at this thread to see why the odd behaviour takes place. It's because the condition inside df[...] describes the rows you want to keep in the dataframe, not the rows you want to drop. The rows to keep are those where Science is not "Poor" or Maths is not "Bad".
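The reasoning here is De Morgan's law: "not (A and B)" is the same as "(not A) or (not B)". A minimal sketch that checks this on the Science and Maths columns from the question (the variable names are illustrative, and `==` is used in place of `str.match` for brevity):

```python
import pandas as pd

science = pd.Series(["Good", "Poor", "Avg", "Poor", "Poor", "Poor"])
maths = pd.Series(["Good", "Bad", "Avg", "Bad", "Avg", "Good"])

is_poor = science == "Poor"
is_bad = maths == "Bad"

# Keep mask used in this answer: (not A) or (not B).
keep_or = ~is_poor | ~is_bad

# Keep mask from negating the combined condition: not (A and B).
keep_not_and = ~(is_poor & is_bad)

# The question's attempt: (not A) and (not B), i.e. not (A or B).
keep_and = ~is_poor & ~is_bad

print(keep_or.equals(keep_not_and))  # the two keep masks agree
print(keep_and.tolist())             # only rows 0 and 2 survive
```

The last mask drops every row that matches either condition, which is why the attempt in the question kept only rows 0 and 2.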
