Drop groups in groupby that do not contain an element (Python Pandas)

Question

Suppose I have a data frame like the following:

import pandas as pd

df = pd.DataFrame({"name": ["A", "A", "B", "B", "C", "C"],
                   "nickname": ["X", "Y", "X", "Z", "Y", "Y"]})

How can I group df by name and drop the groups (here, C) that do not contain at least one 'X'?

Thank you.

score 14 · Accepted Answer · answered Jun 27 '16 at 02:57

You can use the groupby filter method from pandas:

df.groupby('name').filter(lambda g: any(g.nickname == 'X')) 

#       name   nickname
# 0        A          X
# 1        A          Y
# 2        B          X
# 3        B          Z
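
For anyone who wants to paste this into a fresh session, here is a self-contained sketch of the same idea. The `isin` variant is not part of the answer above; it is only an alternative I'm assuming gives the same rows without a per-group lambda, and the variable names (`kept`, `names_with_x`, `kept_alt`) are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"name": ["A", "A", "B", "B", "C", "C"],
                   "nickname": ["X", "Y", "X", "Z", "Y", "Y"]})

# Keep only the groups in which at least one row has nickname 'X'.
kept = df.groupby("name").filter(lambda g: (g["nickname"] == "X").any())

# Alternative sketch without groupby (an assumption, not from the answer):
# collect the names that have an 'X' row, then keep every row with such a name.
names_with_x = df.loc[df["nickname"] == "X", "name"]
kept_alt = df[df["name"].isin(names_with_x)]

print(kept)
print(kept_alt)
```

Both frames should hold the rows for A and B with their original index labels; group C is dropped because none of its rows is 'X'.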

Psidom


thank you Psidom. I didn't know about the "any" function – dleal Jun 27 '16 at 03:06
How to drop group if it contains only X – Ankita Patnaik Dec 05 '18 at 09:39
As noted in the follow-up comment to the answer at https://stackoverflow.com/a/54584371/3108762, `filter` returns a DataFrame rather than a groupby object, so if you want to filter and still have the groups, you need another `groupby` at the end of the above command, e.g. `df.groupby('name').filter(lambda g: any(g.nickname == 'X')).groupby('name')` – T. Shaffner Mar 28 '19 at 10:39
Also, in my case I had to restructure it more like this to work: `df.groupby('name').filter(lambda g: (g.nickname=='X').any())` Seems like it should be the same to me, maybe an imports issue, but leaving this here for any who follow. – T. Shaffner Mar 28 '19 at 12:36
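
A rough sketch pulling the comments above together, in case it helps. It uses the `.any()` method form, regroups after `filter`, and shows one possible reading of "drop group if it contains only X", namely keeping a group unless every one of its rows is 'X'. The extra group D and the names `df2`, `regrouped` and `not_only_x` are hypothetical, added only so the last case has something to drop.

```python
import pandas as pd

df = pd.DataFrame({"name": ["A", "A", "B", "B", "C", "C"],
                   "nickname": ["X", "Y", "X", "Z", "Y", "Y"]})

# Method form of the test: Series.any() instead of the builtin any().
filtered = df.groupby("name").filter(lambda g: (g["nickname"] == "X").any())

# filter() hands back a plain DataFrame, so group again to get groups.
regrouped = filtered.groupby("name")
print(sorted(regrouped.groups.keys()))

# Hypothetical extra group D whose rows are all 'X' (not in the question).
extra = pd.DataFrame({"name": ["D", "D"], "nickname": ["X", "X"]})
df2 = pd.concat([df, extra], ignore_index=True)

# Assumed reading of "contains only X": drop a group when all rows are 'X'.
not_only_x = df2.groupby("name").filter(
    lambda g: not (g["nickname"] == "X").all()
)
print(not_only_x)
```

If a different meaning of "only X" was intended, the condition inside the lambda is the only part that should need changing.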
