How to drop rows as dependant on multiple columns conditions in pandas?

Question

Suppose I have the folowing pandas dataframe:

df = pd.DataFrame({"#":['a','a','a','a','b','b','b','c','c','d','d'],
              "$":[8,5,7,20,4,6,10,8,8,9,9],
              "C":[0,1,2,3,1,0,2,1,0,1,0]})


#   $   C
0   a   8   0
1   a   5   1
2   a   7   2
3   a   20  3
4   b   4   1
5   b   6   0
6   b   10  2
7   c   8   1
8   c   8   0
9   d   9   1
10  d   9   0

I am looking to drop the rows where values from '#' only have a 0 or 1 correspondent in column C, in this case, delete the row with 0 and keep the one with 1. So in my case, only values c and d from column # have only 0 and 1 in column C. Therefore I would have to delete the rows where c and d have 0 in c and keep the 1. I cannot do the same for a and b because they mapped to other values than 0 and 1 in C. It has to be 0 and 1 only.

I have accidentaly stumbled over this working solution by trying out multiple things.

df.groupby(['#','$'], as_index=False)['C'].agg({'C':'max'})

This option however,I cannot use because in my real dataset I have multiple columns that contain null values and grouping will not work in my case. Therefore, I am looking for a different solution.

The solution dataframe will look like this:

    #   $   C
0   a   5   1
1   a   7   2
2   a   8   0
3   a   20  3
4   b   4   1
5   b   6   0
6   b   10  2
7   c   8   1
8   d   9   1

Use `df.loc[df.groupby(['#','$'])['C'].idxmax()]`, if still some problem, let me know. — jezrael, Jun 20 '19 at 08:26
@jezrael, ok, that worked. If you post it as an answer, I will mark it as correct. — callmeGuy, Jun 20 '19 at 13:41
unfortuantely already answered, so rather not. But thank you. — jezrael, Jun 20 '19 at 13:42
@jezrael, i would say not quite a duplicate because in my case the plain vanilla group by does not work. At least from a solutions perspective they differ. either way — callmeGuy, Jun 20 '19 at 13:56

How to drop rows as dependant on multiple columns conditions in pandas?

0 Answers0