0

Suppose I have the folowing pandas dataframe:

df = pd.DataFrame({"#":['a','a','a','a','b','b','b','c','c','d','d'],
              "$":[8,5,7,20,4,6,10,8,8,9,9],
              "C":[0,1,2,3,1,0,2,1,0,1,0]})


#   $   C
0   a   8   0
1   a   5   1
2   a   7   2
3   a   20  3
4   b   4   1
5   b   6   0
6   b   10  2
7   c   8   1
8   c   8   0
9   d   9   1
10  d   9   0

I am looking to drop the rows where values from '#' only have a 0 or 1 correspondent in column C, in this case, delete the row with 0 and keep the one with 1. So in my case, only values c and d from column # have only 0 and 1 in column C. Therefore I would have to delete the rows where c and d have 0 in c and keep the 1. I cannot do the same for a and b because they mapped to other values than 0 and 1 in C. It has to be 0 and 1 only.

I have accidentaly stumbled over this working solution by trying out multiple things.

df.groupby(['#','$'], as_index=False)['C'].agg({'C':'max'})

This option however,I cannot use because in my real dataset I have multiple columns that contain null values and grouping will not work in my case. Therefore, I am looking for a different solution.

The solution dataframe will look like this:

    #   $   C
0   a   5   1
1   a   7   2
2   a   8   0
3   a   20  3
4   b   4   1
5   b   6   0
6   b   10  2
7   c   8   1
8   d   9   1
Vadim Kotov
  • 8,084
  • 8
  • 48
  • 62
callmeGuy
  • 944
  • 2
  • 11
  • 28

0 Answers0