I have the following table:
A B C D
75987 1 0 0
75987 1 1 1
75987 2 1 1
75987 2 2 1
75987 2 6 4
75987 1 6 2
75987 1 6 1
59221 2 18 4
59221 1 18 0
59221 2 18 1
I am trying to create column E based upon the following criteria:
- Groupby columns A and B;
- When column C is less than or equal to 6, sum the corresponding values in column D;
- Assign these summed values to column E
I would expect the values in column E to be as below:
A B C D E
75987 1 0 0 4
75987 1 1 1 4
75987 2 1 1 6
75987 2 2 1 6
75987 1 6 2 4
75987 1 6 1 4
75987 2 6 4 6
59221 2 18 4 0
59221 1 18 0 0
59221 2 18 1 0
There are suggestions on how to do implement groupby, boolean indexing and .query from this forum to solve similar issues but I can't seem to adapt them to get it to work as expected. Here for example, but there are plenty others I have tried too.
My attempt below seems logical but I'm still having difficulties.
df['E'] = np.where(df.groupby(['A','B'])(['C'] <= 6))['D'].transform('sum')