I have a raw dataset like below:
ColA | ColB | duration | interval | Counter |
---|---|---|---|---|
A | SD | 2 | 4 | 1 |
A | SD | 3 | 3 | 2 |
A | UD | 2 | 1 | 10 |
B | UD | 1 | 2 | 2 |
B | UD | 2 | 2 | 2 |
B | SD | 3 | 3 | 13 |
B | SD | 1 | 4 | 19 |
I am expecting an output result like below:
Explanation of the output:
- SumCounter is the sum of counter values on the group by ColA and ColB values.
- AvgdurationSD/UD and AvgIntervalSD/UD are created by taking the average over ColA and ColB and having a 0 value incase the columns dont match the criteria (e.g. AvgDurationSD and AvIntervalSD has 0 value for a group of ColA = A and ColB = UD.
I understand that I have to use group by
and agg
functions to apply here but I am not really sure how to apply conditions for ColB on individual new columns.
Any help is appreciated:)