I have a pandas DataFrame like this:
student_id | A | B |
---|---|---|
1 | 3 | 13 |
2 | 4 | 23 |
1 | 5 | 12 |
4 | 28 | 32 |
1 | 38 | 12 |
2 | 21 | 14 |
My desired output: I want to drop the duplicate rows by student_id, keeping only the last row for each student, and add a new column with the number of rows that student had in the original data. I also want to add the averages of A and B over each student's rows, rounded, as new columns:
student_id | A | B | count | average A rounded | average B rounded |
---|---|---|---|---|---|
1 | 38 | 12 | 3 | 15 | 12 |
2 | 21 | 14 | 2 | 13 | 19 |
4 | 28 | 32 | 1 | 28 | 32 |
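
For reference, here is a minimal sketch of one way this could be done, not necessarily the best one. It assumes there are no missing values, and that "rounded" means halves round up: the table above shows 12.5 as 13, whereas `Series.round()` rounds halves to even and would give 12. The helper `round_half_up` and the intermediate names `last_rows` and `stats` are my own, not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "student_id": [1, 2, 1, 4, 1, 2],
    "A": [3, 4, 5, 28, 38, 21],
    "B": [13, 23, 12, 32, 12, 14],
})

def round_half_up(s):
    # Hypothetical helper: rounds x.5 upward (12.5 -> 13), unlike
    # Series.round(), which rounds halves to the nearest even number.
    return np.floor(s + 0.5).astype(int)

# Keep only the last row seen for each student_id.
last_rows = df.drop_duplicates("student_id", keep="last")

# Per-student row count and mean of A and B over all original rows.
# The dict is unpacked because the column names contain spaces.
stats = df.groupby("student_id").agg(**{
    "count": ("A", "size"),
    "average A rounded": ("A", "mean"),
    "average B rounded": ("B", "mean"),
}).reset_index()

for col in ["average A rounded", "average B rounded"]:
    stats[col] = round_half_up(stats[col])

# Attach the statistics to the kept rows and order by student_id.
result = (
    last_rows.merge(stats, on="student_id")
    .sort_values("student_id")
    .reset_index(drop=True)
)
print(result)
```

On the sample data this gives the same values as the desired table above. If round-half-to-even is acceptable, the helper could be replaced with `.round().astype(int)`, but then student 2 would get 12 and 18 instead of 13 and 19.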