I have a dataset that I need to group with groupby() to find the count of each unique combination of values.
   body-style   make
0  convertible  alfa-romeo
1  convertible  alfa-romeo
2  hatchback    alfa-romeo
3  sedan        audi
4  sedan        audi
I need to produce the output shown below:
   make        body-style   count
0  alfa-romeo  convertible  2
1  alfa-romeo  hatchback    1
2  audi        sedan        2
I tried the code below:
import pandas as pd

body = pd.DataFrame({'make': ['alfa-romeo', 'alfa-romeo', 'alfa-romeo', 'audi', 'audi'],
                     'body-style': ['convertible', 'convertible', 'hatchback', 'sedan', 'sedan']})
body.groupby(by=['make', 'body-style'], as_index=False).count()
This aggregation raises a "list index out of range" error. However, when I remove either column from the groupby clause, it gives me counts grouped by the remaining column.
If I remove as_index=False, there is no error, but the resulting object has both columns (make and body-style) as part of the index, and there won't be any count data.
I can add another column to the dataframe, fill it with 1s, and take a sum() instead of count() on the groupby. But I would like to know if there is a cleaner way to do this.
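For reference, here is a minimal sketch of the workaround described above, next to one possible alternative that uses size() to count rows per group without a helper column. The variable names with_ones, counts_sum and counts_size are only illustrative, and the behavior is assumed to match a reasonably recent pandas release.

```python
import pandas as pd

body = pd.DataFrame({
    'make': ['alfa-romeo', 'alfa-romeo', 'alfa-romeo', 'audi', 'audi'],
    'body-style': ['convertible', 'convertible', 'hatchback', 'sedan', 'sedan'],
})

# Workaround from the question: add a helper column of 1s, then sum it per group.
with_ones = body.assign(count=1)
counts_sum = with_ones.groupby(['make', 'body-style'], as_index=False)['count'].sum()

# Possible alternative (an assumption, not a confirmed fix for the error above):
# size() counts the rows in each group, and reset_index(name=...) turns the
# grouping keys back into ordinary columns and names the count column.
counts_size = body.groupby(['make', 'body-style']).size().reset_index(name='count')

print(counts_sum)
print(counts_size)
```

Both results should have the columns make, body-style and count, with one row per unique combination.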