2

I have a dataset, which I need to groupby() and find the count of each unique combination.

    body-style        make
0  convertible  alfa-romeo
1  convertible  alfa-romeo
2    hatchback  alfa-romeo
3        sedan        audi
4        sedan        audi

My need is to produce an output as shown below:

             make   body-style   count
0     alfa-romero  convertible       2
1     alfa-romero    hatchback       1
2            audi    sedan           2

Tried the below code:

body = pd.DataFrame({'make':['alfa-romeo','alfa-romeo','alfa-romeo','audi','audi'], 'body-style':['convertible','convertible','hatchback','sedan','sedan']})

body.groupby(by=['make','body-style'], as_index=False).count()

This aggregation throws up "list index out of range" error. However, when I remove either of the columns from groupby clause, it is able to give me counts grouped by the remaining column.

If I remove as_index=False, there is no error, but the resultant object will have both columns - make and body-style as part of the index and there won'nt be any count data.

I can add another column to the datframe, fill it with 1s and take a sum() instead of count() on the groupby. But would like to know if there is a cleaner way to do this.

Justin
  • 53
  • 1
  • 6
  • Check my answer with this problem from [this](https://stackoverflow.com/a/32307259/2901002) – jezrael May 22 '18 at 10:55
  • 2
    If you add both make and body style as grouper, there is nothing left to count. Count is a special method that tallies non null values in a column. What you want is probably `.size()` – ayhan May 22 '18 at 10:55
  • This seems like a very good question to me. It has a reproducible example, op's attempt, clear definition of the problem yet it is downvoted. Is there something I missed? – ayhan May 22 '18 at 10:58

1 Answers1

11

you can use GroupBy.count() only if you have at least one column that hasn't been used for grouping.

In case you group by all columns in the DF - use .size() instead:

In [119]: body.groupby(['make','body-style']).size().reset_index(name='count')
Out[119]:
         make   body-style  count
0  alfa-romeo  convertible      2
1  alfa-romeo    hatchback      1
2        audi        sedan      2
MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419