As per the pandas documentation (Categorical Data - Operations), groupby
shows “unused” categories by default:
In [118]: cats = pd.Categorical(["a","b","b","b","c","c","c"], categories=["a","b","c","d"])
In [119]: df = pd.DataFrame({"cats":cats,"values":[1,2,2,2,3,4,5]})
In [120]: df.groupby("cats").mean()
Out[120]:
      values
cats
a        1.0
b        2.0
c        4.0
d        NaN
How can I obtain the result with the “unused” categories dropped? For example:
      values
cats
a        1.0
b        2.0
c        4.0
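
Two possible ways to get that result, as a sketch rather than a definitive answer. The first assumes a pandas version in which groupby accepts the `observed` keyword. The second drops the unused categories from the column before grouping, using `Series.cat.remove_unused_categories()`. The names `result`, `trimmed` and `result_trimmed` are only illustrative.

```python
import pandas as pd

cats = pd.Categorical(
    ["a", "b", "b", "b", "c", "c", "c"],
    categories=["a", "b", "c", "d"],
)
df = pd.DataFrame({"cats": cats, "values": [1, 2, 2, 2, 3, 4, 5]})

# Option 1 (assumes groupby supports `observed`): keep only the
# categories that actually occur in the data.
result = df.groupby("cats", observed=True).mean()
print(result)

# Option 2: remove the unused categories from the column first,
# then group as before. Every remaining category is now observed.
trimmed = df.assign(cats=df["cats"].cat.remove_unused_categories())
result_trimmed = trimmed.groupby("cats", observed=False).mean()
print(result_trimmed)
```

Option 1 leaves the categories of the original column untouched, so "d" is still a valid category in df afterwards. Option 2 changes the set of categories on the copy that is grouped.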