You can use groupby with size to count the rows for each combination of the author and category columns. The output is a Series with a MultiIndex:
print (df.groupby(['author','category']).size())
author  category
A       movies      2
B       games       2
C       movies      1
        pics        1
dtype: int64
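The answer does not show df itself. A minimal sketch, assuming a six-row sample frame (hypothetical data chosen to match the counts above), would be:

```python
import pandas as pd

# Hypothetical sample data; the original df is not shown in the answer.
df = pd.DataFrame({
    'author':   ['A', 'A', 'B', 'B', 'C', 'C'],
    'category': ['movies', 'movies', 'games', 'games', 'movies', 'pics'],
})

# size() counts the rows in each (author, category) group
# and returns a Series indexed by a two-level MultiIndex.
counts = df.groupby(['author', 'category']).size()
print(counts)
```

Only combinations that actually occur in the data appear in the result, which is why there is no row for, say, author A and games.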
Then add reset_index to turn the MultiIndex levels into columns, and pass name to set the name of the count column. The output is a DataFrame:
df1 = df.groupby(['author','category']).size().reset_index(name='category count')
print (df1)
  author category  category count
0      A   movies               2
1      B    games               2
2      C   movies               1
3      C     pics               1
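The name argument matters because the Series returned by size has no name, so a plain reset_index labels the count column 0. A small sketch on assumed sample data (the frame is illustrative, not taken from the question):

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    'author':   ['A', 'A', 'B', 'B', 'C', 'C'],
    'category': ['movies', 'movies', 'games', 'games', 'movies', 'pics'],
})

sizes = df.groupby(['author', 'category']).size()

# Without name=, the count column gets the default label 0.
unnamed = sizes.reset_index()
# With name=, the count column gets a readable label.
named = sizes.reset_index(name='category count')

print(list(unnamed.columns))
print(list(named.columns))
```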
But if you need a crosstab, there are multiple solutions:
# add unstack to reshape the inner index level into columns
df1 = df.groupby(['author','category']).size().unstack(fill_value=0)
print (df1)
category  games  movies  pics
author
A             0       2     0
B             2       0     0
C             0       1     1
df1 = pd.crosstab(df['author'],df['category'])
print (df1)
category  games  movies  pics
author
A             0       2     0
B             2       0     0
C             0       1     1
df1 = df.pivot_table(index='author',columns='category', aggfunc='size', fill_value=0)
print (df1)
category  games  movies  pics
author
A             0       2     0
B             2       0     0
C             0       1     1
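As a quick check that the three approaches produce the same table, here is a sketch on an assumed sample frame (the data and the helper name as_plain are hypothetical, introduced only for this comparison):

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    'author':   ['A', 'A', 'B', 'B', 'C', 'C'],
    'category': ['movies', 'movies', 'games', 'games', 'movies', 'pics'],
})

via_unstack = df.groupby(['author', 'category']).size().unstack(fill_value=0)
via_crosstab = pd.crosstab(df['author'], df['category'])
via_pivot = df.pivot_table(index='author', columns='category',
                           aggfunc='size', fill_value=0)

def as_plain(table):
    # Reduce a table to plain Python lists so dtype differences do not matter.
    return list(table.index), list(table.columns), table.to_numpy().tolist()

print(as_plain(via_unstack) == as_plain(via_crosstab) == as_plain(via_pivot))
```

Which one to use is mostly a matter of taste: crosstab is the shortest to write, while groupby with unstack extends naturally if you later need more grouping keys.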
EDIT:
What is the difference between size and count in pandas?
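In short, size counts every row in a group, while count counts only the non-missing values in a column. A minimal sketch, assuming a small frame with one missing category (hypothetical data):

```python
import numpy as np
import pandas as pd

# Hypothetical data: author A has one missing category value.
df = pd.DataFrame({
    'author':   ['A', 'A', 'B', 'B'],
    'category': ['movies', np.nan, 'games', 'games'],
})

# size: number of rows per group, missing values included.
sizes = df.groupby('author').size()
# count: number of non-missing category values per group.
counts = df.groupby('author')['category'].count()

print(sizes)
print(counts)
```

For author A, size reports 2 rows but count reports 1, because the missing category is skipped.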