4

I have a requirement where I need to convert the rows of a dataframe column to columns, however I am facing an issue after GROUPBY. Below is a set of 3 users that can have types between type1 to type6.

user_id1    type4
user_id1    type6
user_id1    type1
user_id1    type2
user_id1    type1
user_id1    type6
user_id2    type1
user_id2    type2
user_id2    type2
user_id2    type1
user_id2    type3
user_id2    type4
user_id2    type5
user_id2    type6
user_id2    type2
user_id2    type6
user_id3    type1
user_id3    type2
user_id3    type3
user_id3    type2

The output I am expecting is -

user_id   type1 type2   type3   type4   type5   type6
user_id1    2    1       0       1       0       2
user_id2    2    3       1       1       1       2
user_id3    1    2       1       0       0       0

I tried to do a groupby on the type and got the count.But not sure how to convert to column especially the missing types should be populated with 0.

Thanks a lot for your time.

Vivek Kumar
  • 35,217
  • 8
  • 109
  • 132
Suraj
  • 575
  • 1
  • 9
  • 23

1 Answers1

7

What you need to use is pivot_table from pandas. You can specify what rows and columns you need, fill_value states what do you want to do with empty values and aggfunc len counts.

I'm not sure what your DataSeries looks like, but you need sth like this:

pd.pivot_table(data, index='user_id', columns='type', aggfunc=len, fill_value=0)
Martyna
  • 212
  • 1
  • 7