0

I have a dataframe which looks like this:

pd.DataFrame({"id": [1,2,2,1], "col1": ['a', 'b', 'c', 'b'], "col2": ['z', 'c', 'a', 'd'], "col3": ['d','a','d','z']})

    id  col1    col2    col3
0   1   a       z       d
1   2   b       c       a
2   2   c       a       d
3   1   b       d       z

I would like to transform it into this:

    id  a   b   c   d   z
0   1   1   1   0   2   2
1   2   2   1   2   1   0

In other words, I would like, for each id, the number of occurences of each value in col1, col2 and col3.

I found this almost similar problem (Pivot Tables or Group By for Pandas?) but as I understand it doesn't work for multiple columns.

qwertzuiop
  • 685
  • 2
  • 10
  • 24
  • https://stackoverflow.com/questions/22888434/pandas-count-values-in-each-column-of-a-dataframe – BENY May 06 '20 at 14:32

1 Answers1

1

Use DataFrame.melt with crosstab, last data cleaning DataFrame.rename_axis and DataFrame.reset_index:

df1 = df.melt('id')
df = pd.crosstab(df1['id'], df1['value']).rename_axis(None, axis=1).reset_index()
print (df)
   id  a  b  c  d  z
0   1  1  1  0  2  2
1   2  2  1  2  1  0
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252