Let's say I have some data as follows:
ID FRUIT
001 apple
002 grape
001 banana
002 apple
003 apple
001 apple
I would like to make columns out of this, like dummy variables, except that each dummy holds the count of that value in the FRUIT column. So, if ID 001 has apple appear two times in the FRUIT column, the new column apple (or FRUIT_apple) is 2.
Expected output:
ID FRUIT_apple FRUIT_grape FRUIT_banana
001 2 0 1
002 1 1 0
003 1 0 0
I'm not attached to these column names; whatever is easier is fine.
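For reproducibility, here is a minimal sketch that builds the sample data and shows the kind of count table I mean. It assumes pandas, which the description above does not strictly require, and uses `pd.crosstab` as one possible way to get the counts; the variable names `df` and `counts` are just placeholders.

```python
import pandas as pd

# Sample data from above; IDs are kept as strings to preserve the leading zeros.
df = pd.DataFrame({
    "ID": ["001", "002", "001", "002", "003", "001"],
    "FRUIT": ["apple", "grape", "banana", "apple", "apple", "apple"],
})

# Count how often each FRUIT value occurs for each ID (missing pairs become 0).
counts = pd.crosstab(df["ID"], df["FRUIT"])

# Prefix the new columns and turn ID back into a regular column.
counts = counts.add_prefix("FRUIT_").reset_index()
counts.columns.name = None  # drop the leftover "FRUIT" label on the column axis

print(counts)
```

Note that the columns come out in alphabetical order here (apple, banana, grape) rather than in the order shown in the expected output, which is fine for my purposes.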