Creating dummy variables as counts using tidyverse/dplyr

Question

Let's say I have some data as follows:

ID    FRUIT
001   apple
002   grape
001   banana
002   apple
003   apple
001   apple

I would like to turn this into columns, like dummy variables, except that each dummy is a count of how often the value appears in the FRUIT column. So, if apple appears two times for ID 001, the new column apple (or FRUIT_apple) is 2.

Expected output:

ID   FRUIT_apple  FRUIT_grape  FRUIT_banana
001            2            0             1
002            1            1             0
003            1            0             0

I'm not attached to these column names; whatever is easier works.

`table(df)`? See https://stackoverflow.com/questions/8186133/faster-ways-to-calculate-frequencies-and-cast-from-long-to-wide for other options. - Ronak Shah, Jul 02 '22 at 04:34

score 1 · Accepted Answer · answered Jul 02 '22 at 04:28

Using reshape2, but you could use pretty much any package that lets you reshape from long to wide:

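    library(reshape2)
    # fruitData is the long-format data frame from the question (columns ID and FRUIT)
    # length counts the rows in each ID/FRUIT cell; empty cells come out as 0
    df = dcast(fruitData, ID ~ FRUIT, length)

    > df
      ID apple banana grape
    1  1     2      1     0
    2  2     1      0     1
    3  3     1      0     0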
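
Since the question asks for tidyverse/dplyr specifically, here is a minimal sketch of the same long-to-wide count using `dplyr::count()` and `tidyr::pivot_wider()`. It is not part of the original answer. It assumes tidyr 1.0.0 or later, rebuilds `fruitData` from the question's sample with ID stored as character so the leading zeros survive, and uses `fruit_counts` as an illustrative name only.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question; ID is character to keep the leading zeros
fruitData <- data.frame(
  ID    = c("001", "002", "001", "002", "003", "001"),
  FRUIT = c("apple", "grape", "banana", "apple", "apple", "apple"),
  stringsAsFactors = FALSE
)

fruit_counts <- fruitData %>%
  count(ID, FRUIT) %>%              # one row per ID/FRUIT pair, with its count in column n
  pivot_wider(
    names_from   = FRUIT,           # each distinct fruit becomes a column
    values_from  = n,               # filled with the counts
    names_prefix = "FRUIT_",        # gives FRUIT_apple, FRUIT_banana, FRUIT_grape
    values_fill  = list(n = 0)      # ID/FRUIT pairs that never occur become 0, not NA
  )

print(fruit_counts)
```

The columns appear in the order the fruits are first seen after `count()` sorts them, so banana comes before grape here; drop `names_prefix` if you prefer the bare fruit names.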
