1

I want to create a dummy variable for each unique value in a column in R.

My data:

enter image description here

Desired output:

enter image description here

Any help would be highly appreciated.

Thanks in advance.

kiran u
  • Hey Kiran u, welcome. Have a look at https://stackoverflow.com/help/minimal-reproducible-example for creating better questions :-). – Arcoutte Mar 03 '20 at 07:54
  • Does this answer your question? [Generate a dummy-variable](https://stackoverflow.com/questions/11952706/generate-a-dummy-variable) – jay.sf Mar 03 '20 at 08:54

3 Answers

4

You can create a dummy column and use pivot_wider from tidyr:

library(dplyr)

df %>%
  mutate(n = 1) %>%
  select(-sku_id) %>%
  tidyr::pivot_wider(names_from = sku_name, values_from = n, 
                     names_prefix = 'sku_', values_fill = list(n = 0))

#    id sku_Google sku_AMZ sku_FK sku_AB sku_JIOMART sku_CLIQ sku_AMART
#  <dbl>      <dbl>   <dbl>  <dbl>  <dbl>       <dbl>    <dbl>     <dbl>
#1     1          1       0      1      1           0        0         0
#2     2          0       1      0      0           1        0         0
#3     3          0       0      0      0           0        1         0
#4     4          0       0      0      0           0        0         1

Data

df <- data.frame(id = c(1, 2, 1, 1:4), sku_id  = c(234,345,213,233, 456, 678,657), 
   sku_name = c('Google', 'AMZ', 'FK', 'AB', 'JIOMART', 'CLIQ', 'AMART'))
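
One case the sample data does not cover is the same sku_name appearing more than once for an id. The following is only a sketch under that assumption: `df_dup` is a made-up data frame for illustration, and calling `distinct()` first is one possible way to keep the result at 0/1, not necessarily how the answer above would handle it.

```r
library(dplyr)
library(tidyr)

# Hypothetical data (not from the question): id 1 has 'Google' twice
df_dup <- data.frame(id = c(1, 1, 2),
                     sku_name = c("Google", "Google", "AMZ"))

wide <- df_dup %>%
  distinct(id, sku_name) %>%   # drop repeated (id, sku_name) pairs first
  mutate(n = 1) %>%
  pivot_wider(names_from = sku_name, values_from = n,
              names_prefix = "sku_", values_fill = list(n = 0))
```

Without the `distinct()` step, pivot_wider would find two values for the (1, Google) cell and warn that they are not uniquely identified.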
Ronak Shah
3

Base R solution:

xtabs( ~ id + sku_name, df1)
#   sku_name
#id  AB AMART AMZ CLIQ FK GOOGLE JIOMART
#  1  1     0   0    0  1      1       0
#  2  0     0   1    0  0      0       1
#  3  0     0   0    1  0      0       0
#  4  0     1   0    0  0      0       0

Data.

df1 <- data.frame(id = c(1,2,1,1,2,3,4),
                  sku_id = c(234, 345, 213, 233, 456, 678, 657),
                  sku_name = c("GOOGLE", "AMZ", "FK", "AB", "JIOMART", "CLIQ", "AMART"))
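
xtabs returns a contingency table rather than a data frame. If a plain data frame with an id column is needed, something along these lines might work. This is a sketch that reuses `df1` from above; the `dummies` name and the step that caps counts at 1 are additions for illustration, not part of the original answer.

```r
# Same data as in the answer above
df1 <- data.frame(id = c(1, 2, 1, 1, 2, 3, 4),
                  sku_id = c(234, 345, 213, 233, 456, 678, 657),
                  sku_name = c("GOOGLE", "AMZ", "FK", "AB", "JIOMART", "CLIQ", "AMART"))

tab <- xtabs(~ id + sku_name, df1)

# as.data.frame.matrix() keeps the wide layout;
# as.data.frame() on a table would return long format instead
dummies <- as.data.frame.matrix(tab)

# Cap counts at 1 so repeated (id, sku_name) pairs still give 0/1 indicators
dummies[] <- lapply(dummies, function(x) as.integer(x > 0))

# The row names hold the id levels; move them into a regular column
dummies <- cbind(id = as.numeric(rownames(dummies)), dummies)
rownames(dummies) <- NULL
```

The columns come out in alphabetical order of sku_name, matching the xtabs output shown above.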
Rui Barradas
1

Use dcast from the reshape2 package:

library(reshape2)
df_wide <- dcast(df, id ~ sku_name, value.var = "sku_id", fun.aggregate = length)
Prahlad