1
test <- data.frame(
  x=rep(letters[1:3],each=2),
  y=c(4,4,5,5,5,6)
)

  x y
1 a 4
2 a 4
3 b 5
4 b 5
5 c 5
6 c 6

How do i create new columns which contains dummy variables 1 and 0 to indicate the row's observation.

I wish to create something like this.. for column x

  x y x_a x_b x_c
1 a 4   1   0   0
2 a 4   1   0   0
3 b 5   0   1   0
4 b 5   0   1   0
5 c 5   0   0   1
6 c 6   0   0   1

Or for column y

  x y y_4 y_5 x_6
1 a 4   1   0   0
2 a 4   1   0   0
3 b 5   0   1   0
4 b 5   0   1   0
5 c 5   0   1   0
6 c 6   0   0   1

I managed to this is in base R using ifelse in new columns.

I wish to do this in dplyr so it can work on sql tables.

con <- DBI::dbConnect(RSQLite::SQLite(), path = "")
dbWriteTable(con, "test",test)
testdb <- tbl(con, "test")
testdb %>% mutate(i = row_number(), i2 = 1) %>% spread(x, i2, fill = 0) 

the row_number() function do not work on sql tables.

Error: Window function row_number() is not supported by this database. Im using SQLite..

user8542010
  • 113
  • 7

1 Answers1

0

For x:

   library(dplyr)
   test %>% bind_cols(as_data_frame(setNames(lapply(unique(test$x), 
                                                    function(x){as.integer(test$x == x)}), 
                                             paste0('x_', unique(test$x)))))
      x y x_a x_b x_c
    1 a 4   1   0   0
    2 a 4   1   0   0
    3 b 5   0   1   0
    4 b 5   0   1   0
    5 c 5   0   0   1
    6 c 6   0   0   1

For y:

test %>% bind_cols(as_data_frame(setNames(lapply(unique(test$y), 
                                                  function(x){as.integer(test$y == x)}), 
                                           paste0('y_', unique(test$y)))))
  x y y_4 y_5 y_6
1 a 4   1   0   0
2 a 4   1   0   0
3 b 5   0   1   0
4 b 5   0   1   0
5 c 5   0   1   0
6 c 6   0   0   1
Prasanna Nandakumar
  • 4,295
  • 34
  • 63