The input data frame is in long format and holds the information for one user across more than one row.

Example

d_long <- data.frame( nameid = c("sally","sally","sally","Robert","annie","annie"), value = c("product1","ra","ent","ra","ra","product1"))
  nameid    value
1  sally product1
2  sally       ra
3  sally      ent
4 Robert       ra
5  annie       ra
6  annie product1

How can I transform it into a binary data frame like this:

d_exist <- data.frame(nameid = c("sally","Robert","annie"), product1 = c(1,0,1), ra = c(1,1,1), ent = c(1,0,0))
  nameid product1 ra ent
1  sally        1  1   1
2 Robert        0  1   0
3  annie        1  1   0
user8831872
  • 3
    Try `table(d_long)`, or `reshape2::dcast(d_long, nameid ~ value, length)`, or `library(tidyverse); d_long %>% mutate(n = 1) %>% spread(value, n, fill = 0)` – akrun May 20 '18 at 20:09
  • 1
    @akrun Sorry, I didn't notice your comment before answering. I think your comments are good enough to answer this question. – MKR May 20 '18 at 20:21
  • 1
    @akrun Thanks very much. I should have been more careful before adding an answer. I must mention you in my answer. – MKR May 20 '18 at 20:26

1 Answer

@akrun has already provided several good options for this question; one of them is to use tidyr::spread to convert the data to wide format. The OP has not said whether the same nameid and value pair can occur more than once, so it is safer to include a count for each group. The solution is:

library(tidyverse)

d_long %>% group_by(nameid, value) %>%
  summarise(count = n()) %>%  # one row per (nameid, value) pair, so spread() gets unique keys even if a pair repeats
  ungroup() %>%
  spread(value, count, fill = 0) %>%
  as.data.frame()

#   nameid ent product1 ra
# 1  annie   0        1  1
# 2 Robert   0        0  1
# 3  sally   1        1  1
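
If a strict 0/1 indicator is wanted rather than a count, the `table` idea from the comments can be clipped to presence or absence. The following is only a minimal base R sketch of that idea, not part of the pipeline above; the intermediate names `counts` and `binary` are made up for illustration, and the row order of the result depends on how the locale sorts the names.

```r
# Example input from the question
d_long <- data.frame(
  nameid = c("sally", "sally", "sally", "Robert", "annie", "annie"),
  value  = c("product1", "ra", "ent", "ra", "ra", "product1")
)

# Cross-tabulate nameid against value; each cell is the number of occurrences
counts <- table(d_long$nameid, d_long$value)

# Clip counts to 0/1 (unary + turns TRUE/FALSE into 1/0), then convert
# the contingency table to a wide data frame with one column per value
binary <- as.data.frame.matrix(+(counts > 0))

# Move the row names into a regular nameid column
d_exist <- data.frame(nameid = rownames(binary), binary, row.names = NULL)
d_exist
```

Because the counts are clipped with `counts > 0`, a nameid and value pair that appears several times still produces a 1 instead of a larger count.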
MKR