2

What I want to do is convert data from this format:

ID    Date          CODE
5567  2012-09-02    F17
5567  2012-10-15    F17
5567  2012-11-11    Z49
1245  2010-09-02    Z49
6623  2012-10-15    T65
6623  2012-11-11    Z49

To this format:

ID    F17  Z49  T65
5567  1    1    0
1245  0    1    0
6623  0    1    1

Notice that I do not care about "Date" nor do I care about how many times a specific code is associated with an ID. I only care if a certain ID has a code associated with it 1 or more times. i.e. 1 = yes and 0 = no.

Melderon
  • 365
  • 1
  • 16

3 Answers3

1

You can use dcast from data.table here:

dcast(dt, ID ~ CODE, fun.aggregate = function(x) 1L, fill = 0L)
     ID F17 T65 Z49
1: 1245   0   0   1
2: 5567   1   0   1
3: 6623   0   1   1

data

library(data.table)
dt <- fread("ID    Date          CODE
5567  2012-09-02    F17
5567  2012-10-15    F17
5567  2012-11-11    Z49
1245  2010-09-02    Z49
6623  2012-10-15    T65
6623  2012-11-11    Z49")
markus
  • 25,843
  • 5
  • 39
  • 58
1

Maybe you can try table in base R like below

+(table(df1[-2])>0)

which gives

      CODE
ID     F17 T65 Z49
  1245   0   0   1
  5567   1   0   1
  6623   0   1   1
ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81
0

We can use pivot_wider from tidyverse

library(dplyr)
library(tidyr)
df1 %>%
   distinct(ID, CODE) %>% 
   pivot_wider(names_from = CODE, values_from = CODE,
       values_fn = list(CODE = length), values_fill = list(CODE = 0))
# A tibble: 3 x 4
#     ID   F17   Z49   T65
#  <int> <int> <int> <int>
#1  5567     1     1     0
#2  1245     0     1     0
#3  6623     0     1     1

data

df1 <- structure(list(ID = c(5567L, 5567L, 5567L, 1245L, 6623L, 6623L
), Date = c("2012-09-02", "2012-10-15", "2012-11-11", "2010-09-02", 
"2012-10-15", "2012-11-11"), CODE = c("F17", "F17", "Z49", "Z49", 
"T65", "Z49")), row.names = c(NA, -6L), class = "data.frame")
akrun
  • 874,273
  • 37
  • 540
  • 662