4

I'm trying to reshape my data from long to wide, but here I need to create names column such as event1, event2, event3, etc. In other words, there's no natural candidate for names_from argument. I've tried a couple of different ways but cannot get what I'm looking for- Here's a reproducible example.

set.seed(57)
df <- data.frame(date = seq.Date(as.Date("2009-01-01"), as.Date("2009-01-12"), by = 1),
                 id = rep(1:3, each = 4),
                 val = rnorm(12)) %>% filter(val > 0.5)

I want to convert df to df2.

df2 <- data.frame(id = c(1:3),
              event1 = c("2009-01-03", "2009-01-06", "2009-01-10"),
              event2 = c("2009-01-04", "2009-01-07", "2009-01-11"),
              event3 = c(" ", " ", "2009-01-12"))

Note that this data set documents the first, second, and third (if any) occurrence date for each id.

Here's what I've tried using names_prefix but doesn't seem to work.

set.seed(57)
df <- data.frame(date = seq.Date(as.Date("2009-01-01"), as.Date("2009-01-12"), by = 1),
             id = rep(1:3, each = 4),
             val = rnorm(12)) %>% filter(val > 0.5) %>% 
pivot_wider(names_prefix = "event", names_from = val, values_from = date)
alistaire
  • 42,459
  • 4
  • 77
  • 117
qnp1521
  • 806
  • 6
  • 20

2 Answers2

5

We could create a new id column and then use pivot_wider.

library(dplyr)
df %>%
  group_by(id) %>%
  mutate(row = row_number()) %>%
  select(-val) %>%
  tidyr::pivot_wider(names_from = row, values_from = date, names_prefix = 'event')

#     id event1     event2     event3    
#  <int> <date>     <date>     <date>    
#1     1 2009-01-03 2009-01-04 NA        
#2     2 2009-01-06 2009-01-07 NA        
#3     3 2009-01-10 2009-01-11 2009-01-12

Using data.table :

library(data.table)
dcast(setDT(df), id~rowid(id), value.var = 'date')
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
0

For base R enthusiasts:

df$time <- sequence(rle(df$id)$lengths)
reshape(df, direction="wide", v.names="date")

#  id       val     date.1     date.2     date.3
#1  1 0.6225360 2009-01-03 2009-01-04       <NA>
#3  2 1.6256636 2009-01-06 2009-01-07       <NA>
#5  3 0.9514475 2009-01-10 2009-01-11 2009-01-12

Note that val comes from the first record of each id group.

Edward
  • 10,360
  • 2
  • 11
  • 26