I'm trying to convert my dataset from wide format to long format, but it's not working as intended. My dataset has columns rowid, arrest1, arrest2, ..., arrest10, lien1, lien2, ..., lien10
and looks something like this:
rowid arrest1 arrest2 ... lien1 lien2 ...
1 1/1/2008 NA 2/2/2009 NA
I'm trying to get a long dataset where I have a time variable that takes on the values 1-10 and separate variables arrest
and lien
that contain the dates. I tried the following code, but my time variable takes on values 0-9 and in addition to the arrest
and lien
variables, there are arrest1
and lien2
. There's definitely something wrong with the names_pattern
argument.
df_long <- df_wide %>%
select(rowid, lien1:lien10, arrest1:arrest10) %>%
pivot_longer(-rowid,
names_to = c(".value", "time"),
names_pattern = "(\\w+).*?(\\d{1,2})")
Here's some sample data:
structure(list(rowid = c(9317L, 31447L, 37939L, 40198L, 19346L
), arrest1 = structure(c(NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), class = "Date"), arrest2 = structure(c(NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_), class = "Date"), arrest3 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), arrest4 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), arrest5 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), arrest6 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), arrest7 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), arrest8 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), arrest9 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), arrest10 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), lien1 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), lien2 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), lien3 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), lien4 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), lien5 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), lien6 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), lien7 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), lien8 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), lien9 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date"), lien10 = structure(c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "Date")), row.names = c(NA,
-5L), class = c("tbl_df", "tbl", "data.frame"))