This is a question for all the Tidyverse experts out there. I have a dataset whose columns have many different classes (datetime, integer, factor, etc.), and I want to use tidyr to gather several groups of variables at the same time. In the reproducible example below, I would like to gather the time_, factor_ and integer_ columns at once, while id and gender remain untouched.
I am looking for the current best-practice solution using any of the Tidyverse functions.
(I'd prefer a solution that isn't too "hacky", as my real dataset has dozens of different key variables and around five hundred thousand rows.)
Example data:
library("tidyverse")
data <- tibble(
  id = c(1, 2, 3),
  gender = factor(c("Male", "Female", "Female")),
  time1 = as.POSIXct(c("2014-03-03 20:19:42", "2014-03-03 21:53:17", "2014-02-21 12:13:06")),
  time2 = as.POSIXct(c("2014-05-28 15:26:49 UTC", NA, "2014-05-24 10:53:01 UTC")),
  time3 = as.POSIXct(c(NA, "2014-09-26 00:52:40 UTC", "2014-09-27 07:08:47 UTC")),
  factor1 = factor(c("A", "B", "C")),
  factor2 = factor(c("B", NA, "C")),
  factor3 = factor(c(NA, "A", "B")),
  integer1 = c(1, 3, 2),
  integer2 = c(1, NA, 4),
  integer3 = c(NA, 5, 2)
)
Desired outcome:
# A tibble: 9 x 5
     id gender Time                Integer Factor
  <dbl> <fct>  <dttm>                <dbl> <fct>
1     1 Male   2014-03-03 20:19:42       1 A
2     2 Female 2014-03-03 21:53:17       3 B
3     3 Female 2014-02-21 12:13:06       2 C
4     1 Male   2014-05-28 15:26:49       1 B
5     2 Female NA                       NA NA
6     3 Female 2014-05-24 10:53:01       4 C
7     1 Male   NA                       NA NA
8     2 Female 2014-09-26 00:52:40       5 A
9     3 Female 2014-09-27 07:08:47       2 B
P.S. I found a couple of threads that scratch the surface of gathering multiple variables, but none of them deals with gathering columns of different classes or describes the current state-of-the-art Tidyverse solution.
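To make the target reshaping concrete, here is a minimal sketch of one way it might be expressed. It assumes tidyr 1.0.0 or later, where pivot_longer() accepts the special ".value" entry in names_to. The helper column name wave and the regular expression are illustrative choices only, and this is not offered as the best-practice answer the question is asking for.

```r
library(tidyr)
library(dplyr)

# Same example data as in the question.
data <- tibble(
  id = c(1, 2, 3),
  gender = factor(c("Male", "Female", "Female")),
  time1 = as.POSIXct(c("2014-03-03 20:19:42", "2014-03-03 21:53:17", "2014-02-21 12:13:06")),
  time2 = as.POSIXct(c("2014-05-28 15:26:49 UTC", NA, "2014-05-24 10:53:01 UTC")),
  time3 = as.POSIXct(c(NA, "2014-09-26 00:52:40 UTC", "2014-09-27 07:08:47 UTC")),
  factor1 = factor(c("A", "B", "C")),
  factor2 = factor(c("B", NA, "C")),
  factor3 = factor(c(NA, "A", "B")),
  integer1 = c(1, 3, 2),
  integer2 = c(1, NA, 4),
  integer3 = c(NA, 5, 2)
)

long <- data %>%
  pivot_longer(
    cols = -c(id, gender),
    # ".value" keeps the name stem (time, factor, integer) as an output column;
    # the numeric suffix goes into the hypothetical helper column "wave".
    names_to = c(".value", "wave"),
    names_pattern = "^([a-z]+)(\\d+)$"
  ) %>%
  # Order rows by suffix first, then id, to mirror the desired outcome.
  arrange(wave, id) %>%
  # Rename and reorder the columns to match the desired outcome.
  select(id, gender, Time = time, Integer = integer, Factor = factor)

print(long)
```

Because each stem is collected into its own output column, the datetime, factor and numeric columns should keep their classes instead of being coerced to a single common type.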