I have a dataset with 40+ variables, which I'm trying to convert into a long format. Below is an example of what I am working with: (I can't use the original data due to sensitivity)
w_1 = "X20200101"
x_1 = "2020-01-01 05:00:01"
x_2 = "2020-01-01 05:01:01"
x_3 = "2020-01-01 05:02:01"
x_4 = "2020-01-01 05:04:01"
y_1 = "ETI"
y_2 = "EPI"
y_3 = "AUSCULATION"
y_4 = "CPR"
df_wide <- as.data.frame(cbind(w_1,x_1, x_2, x_3,x_4,y_1,y_2, y_3,y_4))
df_wide
w_1 x_1 x_2 x_3 x_4 y_1 y_2 y_3 y_4
1 X20200101 2020-01-01 05:00:01 2020-01-01 05:01:01 2020-01-01 05:02:01 2020-01-01 05:04:01 ETI EPI AUSCULATION CPR
This is what I'm trying to achieve:
w = c("X20200101",
"X20200101",
"X20200101",
"X20200101")
x = c("2020-01-01 05:00:01",
"2020-01-01 05:01:01",
"2020-01-01 05:02:01",
"2020-01-01 05:04:01")
y = c("ETI",
"EPI",
"AUSCULTATION",
"CPR")
df_need <- as.data.frame(cbind(w, x, y))
w x y
1 X20200101 2020-01-01 05:00:01 ETI
2 X20200101 2020-01-01 05:01:01 EPI
3 X20200101 2020-01-01 05:02:01 AUSCULTATION
4 X20200101 2020-01-01 05:04:01 CPR
The approach that I have taken so far looks like this:
fake_df <- df_wide %>%
pivot_longer(cols = starts_with("x"),
names_to = "x_t",
values_to = "x_time",
values_drop_na = T) %>%
pivot_longer(cols = starts_with("y"),
names_to = "y_p",
values_to = "y_proc",
values_drop_na = T)
The above code then produces:
> fake_df
# A tibble: 16 x 5
w_1 x_t x_time y_p y_proc
<fct> <chr> <fct> <chr> <fct>
1 X20200101 x_1 2020-01-01 05:00:01 y_1 ETI
2 X20200101 x_1 2020-01-01 05:00:01 y_2 EPI
3 X20200101 x_1 2020-01-01 05:00:01 y_3 AUSCULATION
4 X20200101 x_1 2020-01-01 05:00:01 y_4 CPR
5 X20200101 x_2 2020-01-01 05:01:01 y_1 ETI
6 X20200101 x_2 2020-01-01 05:01:01 y_2 EPI
7 X20200101 x_2 2020-01-01 05:01:01 y_3 AUSCULATION
8 X20200101 x_2 2020-01-01 05:01:01 y_4 CPR
9 X20200101 x_3 2020-01-01 05:02:01 y_1 ETI
10 X20200101 x_3 2020-01-01 05:02:01 y_2 EPI
11 X20200101 x_3 2020-01-01 05:02:01 y_3 AUSCULATION
12 X20200101 x_3 2020-01-01 05:02:01 y_4 CPR
13 X20200101 x_4 2020-01-01 05:04:01 y_1 ETI
14 X20200101 x_4 2020-01-01 05:04:01 y_2 EPI
15 X20200101 x_4 2020-01-01 05:04:01 y_3 AUSCULATION
16 X20200101 x_4 2020-01-01 05:04:01 y_4 CPR
When I do this though, there data doesn't seem to come out right.