Using pivot longer to collapse data from multiple columns

Question

I've read through a number of similar posts and tutorials but am really struggling to understand the solution to my issue. I have a dataset that is wide, and when I make it longer - I want to collapse two sets of data (both duration and results).

For each participant (id), there is a category, and then a series of blood test results. Each test has both duration (in days) and a result (numeric value).

Here's how it looks now:

id	category	duration_1	results_1	duration_2	results_2	duration_3	results_3
01	diabetic	58	32	65	56	76	87
02	prediabetic	54	32	65	25	76	35
03	unknown	46	65	65	56	21	67

How I'd like it to be is:

id	category	duration	results
01	diabetic	58	32
01	diabetic	65	56
01	diabetic	76	87
02	prediabetic	54	32
02	prediabetic	65	25
02	prediabetic	76	35
03	unknown	46	65
03	unknown	65	25
03	unknown	21	67

I can get pivot longer to work for "results" - but I can't get it to pivot on both "results" and "duration".

Any assistance would be greatly appreciated. I'm still fairly new to R. Thanks!

Not sure if there is a master duplicate, but there's a few out there, including this one: https://stackoverflow.com/questions/61940984/using-pivot-longer-with-multiple-paired-columns-in-the-wide-dataset — thelatemail, Dec 17 '21 at 01:34
Does this answer your question? [Reshaping multiple sets of measurement columns (wide format) into single columns (long format)](https://stackoverflow.com/questions/12466493/reshaping-multiple-sets-of-measurement-columns-wide-format-into-single-columns) — camille, Dec 17 '21 at 01:56
Also https://stackoverflow.com/q/25925556/5325862, https://stackoverflow.com/q/28729506/5325862, and https://stackoverflow.com/q/47272371/5325862 — camille, Dec 17 '21 at 01:59

AndrewGB · Answer 1 · 2021-12-17T00:56:58.820

3

One way is to separate the column names into two columns while you pivot (hence the names_sep below). Then, you can just drop the column number.

library(tidyverse)

df %>%
  tidyr::pivot_longer(!c(id, category),
                      names_to = c(".value", "num"),
                      names_sep = "_") %>%
  dplyr::select(-num)

Output

# A tibble: 9 × 4
  id    category    duration results
  <chr> <chr>          <dbl>   <dbl>
1 01    diabetic          32      23
2 01    diabetic          87      67
3 01    diabetic          98      78
4 02    prediabetic       43      45
5 02    prediabetic       34      65
6 02    prediabetic       12      12
7 03    unknown           32      54
8 03    unknown           75      45
9 03    unknown           43      34

Data

df <-
  structure(
    list(
      id = c("01", "02", "03"),
      category = c("diabetic", "prediabetic", "unknown"),
      duration_1 = c(32, 43, 32),
      results_1 = c(23, 45, 54),
      duration_2 = c(87, 34, 75),
      results_2 = c(67, 65, 45),
      duration_3 = c(98, 12, 43),
      results_3 = c(78, 12, 34)
    ),
    class = "data.frame",
    row.names = c(NA,-3L)
  )

edited Dec 17 '21 at 00:56

answered Dec 17 '21 at 00:50

AndrewGB

16,126
5
18
49

why use `dplyr::` etc? Youve already loaded `tidyverse` – Onyambu Dec 17 '21 at 01:07
also note `pivot_longer(df, -c(id, category), names_to = ".value",names_pattern = '([^_]+)')` – Onyambu Dec 17 '21 at 01:09
@Onyambu As far as `dplyr::`, I just prefer (in general) to always be explicit. Great, thanks for that suggestion on the names pattern. That's a really clever way to handle it! – AndrewGB Dec 17 '21 at 04:57
1

Its okay, though you do not need it once you have loaded the libraries. Thats the whole purpose of loading the library. eg you are not using `magritrr::\`%>%\`` – Onyambu Dec 17 '21 at 05:16
1

Thanks that worked well. Much appreciated! – Tori Russell Dec 17 '21 at 15:48

Using pivot longer to collapse data from multiple columns

1 Answers1