Pivot longer with multiple variables in columns

Question

My data looks like this:

# A tibble: 120 x 5
     age death_rate_male life_exp_male death_rate_fem life_exp_fem
   <dbl>           <dbl>         <dbl>          <dbl>        <dbl>
 1     0        0.00630           76.0       0.00523          81.0
 2     1        0.000426          75.4       0.000342         80.4
 3     2        0.00029           74.5       0.000209         79.4
 4     3        0.000229          73.5       0.000162         78.4
 5     4        0.000162          72.5       0.000143         77.4
 6     5        0.000146          71.5       0.000125         76.5
 7     6        0.000136          70.5       0.000113         75.5
 8     7        0.000127          69.6       0.000104         74.5
 9     8        0.000115          68.6       0.000097         73.5
10     9        0.000103          67.6       0.000093         72.5
# ... with 110 more rows
>

I'm trying to create a tidy table where the variables are age, gender, life expectancy, and death rate.

I managed to do this by splitting the data frame into two (one containing life expectancy, the other death rate), tidying both with pivot_longer(), and then appending the two tables. Is there a way to do this more elegantly, with a single pivot_longer() command? Thank you in advance.

Does this answer your question? [Gather multiple sets of columns](https://stackoverflow.com/questions/25925556/gather-multiple-sets-of-columns) — camille, Jul 29 '21 at 19:53

akrun · Answer 1 · 2020-09-19T20:44:19.157

We can use names_pattern (where we capture as a group based on the pattern)

library(dplyr)
library(tidyr)
df1 %>%
   pivot_longer(cols = -age, names_to = c( '.value', 'grp'), 
         names_pattern = "^(\\w+_\\w+)_(\\w+)")
# A tibble: 20 x 4
#     age grp   death_rate life_exp
#   <int> <chr>      <dbl>    <dbl>
# 1     0 male    0.0063       76  
# 2     0 fem     0.00523      81  
# 3     1 male    0.000426     75.4
# 4     1 fem     0.000342     80.4
# 5     2 male    0.00029      74.5
# 6     2 fem     0.000209     79.4
# 7     3 male    0.000229     73.5
# 8     3 fem     0.000162     78.4
# 9     4 male    0.000162     72.5
#10     4 fem     0.000143     77.4
#11     5 male    0.000146     71.5
#12     5 fem     0.000125     76.5
#13     6 male    0.000136     70.5
#14     6 fem     0.000113     75.5
#15     7 male    0.000127     69.6
#16     7 fem     0.000104     74.5
#17     8 male    0.000115     68.6
#18     8 fem     0.000097     73.5
#19     9 male    0.000103     67.6
#20     9 fem     0.000093     72.5

or names_sep (specify the pattern here it is underscore followed by no character that is an underscore until the end)

df1 %>%
   pivot_longer(cols = -age, names_to = c( '.value', 'grp'), 
        names_sep = "_(?=[^_]+$)")

data

df1 <- structure(list(age = 0:9, death_rate_male = c(0.0063, 0.000426, 
0.00029, 0.000229, 0.000162, 0.000146, 0.000136, 0.000127, 0.000115, 
0.000103), life_exp_male = c(76, 75.4, 74.5, 73.5, 72.5, 71.5, 
70.5, 69.6, 68.6, 67.6), death_rate_fem = c(0.00523, 0.000342, 
0.000209, 0.000162, 0.000143, 0.000125, 0.000113, 0.000104, 9.7e-05, 
9.3e-05), life_exp_fem = c(81, 80.4, 79.4, 78.4, 77.4, 76.5, 
75.5, 74.5, 73.5, 72.5)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10"))

The library statement should be for tidyr. – G. Grothendieck Sep 19 '20 at 20:43 — G. Grothendieck, Sep 19 '20 at 20:43

score 1 · Answer 2 · answered Sep 19 '20 at 21:20

Borrowing the data from akrun, here is a base R option using reshape

reshape(
  setNames(df, gsub("(.*)_(\\w+)", "\\1\\.\\2", names(df))),
  direction = "long",
  varying = -1
)

such that

        age time death_rate life_exp id
1.male    0 male   0.006300     76.0  1
2.male    1 male   0.000426     75.4  2
3.male    2 male   0.000290     74.5  3
4.male    3 male   0.000229     73.5  4
5.male    4 male   0.000162     72.5  5
6.male    5 male   0.000146     71.5  6
7.male    6 male   0.000136     70.5  7
8.male    7 male   0.000127     69.6  8
9.male    8 male   0.000115     68.6  9
10.male   9 male   0.000103     67.6 10
1.fem     0  fem   0.005230     81.0  1
2.fem     1  fem   0.000342     80.4  2
3.fem     2  fem   0.000209     79.4  3
4.fem     3  fem   0.000162     78.4  4
5.fem     4  fem   0.000143     77.4  5
6.fem     5  fem   0.000125     76.5  6
7.fem     6  fem   0.000113     75.5  7
8.fem     7  fem   0.000104     74.5  8
9.fem     8  fem   0.000097     73.5  9
10.fem    9  fem   0.000093     72.5 10

Pivot longer with multiple variables in columns

2 Answers2

data

Linked