14

I'm trying to figure out how to alter the way in which tidyr's pivot_wider() function creates new variable names in resulting wide data sets. Specifically, I would like the "names_from" variable to be added to the prefix of the new variables rather than the suffix.

My data set looks like:

list(ID = c("A950", "A950", "A950", "A970", "A970", "A970", "A996", "A996", "A996"), 
Phase = c("P1", "P2", "P3", "P1", "P2", "P3", "P1", "P2", "P3"), 
A = c(23.5, 25.2, 21.9, 21.9, 21.1, 20.3, 19.5, 18.7, 17.9), 
B = c(21.9, 21.1, 20.3, 19.5, 18.7, 17.9, 17.1, 16.3, 15.5), 
C = c(25.2, 21.9, 20.3, 17.6, 15.1, 12.7, 10.3, 7.8, 5.4), 
D = c("M", "M", "M", "F", "F", "F", "N", "N", "N"))

When I use pivot_wider() to spread the data set using Phase as the "key", my results look like:

ex_wide <- ex_long %>%
  pivot_wider(names_from = Phase, values_from = c(3:6))

list(ID = c("A950", "A970", "A996"), 
A_P1 = c(23.5, 21.9, 19.5), 
A_P2 = c(25.2, 21.1, 18.7), 
A_P3 = c(21.9, 20.3, 17.9), 
B_P1 = c(21.9, 19.5, 17.1), 
B_P2 = c(21.1, 18.7, 16.3), 
B_P3 = c(20.3, 17.9, 15.5), 
C_P1 = c(25.2, 17.6, 10.3), 
C_P2 = c(21.9, 15.1, 7.8), 
C_P3 = c(20.3, 12.7, 5.4), 
D_P1 = c("M", "F", "N"), 
D_P2 = c("M", "F", "N"), 
D_P3 = c("M", "F", "N"))

What I would like is for the column names to look like P1_A rather than A_P1 (i.e., phase_variable rather than variable_phase).

This seems like it would be simple to figure out; however, I have not been able to find any solutions that fit my needs. Any help would be greatly appreciated. Thanks in advance.

mkpcr
  • 431
  • 1
  • 3
  • 13

1 Answers1

21

You can use names_glue argument for this:

ex_wide <- ex_long %>%
  pivot_wider(names_from = Phase, values_from = c(3:6), names_glue = "{Phase}_{.value}")

You simply pass Phase name and .value from specified columns with a _ separator.

Result

library(dplyr)
library(tidyr)

ex_wide <- ex_long %>%
  pivot_wider(names_from = Phase, values_from = c(3:6), names_glue = "{Phase}_{.value}")

ex_wide
#> # A tibble: 3 x 13
#>   ID     P1_A  P2_A  P3_A  P1_B  P2_B  P3_B  P1_C  P2_C  P3_C P1_D  P2_D  P3_D 
#>   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
#> 1 A950   23.5  25.2  21.9  21.9  21.1  20.3  25.2  21.9  20.3 M     M     M    
#> 2 A970   21.9  21.1  20.3  19.5  18.7  17.9  17.6  15.1  12.7 F     F     F    
#> 3 A996   19.5  18.7  17.9  17.1  16.3  15.5  10.3   7.8   5.4 N     N     N

Data

ex_long <- structure(list(ID = c("A950", "A950", "A950", "A970", "A970", 
"A970", "A996", "A996", "A996"), Phase = c("P1", "P2", "P3", 
"P1", "P2", "P3", "P1", "P2", "P3"), A = c(23.5, 25.2, 21.9, 
21.9, 21.1, 20.3, 19.5, 18.7, 17.9), B = c(21.9, 21.1, 20.3, 
19.5, 18.7, 17.9, 17.1, 16.3, 15.5), C = c(25.2, 21.9, 20.3, 
17.6, 15.1, 12.7, 10.3, 7.8, 5.4), D = c("M", "M", "M", "F", 
"F", "F", "N", "N", "N")), class = "data.frame", row.names = c(NA, 
-9L))