-1

I have a wide data frame friend where I'd like to perform pivot_longer() over multiple sets of columns. A minimal example of the data frame is provided below:

id <- c(303, 303)
year <- c(2020, 2020) 
city_a <- c("Madrid", "Madrid") 
PA1 = c("AA", "AA") 
a1_x <- c(475, 457) 
a1_y = c(576, 576) 
PA2 = c("BB", "BB") 
a2_x = c(746, 756) 
a2_y = c(465, 475)
PA3 = c("CC", "CC") 
a3_x = c(546, 756) 
a3_y = c(574, 867) 
PA4 = c("DD", "DD") 
a4_x = c(463, 875) 
a4_y = c(565, 576) 
PA5 = c("EE", "EE") 
a5_x = c(564, 746) 
a5_y= c(576, 576) 
city_h = c("Chicago", "Chicago") 
PH1 = c("FF", "FF") 
h1_x = c(475, 475) 
h1_y = c(576, 745) 
PH2 = c("HH", "HH") 
h2_x = c(746, 475) 
h2_y = c(465, 465) 
PH3 = c("JJ", "JJ")
h3_x = c(546, 475) 
h3_y = c(574, 475) 
PH4 = c("KK", "KK") 
h4_x = c(463, 756) 
h4_y = c(565, 586) 
PH5 = c("MM", "MM")
h5_x = c(564, 456) 
h5_y = c(576, 586) 
vue_x = c(365, 465) 
vue_y = c(846, 475) 
vue_z = c(465, 845)


data <- data.frame(id, year, city_a, PA1, a1_x, a1_y, PA2, a2_x, a2_y, PA3, a3_x, a3_y, PA4, a4_x, a4_y, PA5, a5_x, a5_y, city_h, PH1, h1_x, h1_y, PH2, h2_x, h2_y, PH3, h3_x, h3_y, PH4, h4_x, h4_y, PH5, h5_x, h5_y, vue_x, vue_y, vue_z)

I'd like to modify the final data frame to have the following format:

id
year
City: gathers columns city_a and city_h Person: gathers columns starting with PA or PH
x: gathers columns with pattern _x
y: gathers columns with pattern _y
z: shows data deom _z column

Hopefully the explanation makes sense.I've been trying examples modifying the solution here https://community.rstudio.com/t/pivot-longer-on-multiple-column-sets-pairs/43958/7 but haven't been successful.

Jdv
  • 329
  • 1
  • 10
  • 2
    Could please show how your expected output should look like based on the data you provided? – Martin Gal Aug 10 '20 at 13:30
  • I guess `PH5, h5_x, h5_y` belong together and they belong to `chicago` where `PA1, a1_x, a1_y` belong together to `madrid`. But what about `vue_*`? And how are the data separated? – Martin Gal Aug 10 '20 at 14:07
  • 1
    We definitely need to see the desired output to provide help. But I would suggest that this is probably a case where you pivot_longer everything except id and year, separate the columns into the headers you want, and then pivot_wider to get where you want to be. – Adam Sampson Aug 10 '20 at 14:10
  • Hi Martin, thank you. Yes like you said those belong together. Maybe rather than having x and y columns a better idea would be to have a coordinate column where we could have the x and y values, as well as the x,y, and z values for vue – Jdv Aug 10 '20 at 14:12
  • Please show your expected output. I'm not sure, how it should look like. – Martin Gal Aug 10 '20 at 14:53
  • Martin, I indicated it above. The columns I'd like are (id, year, city, person, x, y, z) – Jdv Aug 10 '20 at 15:40
  • 1
    You haven't fully indicated. You have PA1 and PH1 in a row. Do those get separated on to different rows now so that each id has rows for every PA or PH? How about transforming your data manually into what you want to see and editing your original post instead of assuming we can understand what you want after we have told you that we can't. – Adam Sampson Aug 10 '20 at 19:10

1 Answers1

-1

As people told you in the comment, your expected output is not that clear.

Here is what I understood of your problem. If it is not what you are expecting, I hope this will at least make you understand what is unclear in your question.

data %>% 
    pivot_longer(starts_with("city"), 
                 names_to="city_name", values_to="city") %>%
    pivot_longer(c(starts_with("PA"), starts_with("PH")), 
                 names_to="person_name", values_to="person") %>%
    pivot_longer(ends_with("_x"), 
                 names_to="x_name", values_to="x") %>%
    pivot_longer(ends_with("_y"), 
                 names_to="y_name", values_to="y") %>%
    pivot_longer(ends_with("_z"), 
                 names_to="z_name", values_to="z") %>%
    # select(-ends_with("_name")) %>% #uncomment this line to remove the names for good
    identity()

Which gives:

# # A tibble: 4,840 x 12
#       id  year city_name city   person_name person x_name     x y_name     y z_name     z
#    <dbl> <dbl> <chr>     <fct>  <chr>       <fct>  <chr>  <dbl> <chr>  <dbl> <chr>  <dbl>
# 1    303  2020 city_a    Madrid PA1         AA     a1_x     475 a1_y     576 vue_z    465
# 2    303  2020 city_a    Madrid PA1         AA     a1_x     475 a2_y     465 vue_z    465
# 3    303  2020 city_a    Madrid PA1         AA     a1_x     475 a3_y     574 vue_z    465
# 4    303  2020 city_a    Madrid PA1         AA     a1_x     475 a4_y     565 vue_z    465
# 5    303  2020 city_a    Madrid PA1         AA     a1_x     475 a5_y     576 vue_z    465
# 6    303  2020 city_a    Madrid PA1         AA     a1_x     475 h1_y     576 vue_z    465
# 7    303  2020 city_a    Madrid PA1         AA     a1_x     475 h2_y     465 vue_z    465
# 8    303  2020 city_a    Madrid PA1         AA     a1_x     475 h3_y     574 vue_z    465
# 9    303  2020 city_a    Madrid PA1         AA     a1_x     475 h4_y     565 vue_z    465
# 10   303  2020 city_a    Madrid PA1         AA     a1_x     475 h5_y     576 vue_z    465
# # ... with 4,830 more rows
Dan Chaltiel
  • 7,811
  • 5
  • 47
  • 92