How to concatenate two pairs of columns by name with shifting rows, in a dataframe with multiple column pairs

Question

I have this dataframe:

     id    a1    a2    b1    b2    c1    c2
  <int> <int> <int> <int> <int> <int> <int>
1     1    83    33    55    33    85    86
2     2    37     0    60    98    51     0
3     3    97    71    85     8    44    40
4     4    51     6    43    15    55    57
5     5    28    53    62    73    70     9

df <- structure(list(id = 1:5, a1 = c(83L, 37L, 97L, 51L, 28L), a2 = c(33L, 
0L, 71L, 6L, 53L), b1 = c(55L, 60L, 85L, 43L, 62L), b2 = c(33L, 
98L, 8L, 15L, 73L), c1 = c(85L, 51L, 44L, 55L, 70L), c2 = c(86L, 
0L, 40L, 57L, 9L)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

I want to: Combine columns with same starting character to one column by shifting each row of the second column by 1 down and naming the new column with the character of the two columns.

My desired output:

      id     a     b     c
   <dbl> <dbl> <dbl> <dbl>
 1     1    83    55    85
 2     1    33    33    86
 3     2    37    60    51
 4     2     0    98     0
 5     3    97    85    44
 6     3    71     8    40
 7     4    51    43    55
 8     4     6    15    57
 9     5    28    62    70
10     5    53    73     9

I have tried using lagfunction but I don`t know how to combine and shift columns at the same time!

To clarify a picture:

Do you only have two versions of each column name 1 and 2? – TimTeaFan Jul 28 '21 at 15:56 — TimTeaFan, Jul 28 '21 at 15:56

Anoushiravan R · Accepted Answer · 2021-07-28T16:09:53.640

5

You can use the following solution. I also have modified your data set an added an id column:

library(tidyr)

df %>%
  pivot_longer(!id, names_to = c(".value", NA), names_pattern = "([[:alpha:]])(\\d)")

# A tibble: 10 x 4
      id     a     b     c
   <int> <int> <int> <int>
 1     1    83    55    85
 2     1    33    33    86
 3     2    37    60    51
 4     2     0    98     0
 5     3    97    85    44
 6     3    71     8    40
 7     4    51    43    55
 8     4     6    15    57
 9     5    28    62    70
10     5    53    73     9

edited Jul 28 '21 at 16:09

answered Jul 28 '21 at 16:03

Anoushiravan R

21,622
3
18
41

This drops the `id` column. – TimTeaFan Jul 28 '21 at 16:04
1

It's weird I did not have any `id` column from the beginning. Check it now please. I added `id` variable myself. – Anoushiravan R Jul 28 '21 at 16:07
1

Thank you. I have to learn still so much! Now it is clear for me!!! But the answers here helped me a lot! – TarJae Jul 28 '21 at 17:03
You're welcome we are all learning. – Anoushiravan R Jul 28 '21 at 17:04

GuedesBF · Answer 2 · 2021-07-28T16:12:33.920

2

We can pivot_longer, remove the digits from name, then pivot_wider and unnest

library(stringr)
library(dplyr)
library(tidyr)

df %>% pivot_longer(cols = -id)%>%
        mutate(name=str_remove(name, '[0-9]'))%>%
        pivot_wider(names_from = name)%>%
        unnest(everything())

# A tibble: 10 x 4
      id     a     b     c
   <int> <int> <int> <int>
 1     1    83    55    85
 2     1    33    33    86
 3     2    37    60    51
 4     2     0    98     0
 5     3    97    85    44
 6     3    71     8    40
 7     4    51    43    55
 8     4     6    15    57
 9     5    28    62    70
10     5    53    73     9

edited Jul 28 '21 at 16:12

answered Jul 28 '21 at 16:06

GuedesBF

8,409
5
19
37

1

Bravo my friend. – Anoushiravan R Jul 28 '21 at 16:22
1

thank you anoush. I still struggle quite a lot with pivoting issues (so many parameters to pivot_* ). Yours is definitely the neat way to go. – GuedesBF Jul 28 '21 at 16:27
1

It's ok. I once asked a similar question a couple months ago. Today I used it. I mean it's all about experience in my opinion and from now on you could use all that is mentioned here for future questions. However, your solution is very good too. – Anoushiravan R Jul 28 '21 at 16:30

score 2 · Answer 3 · answered Jul 28 '21 at 16:19

Doing it as a pivot_longer(), then pivot_wider() is easier to read, but @Anoushiravan R's answer to more direct

library(tidyverse)

df %>% 
  rownames_to_column(var = "id") %>% # Add the id column
  pivot_longer(-id) %>% # Make long
  mutate(order = str_sub(name, -1), name = str_sub(name, 1, 1)) %>% # Breakout the name column
  pivot_wider(names_from = name) %>% # Make wide again
  select(-order) # Drop the ordering column

score 2 · Answer 4 · answered Jul 28 '21 at 16:20

I think ANoushiravan's solution is the tidiest way to do it. We could also use {dplyover} (disclaimer) for this:

library(dplyr)
library(dplyover) # https://github.com/TimTeaFan/dplyover

df %>% 
  group_by(id) %>% 
  summarise(across2(ends_with("1"),
                    ends_with("2"),
                    ~ c(.x,.y),
                    .names = "{pre}"),
            )
#> `summarise()` has grouped output by 'id'. You can override using the `.groups` argument.

#> # A tibble: 10 x 4
#> # Groups:   id [5]
#>       id     a     b     c
#>    <int> <int> <int> <int>
#>  1     1    83    55    85
#>  2     1    33    33    86
#>  3     2    37    60    51
#>  4     2     0    98     0
#>  5     3    97    85    44
#>  6     3    71     8    40
#>  7     4    51    43    55
#>  8     4     6    15    57
#>  9     5    28    62    70
#> 10     5    53    73     9

^{Created on 2021-07-28 by the reprex package (v0.3.0)}

How to concatenate two pairs of columns by name with shifting rows, in a dataframe with multiple column pairs

4 Answers4