Collapse cells across several rows to a single column in an R data frame

Question

I want to perform the inverse of separate_rows(), i.e.:

# Create example data
data <- data.frame(x1 = c(1,1,3,3,5),
                   x2 = c('A','A','C','C','E'),
                   x3 = 6:10)
data

Which results in:

  x1 x2 x3
1  1  A  6
2  1  A  7
3  3  C  8
4  3  C  9
5  5  E 10

Somehow I expected a unite_rows() function in tidyr or dplyr to do this:

  x1 x2 x3
1  1  A  6,7
3  3  C  8,9
5  5  E 10

But I couldn't find anything similar. Should I combine cells using unite()? (It seems a dirty way to go.)

unite(data, x2, x3, col = "x3", sep = ",")

What type should x3 be in the output? List or string? – s_baldur Oct 01 '20 at 15:51
Sorry I didn't make it clear, but it should be a string – Veronica Oct 01 '20 at 16:53

score 1 · Answer 1 · answered Oct 01 '20 at 18:07

Using dplyr's summarise() to build a list-column of numeric vectors (i.e., without turning the numbers into a character vector):

library(dplyr)

data %>% 
  group_by(x1, x2) %>% 
  summarise(x3 = list(x3))

# A tibble: 3 x 3
     x1 x2    x3       
  <dbl> <chr> <list>   
1     1 A     <int [2]>
2     3 C     <int [2]>
3     5 E     <int [1]>
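Since the asker clarified in the comments that x3 should be a string, the same grouping could collapse the values with paste() instead of list(). This is only a minimal sketch, assuming dplyr 1.0.0 or later (for the .groups argument); the name `collapsed` is made up for illustration:

```r
library(dplyr)

# Example data from the question
data <- data.frame(x1 = c(1, 1, 3, 3, 5),
                   x2 = c("A", "A", "C", "C", "E"),
                   x3 = 6:10)

# One row per (x1, x2) group, with x3 joined into a comma-separated string
collapsed <- data %>%
  group_by(x1, x2) %>%
  summarise(x3 = paste(x3, collapse = ","), .groups = "drop")

collapsed
```

Here x3 comes back as a character column, so tidyr's separate_rows() should be able to split it again.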

score 0 · Accepted Answer · answered Oct 01 '20 at 15:50

Try this base R approach:

# Collapse x3 into one comma-separated string per (x1, x2) group
data <- aggregate(x3 ~ x1 + x2, data, function(x) paste0(x, collapse = ","))

Output:

  x1 x2  x3
1  1  A 6,7
2  3  C 8,9
3  5  E  10

Duck

It works!! Of course as I am learning R, I would love an explanation of what "x3 ~ x1 + x2" means! – Veronica Oct 01 '20 at 17:03

@Veronica Hi Vero. Sure. It is a formula. To the left of `~` you place the variable to be aggregated and to the right you add the variables you want to group. If you want more than one variable, you can use `+` to continue adding new variables. I hope that was clear! – Duck Oct 01 '20 at 17:06
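To make the formula explanation concrete, the sketch below runs the same aggregation twice: once with the formula `x3 ~ x1 + x2`, and once with the non-formula form of aggregate(), where the grouping columns are passed through `by`. The names `collapse_fun`, `by_formula` and `by_list` are made up for illustration:

```r
# Example data from the question
data <- data.frame(x1 = c(1, 1, 3, 3, 5),
                   x2 = c("A", "A", "C", "C", "E"),
                   x3 = 6:10)

# Helper that joins the values of one group into a single string
collapse_fun <- function(x) paste(x, collapse = ",")

# Formula interface: left of ~ is the column to aggregate,
# right of ~ are the grouping columns, joined with +
by_formula <- aggregate(x3 ~ x1 + x2, data = data, FUN = collapse_fun)

# Non-formula interface: the same request spelled out explicitly
by_list <- aggregate(data["x3"], by = data[c("x1", "x2")], FUN = collapse_fun)

by_formula
by_list
```

On this data both calls should give the same three rows, so the formula can be read as shorthand for "aggregate x3, grouped by x1 and x2".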

score 0 · Answer 3 · answered Oct 01 '20 at 18:01

You can try it this way with dplyr, using unique() when you are working with large data that contains duplicated values.

library(dplyr)
data %>% 
  group_by(x2) %>% 
  mutate(x3 = paste0(unique(x3), collapse = ",")) %>% 
  slice(1) %>% 
  ungroup()
#      x1 x2    x3   
#   <dbl> <chr> <chr>
# 1     1 A     6,7  
# 2     3 C     8,9  
# 3     5 E     10
