I want to perform the inverse of separate_rows(). For example:

# Create example data
data <- data.frame(x1 = c(1,1,3,3,5),
                   x2 = c('A','A','C','C','E'),
                   x3 = 6:10)
data

Which results in

  x1 x2 x3
1  1  A  6
2  1  A  7
3  3  C  8
4  3  C  9
5  5  E 10

Somehow I expected a unite_rows() function in tidyr or dplyr to do this:

  x1 x2  x3
1  1  A 6,7
3  3  C 8,9
5  5  E  10

But I couldn't find anything similar. Should I combine cells using unite()? (It seems a dirty way to go.)

unite(data, x2, x3, col = "x3", sep = ",")
Veronica

3 Answers

Use dplyr's summarise() to build a nested numeric vector (i.e., without turning the numbers into a character vector):

library(dplyr)

data %>%
  group_by(x1, x2) %>%
  summarise(x3 = list(x3), .groups = "drop")  # .groups needs dplyr >= 1.0.0; returns an ungrouped tibble

# A tibble: 3 x 3
     x1 x2    x3       
  <dbl> <chr> <list>   
1     1 A     <int [2]>
2     3 C     <int [2]>
3     5 E     <int [1]>
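
One consequence of keeping x3 as a list column is that the collapse can be undone. The following is a minimal sketch, assuming tidyr is installed alongside dplyr; the names `nested` and `restored` are only illustrative. It uses tidyr::unnest() to expand the list column back to one row per value.

```r
library(dplyr)
library(tidyr)

data <- data.frame(x1 = c(1, 1, 3, 3, 5),
                   x2 = c("A", "A", "C", "C", "E"),
                   x3 = 6:10)

# Collapse: one row per (x1, x2) group, x3 kept as an integer vector
nested <- data %>%
  group_by(x1, x2) %>%
  summarise(x3 = list(x3), .groups = "drop")

# Expand again: one row per element of each list entry
restored <- unnest(nested, cols = x3)
```

Because the values never pass through a string, x3 should still be an integer column in `restored`.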
Baraliuh

Try this base R approach:

# Collapse x3 into a comma-separated string for each (x1, x2) group
data <- aggregate(x3 ~ x1 + x2, data, function(x) paste0(x, collapse = ','))

Output:

  x1 x2  x3
1  1  A 6,7
2  3  C 8,9
3  5  E  10
Duck
  • It works!! Of course as I am learning R, I would love an explanation of what "x3 ~ x1 + x2" means! – Veronica Oct 01 '20 at 17:03
  • @Veronica Hi Vero. Sure. It is a formula. To the left of `~` you place the variable to be aggregated and to the right you add the variables you want to group. If you want more than one variable, you can use `+` to continue adding new variables. I hope that was clear! – Duck Oct 01 '20 at 17:06
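
To make the formula explanation concrete, here is a small sketch that writes the same aggregation twice: once with the formula and once with the grouping columns passed through aggregate()'s `by` argument. The helper name `collapse_values` is made up for illustration.

```r
data <- data.frame(x1 = c(1, 1, 3, 3, 5),
                   x2 = c("A", "A", "C", "C", "E"),
                   x3 = 6:10)

# Hypothetical helper: paste a group's values into one string
collapse_values <- function(x) paste(x, collapse = ",")

# Formula interface: aggregate x3 (left of ~), grouped by x1 and x2 (right of ~)
by_formula <- aggregate(x3 ~ x1 + x2, data = data, FUN = collapse_values)

# The same grouping spelled out explicitly with the `by` argument
by_list <- aggregate(data["x3"],
                     by = list(x1 = data$x1, x2 = data$x2),
                     FUN = collapse_values)
```

Both calls should give one row per (x1, x2) pair with the same collapsed x3 values, so the formula can be read as shorthand for "this column, grouped by these columns".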

You can do it this way with dplyr. Wrapping x3 in unique() removes duplicated values within each group before pasting, which helps when you are working with large data.

library(dplyr)
data %>% 
  group_by(x2) %>% 
  mutate(x3 = paste0(unique(x3), collapse = ",")) %>% 
  slice(1) %>% 
  ungroup()
#      x1 x2    x3
#   <dbl> <chr> <chr>
# 1     1 A     6,7
# 2     3 C     8,9
# 3     5 E     10
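
As a rough check that a collapse like this really is the inverse of separate_rows(), the sketch below does a round trip. It assumes tidyr is installed and uses a summarise()-based variant of the collapse rather than the mutate()/slice() pipeline above; `collapsed` and `restored` are illustrative names.

```r
library(dplyr)
library(tidyr)

data <- data.frame(x1 = c(1, 1, 3, 3, 5),
                   x2 = c("A", "A", "C", "C", "E"),
                   x3 = 6:10)

# Collapse x3 to one comma-separated string per (x1, x2) group
collapsed <- data %>%
  group_by(x1, x2) %>%
  summarise(x3 = paste(x3, collapse = ","), .groups = "drop")

# separate_rows() splits each string back into one row per value;
# convert = TRUE turns the pieces back into integers
restored <- separate_rows(collapsed, x3, sep = ",", convert = TRUE)
```

Note that the round trip only recovers every original row if duplicates are kept. With unique() in the collapse step, repeated values within a group would be dropped and could not be restored.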
Tho Vu