I want to condense information in a dataframe to reduce the number of rows. Consider the dataframe:

df <- data.frame(id=c("A","A","A","B","B","C","C","C"),b=c(4,5,6,1,2,7,8,9))
df 
  id b
1 A 4
2 A 5
3 A 6
4 B 1
5 B 2
6 C 7
7 C 8
8 C 9

I want to collapse the dataframe to one row per unique value of "id", with that id's values of b listed in a single comma-separated string. The result should look like:

df.results <- data.frame(id=c("A","B","C"),b=c("4,5,6","1,2","7,8,9"))
df.results
  id     b
1  A 4,5,6
2  B   1,2
3  C 7,8,9

One way to do this collapsing step is:

library(dplyr)
df.results <- df %>%
  group_by(id) %>%
  # paste(collapse = ",") gives "4,5,6"; toString(b) would give "4, 5, 6"
  summarise(b = paste(b, collapse = ",")) %>%
  ungroup()
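
For comparison, here is a minimal base R sketch of the same collapse, in case dplyr is not available. It assumes the df defined above and uses `aggregate()` with `paste(..., collapse = ",")` so that the strings contain commas only, as in the desired result:

```r
# Same example data as above.
df <- data.frame(id = c("A", "A", "A", "B", "B", "C", "C", "C"),
                 b  = c(4, 5, 6, 1, 2, 7, 8, 9))

# aggregate() applies the function to b within each id group;
# paste(collapse = ",") joins the group's values into one string.
df.results <- aggregate(b ~ id, data = df,
                        FUN = function(x) paste(x, collapse = ","))
df.results
```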

How would you turn df.results back into df?

Dominix
  • What do you mean *turn df.results back into df*? – Sotos Jul 24 '19 at 12:42
  • Perform the operation in reverse, to arrive at df when given df.results. I can only think of a for loop, e.g.: `for (i in df.results$id) { df.list[i] <- strsplit(df.results$b[df.results$id==i], ",") }` – Dominix Jul 24 '19 at 13:39
  • Try `tidyr::separate_rows(data.frame(id = c("A", "B", "C"), b = c("4,5,6", "1,2", "7,8,9")), b, sep = ',')` – Sotos Jul 24 '19 at 13:40
  • That's beautiful. Thanks! – Dominix Jul 24 '19 at 13:43
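
The suggestion from the comments can be sketched as a complete example. This is a minimal sketch, assuming b holds comma-separated numbers with no spaces. The names `df.back`, `df.base` and `parts` are made up for illustration. `convert = TRUE` asks `separate_rows()` to turn the split strings back into numbers, and the second half shows a base R alternative built on `strsplit()`:

```r
library(tidyr)

df.results <- data.frame(id = c("A", "B", "C"),
                         b  = c("4,5,6", "1,2", "7,8,9"))

# Split b on commas, producing one row per value;
# convert = TRUE converts the resulting strings to numbers.
df.back <- separate_rows(df.results, b, sep = ",", convert = TRUE)

# Base R alternative: split each string, then repeat each id
# once for every piece that came out of its string.
parts <- strsplit(as.character(df.results$b), ",", fixed = TRUE)
df.base <- data.frame(id = rep(df.results$id, lengths(parts)),
                      b  = as.numeric(unlist(parts)))
```

Both results have one row per value, like the original df. In recent tidyr releases `separate_rows()` is marked as superseded by `separate_longer_delim()`, but it still works.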
