
Let's consider the following data:

data <- data.frame(V1 = c("A","A","A","B","B","C","C"), V2 = c("B","B","B","C","C","D","D"))
> data
  V1 V2
1  A  B
2  A  B
3  A  B
4  B  C
5  B  C
6  C  D
7  C  D

Now we aggregate the data by both columns and obtain:

library(dplyr)
group_by(data, V1, V2) %>% summarise(n())
      V1     V2   n()
  (fctr) (fctr) (int)
1      A      B     3
2      B      C     2
3      C      D     2

Now we want to turn this summarised data back into the original data. Is there a function for this?

Andrej
  • Please consider renaming this question. You are 'reversing' the dplyr::summarise function only for one very specific application of summarise (that is, summarise(n())). The answers will not apply to other uses of summarise and could confuse or disappoint people who come to this page. – Nelson Auner Apr 06 '16 at 17:33

1 Answer


We can do this in base R, where data1 is the summarised data defined at the end of this answer:

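 data1 <- as.data.frame(data1)  # plain data.frame, so data1[, 3] is a vector of counts
 # repeat each row index by its count (column 3), then drop the count column
 data1[rep(seq_len(nrow(data1)), data1[, 3]), -3]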
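To see why the indexing works, here is a minimal base R sketch that rebuilds the summarised table by hand and inspects the repeated row indices. The names counts, idx and expanded are illustrative and not part of the original answer.

```r
# Summarised table rebuilt by hand; `counts`, `idx`, `expanded` are illustrative names
counts <- data.frame(V1 = c("A", "B", "C"),
                     V2 = c("B", "C", "D"),
                     n  = c(3L, 2L, 2L))

# rep() with a vector `times` argument repeats each row index n times
idx <- rep(seq_len(nrow(counts)), times = counts$n)
idx
# [1] 1 1 1 2 2 3 3

# Indexing with the repeated indices duplicates the rows;
# drop = FALSE keeps the result a data frame
expanded <- counts[idx, c("V1", "V2"), drop = FALSE]
rownames(expanded) <- NULL  # reset the row names (1, 1.1, 1.2, ...) left by duplication
nrow(expanded)
# [1] 7
```

Selecting the columns by name rather than with -3 is only a stylistic choice here; it avoids depending on the count being the third column.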

This is one of the cases where I would opt for base R. That said, there are package solutions for this type of problem, e.g. expandRows (a wrapper for the above) from splitstackshape:

library(splitstackshape)
data %>%
     group_by(V1, V2) %>%
     summarise(n=n())  %>%
     expandRows(., "n")

Or, if we want to stay within the %>% chain while using the same indexing as in base R:

 data %>% 
    group_by(V1, V2) %>%
    summarise(n=n()) %>%
    do(data.frame(.[rep(1:nrow(.), .$n),-3]))
#       V1     V2
#     (fctr) (fctr)
#1      A      B
#2      A      B
#3      A      B
#4      B      C
#5      B      C
#6      C      D
#7      C      D
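As a further option that is not part of the original answer, the tidyr package provides uncount(), which repeats each row according to a weights column. The sketch below assumes a tidyr version that includes uncount() (0.8.0 or later) and uses dplyr's count() as a shorthand for group_by() plus summarise(n = n()); the name restored is illustrative.

```r
library(dplyr)
library(tidyr)

data <- data.frame(V1 = c("A", "A", "A", "B", "B", "C", "C"),
                   V2 = c("B", "B", "B", "C", "C", "D", "D"))

restored <- data %>%
  count(V1, V2) %>%  # one row per (V1, V2) pair, with the count in column n
  uncount(n)         # repeat each row n times and drop the n column

nrow(restored)
# [1] 7
```

Like the other approaches, this only recovers the original rows because they are fully determined by the grouping columns and the count; it does not reverse other uses of summarise.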

Here data1 is the summarised data:

data1 <- group_by(data, V1, V2) %>% summarise(n())
akrun