Reshaping count-summarised data into long form in R

Question

Embarrassingly basic question, but if you don't know... I need to reshape a data.frame of count-summarised data into what it would have looked like before being summarised. This is essentially the reverse of {plyr} count(), e.g.:

> (d = data.frame(value=c(1,1,1,2,3,3), cat=c('A','A','A','A','B','B')))
  value cat
1     1   A
2     1   A
3     1   A
4     2   A
5     3   B
6     3   B
> (summry = plyr::count(d))
  value cat freq
1     1   A    3
2     2   A    1
3     3   B    2

If you start with summry, what is the quickest way back to d? Unless I'm mistaken (very possible), {reshape2} doesn't do this.

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2014-09-04T15:28:30.677

Just use rep:

summry[rep(rownames(summry), summry$freq), c("value", "cat")]
#     value cat
# 1       1   A
# 1.1     1   A
# 1.2     1   A
# 2       2   A
# 3       3   B
# 3.1     3   B
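
The step that does the work here is rep() with a vector of counts. As a minimal sketch (the names idx and d2 are just illustrative, and summry is rebuilt by hand so that plyr is not needed), indexing by row position instead of row name and then resetting the row names gives a result that looks like the original d:

```r
# Rebuild the example summary from the question (no plyr needed).
summry <- data.frame(value = c(1, 2, 3),
                     cat   = c("A", "A", "B"),
                     freq  = c(3, 1, 2))

# rep() with a vector `times` repeats each element by the matching count.
idx <- rep(seq_len(nrow(summry)), times = summry$freq)
idx
# [1] 1 1 1 2 3 3

# Index rows by position, drop the count column, reset the row names.
d2 <- summry[idx, c("value", "cat")]
rownames(d2) <- NULL
d2
```

Resetting the row names is optional; it only replaces labels such as 1.1 and 1.2 with a plain 1 to 6 sequence.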

A variation of this approach can be found in expandRows from my "SOfun" package. If you had that loaded, you would be able to simply do:

expandRows(summry, "freq")

edited Sep 04 '14 at 15:28

answered Sep 04 '14 at 15:18

I didn't know `rep` accepts a vector, many thanks! – geotheory Sep 04 '14 at 15:35
`SOfun` looks jolly useful btw – geotheory Sep 04 '14 at 15:41
+1 for the SOfun reference ! – Henk Sep 04 '14 at 16:04
@geotheory, Glad to help, and have fun with the package :-) – A5C1D2H2I1M1N2O1R2T1 Sep 04 '14 at 19:04
@Henk, thanks. I hope you find something useful in there! – A5C1D2H2I1M1N2O1R2T1 Sep 04 '14 at 19:05

cdeterman · Answer 2 · 2014-09-04T15:35:21.453

The R Cookbook website has a good table-to-data-frame function that you can modify slightly. The only modifications here were changing 'Freq' to 'freq' (to be consistent with plyr::count) and resetting the row names to increasing integers.

expand.dft <- function(x, na.strings = "NA", as.is = FALSE, dec = ".") {
  # Take each row in the source data frame and replicate it
  # using its freq value
  DF <- lapply(seq_len(nrow(x)),
               function(i) x[rep(i, each = x$freq[i]), ])

  # Take the above list and rbind it to create a single DF
  # Also subset the result to eliminate the freq column
  DF <- subset(do.call("rbind", DF), select = -freq)

  # Now apply type.convert to the character coerced factor columns
  # to facilitate data type selection for each column
  for (i in seq_along(DF)) {
    DF[[i]] <- type.convert(as.character(DF[[i]]),
                            na.strings = na.strings,
                            as.is = as.is, dec = dec)
  }
  row.names(DF) <- seq_len(nrow(DF))
  DF
}

expand.dft(summry)

  value cat
1     1   A
2     1   A
3     1   A
4     2   A
5     3   B
6     3   B
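
One way to sanity-check any of these expansions is to count the expanded rows again and compare against the summary. The following is only a sketch in base R, using aggregate() as a stand-in for plyr::count (an assumption made so the example has no package dependencies); the names expanded and recount are illustrative:

```r
# Rebuild the example summary from the question.
summry <- data.frame(value = c(1, 2, 3),
                     cat   = c("A", "A", "B"),
                     freq  = c(3, 1, 2))

# Expand to one row per original observation.
expanded <- summry[rep(seq_len(nrow(summry)), summry$freq), c("value", "cat")]
rownames(expanded) <- NULL

# Re-count: give every row a weight of 1 and sum per (value, cat) group.
expanded$freq <- 1
recount <- aggregate(freq ~ value + cat, data = expanded, FUN = sum)
recount
```

If the expansion is right, the freq column of recount should match the freq column of summry for each value/cat combination.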
