
I have a data frame df with the fields count and value, and I want to transform it into a data frame with a single column value in which each entry of value is repeated count times.

I don't know how to do this other than with a loop. Solutions involving plyr or reshape2 (or both) are perfectly acceptable.

Here is an example of what I am looking for:

count value
2     10
1     20

to

value
10
10
20

Follow-up question

What if I had three fields, value1, value2 and value3, that all had to be repeated based on count?

Peteris
  • 3
    @Tyler's answer will work for your follow-up question if you omit the `2` like this: `data.frame(dat[rep(seq_len(dim(dat)[1]), dat$count), , drop = FALSE], row.names=NULL)` – GSee Aug 03 '13 at 01:35
  • 2
    Please get in the habit of providing code to reproduce your data.frames instead of just showing the output, or describing them. – GSee Aug 03 '13 at 01:37

2 Answers


If your data frame is named dat, this works:

dat[rep(seq_len(dim(dat)[1]), dat$count), 2]

## [1] 10 10 20

If you want the result as a data frame, as in your example:

data.frame(dat[rep(seq_len(dim(dat)[1]), dat$count), 2, drop = FALSE], row.names=NULL)

##   value
## 1    10
## 2    10
## 3    20
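
For anyone who wants to run this end to end, here is a minimal sketch. The data frame is reconstructed from the example in the question, and the value1 to value3 columns in the second part are hypothetical values made up to illustrate the follow-up case.

```r
# Example data reconstructed from the question (column types are assumed)
dat <- data.frame(count = c(2, 1), value = c(10, 20))

# Repeat each row index count times, then keep only the value column
idx <- rep(seq_len(nrow(dat)), dat$count)
expanded <- data.frame(dat[idx, "value", drop = FALSE], row.names = NULL)
expanded

# Follow-up case: hypothetical value1..value3 columns, all of them kept
dat3 <- data.frame(count  = c(2, 1),
                   value1 = c(10, 20),
                   value2 = c("a", "b"),
                   value3 = c(TRUE, FALSE))
idx3 <- rep(seq_len(nrow(dat3)), dat3$count)
expanded3 <- data.frame(dat3[idx3, c("value1", "value2", "value3")],
                        row.names = NULL)
expanded3
```

Because whole rows are indexed, the value columns can have different types, which the matrix-based approaches cannot offer.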
Tyler Rinker
  • 1
    This solution is better than the `mapply` solution I offered in that it returns a data frame, as the question specified, and it can handle the follow-up question of multiple value columns, as per the comment left by @GSee – Jota Aug 03 '13 at 04:26
  • 2
    Since this seems to be a somewhat common question, [here's a function for it](https://github.com/mrdwab/mrdwabmisc/blob/master/R/expandrows.R) – A5C1D2H2I1M1N2O1R2T1 Aug 03 '13 at 04:51

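Here is an mapply solution, assuming your data frame is called dat. SIMPLIFY = FALSE keeps the result a list even when every count is the same, which do.call needs:

do.call("c", mapply(rep, dat$value, dat$count, SIMPLIFY = FALSE))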
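
Since `rep()` accepts a vector for its `times` argument, the same vector can be produced without `mapply`. A minimal sketch, assuming `dat` holds the example data from the question:

```r
# Example data reconstructed from the question
dat <- data.frame(count = c(2, 1), value = c(10, 20))

# mapply version; SIMPLIFY = FALSE keeps a list even when all counts are equal
via_mapply <- do.call("c", mapply(rep, dat$value, dat$count, SIMPLIFY = FALSE))

# rep() repeats each element of x by the matching element of times
via_rep <- rep(dat$value, times = dat$count)

# Check that both approaches give the same vector
identical(via_mapply, via_rep)
```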

If you have multiple value columns, you could try

v <- do.call("c", mapply(rep, c(dat$value1, dat$value2, dat$value3), dat$count, SIMPLIFY = FALSE))

t(matrix(v, numberofvaluecolumns, byrow = TRUE))

Here numberofvaluecolumns is the number of value columns you are using (3 in this case). Note that this returns a matrix rather than a data frame, so all value columns must share one type, and you would need as.data.frame() to get a data frame back.
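
To make the multi-column version concrete, here is a sketch with made-up data. The contents of value1 to value3 are hypothetical, and all three are numeric because they pass through a single vector and a matrix. The last step converts the matrix to a data frame and restores the column names.

```r
# Hypothetical data with three numeric value columns
dat <- data.frame(count  = c(2, 1),
                  value1 = c(10, 20),
                  value2 = c(11, 21),
                  value3 = c(12, 22))
numberofvaluecolumns <- 3

# count is recycled across the concatenated value columns
v <- do.call("c", mapply(rep, c(dat$value1, dat$value2, dat$value3),
                         dat$count, SIMPLIFY = FALSE))

# Each row of the byrow matrix is one expanded column; transpose to get columns
m <- t(matrix(v, numberofvaluecolumns, byrow = TRUE))

# Convert to a data frame and restore the column names
res <- as.data.frame(m)
names(res) <- c("value1", "value2", "value3")
res
```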

Jota