Tidy data: create row for each individual, based on 'count' variable

Question

I have a data frame formatted much like the example data frame df1 given below. There are three columns: two categorical variables and a 'Count' column specifying the number of objects with that specific combination.

I want to convert this data frame to the format shown in example data frame df2. Instead of a 'Count' column, each object is simply given on a separate row.

I have tried a few things with the dplyr and tidyr packages, but I am not yet very well-versed in R. What would be a good way to do this?

set.seed(1)
x1 <- c("Pants", "Shoes", "Scarf")
x2 <- c("Ugly", "Beautiful")
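# Note: R >= 3.6.0 changed the default sampling algorithm, so these counts
# may differ from the ones used in df2 below (3, 4, 6, 10, 3, 9).
x3 <- sample(1:10, size=6, replace=TRUE)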

df1 <- data.frame(Object=rep(x1, 2),
                  Quality=rep(x2, each=3),
                  Count=x3)
df1; sum(df1[,3])

df2 <- data.frame(Object=c(rep("Pants", 3), rep("Shoes", 4), rep("Scarf", 6), 
                           rep("Pants", 10), rep("Shoes", 3), rep("Scarf", 9)),
                  Quality=c(rep("Ugly", 3), rep("Ugly", 4), rep("Ugly", 6), 
                            rep("Beautiful", 10), rep("Beautiful", 3), 
                            rep("Beautiful", 9))
                 )
head(df2); tail(df2)

You could use base R, i.e. `df1[rep(1:nrow(df1), df1$Count),-3]` — akrun, Apr 19 '15 at 09:39
Thank you! This is what I was looking for, though I prefer Ananda's solution as it is easier to read the code that way. — Maarten, Apr 19 '15 at 10:06
Of course, @akrun's suggestion is almost exactly what is in `expandRows`, but `expandRows` has been created to be a little bit more general. — A5C1D2H2I1M1N2O1R2T1, Apr 19 '15 at 13:51

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2015-04-19T09:35:25.727

If you want to consider other packages, you can try expandRows from my "splitstackshape" package.

Usage would be:

> library(splitstackshape)
> df2 <- expandRows(df1, "Count")

> head(df2)
    Object Quality
1    Pants    Ugly
1.1  Pants    Ugly
1.2  Pants    Ugly
2    Shoes    Ugly
2.1  Shoes    Ugly
2.2  Shoes    Ugly
> tail(df2)
    Object   Quality
6.3  Scarf Beautiful
6.4  Scarf Beautiful
6.5  Scarf Beautiful
6.6  Scarf Beautiful
6.7  Scarf Beautiful
6.8  Scarf Beautiful
> nrow(expandRows(df1, "Count"))
[1] 35
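
For comparison, the base R indexing approach mentioned in the comments can be sketched as follows. This is only an illustration, not the internals of `expandRows`: the counts are written out explicitly (taken from the df2 example in the question) so the result does not depend on the random seed, and the names `idx` and `df2_base` are made up for this sketch.

```r
# Same structure as df1 in the question, with the counts fixed by hand.
df1 <- data.frame(Object  = rep(c("Pants", "Shoes", "Scarf"), 2),
                  Quality = rep(c("Ugly", "Beautiful"), each = 3),
                  Count   = c(3, 4, 6, 10, 3, 9))

# Repeat each row index as many times as its Count says ...
idx <- rep(seq_len(nrow(df1)), times = df1$Count)

# ... then pick those rows and keep only the two categorical columns.
df2_base <- df1[idx, c("Object", "Quality")]

# Optional: replace row names such as 1, 1.1, 1.2 with plain 1..n.
rownames(df2_base) <- NULL

# One row per object, so the row count should equal the total count.
nrow(df2_base) == sum(df1$Count)
```

Selecting the columns by name rather than with `-3` keeps the code working if the column order changes.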
