How do I create a dataset with all entries in each row in R?

Question

Say I have a large dataset in which each row records a type of entry and the number of occurrences of that type of entry.

For example:

   Area   Animal   Observations
   US     Cat      4
   NE     Cat      9
   US     Dog      2

My question is: how would I create a dataset (for analysis in R) that lists each observation on its own row, like this?

   Area        Animal      
    US            Cat
    US            Cat
    US            Cat...
    US
    NE
    NE
    NE
    NE....
    US..          Dog..

I'm asking because I have a large dataset and I'm trying to get one row per entry, rather than having the entries grouped. Does anyone know how to do this?

score 1 · Answer 1 · answered Jun 23 '15 at 21:18

Try expandRows() from the splitstackshape package (df1 is the data frame from the question):

library(splitstackshape)
expandRows(df1, 'Observations')
#   Area Animal
#1     US    Cat
#1.1   US    Cat
#1.2   US    Cat
#1.3   US    Cat
#2     NE    Cat
#2.1   NE    Cat
#2.2   NE    Cat
#2.3   NE    Cat
#2.4   NE    Cat
#2.5   NE    Cat
#2.6   NE    Cat
#2.7   NE    Cat
#2.8   NE    Cat
#3     US    Dog
#3.1   US    Dog
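
The call above assumes a data frame named df1 that already holds the question's table. A minimal sketch for reproducing it, assuming the three example rows are the whole input; the requireNamespace() guard and the name `expanded` are additions here, so the snippet still runs when splitstackshape is not installed:

```r
# Rebuild the example table from the question (assumed to be the full input).
df1 <- data.frame(
  Area = c("US", "NE", "US"),
  Animal = c("Cat", "Cat", "Dog"),
  Observations = c(4, 9, 2)
)

# Only call expandRows() if the package is available.
if (requireNamespace("splitstackshape", quietly = TRUE)) {
  expanded <- splitstackshape::expandRows(df1, "Observations")
  # One output row per observation, so the row count equals the total count.
  stopifnot(nrow(expanded) == sum(df1$Observations))
}
```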

akrun

Incredible! Thank you so much for being so quick! – Timothy Jun 23 '15 at 21:30

score 1 · Answer 2 · answered Jun 23 '15 at 21:30

Index the dataframe by 'rownames' repeated as many times as 'Observations':

> rep(rownames(dat), dat$Observations)
 [1] "1" "1" "1" "1" "2" "2" "2" "2" "2" "2" "2" "2" "2" "3" "3"

> dat[ rep(rownames(dat), dat$Observations) , ]
    Area Animal Observations
1     US    Cat            4
1.1   US    Cat            4
1.2   US    Cat            4
1.3   US    Cat            4
2     NE    Cat            9
2.1   NE    Cat            9
2.2   NE    Cat            9
2.3   NE    Cat            9
2.4   NE    Cat            9
2.5   NE    Cat            9
2.6   NE    Cat            9
2.7   NE    Cat            9
2.8   NE    Cat            9
3     US    Dog            2
3.1   US    Dog            2
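
Note that the result above still carries the Observations column and the dotted row names. A possible variation, sketched here under the assumption that `dat` holds the question's example table, indexes by integer row position instead, keeps only the two descriptive columns, and resets the row names (`idx` and `expanded` are names introduced for illustration):

```r
# Example data from the question (assumed to be the full input).
dat <- data.frame(
  Area = c("US", "NE", "US"),
  Animal = c("Cat", "Cat", "Dog"),
  Observations = c(4, 9, 2)
)

# Repeat each row position as many times as its observation count.
idx <- rep(seq_len(nrow(dat)), times = dat$Observations)

# Select the repeated rows, dropping the now-redundant count column.
expanded <- dat[idx, c("Area", "Animal")]

# Replace the "1", "1.1", ... row names with a plain 1..n sequence.
rownames(expanded) <- NULL
```

Indexing by position rather than by row name avoids relying on the row names being the default "1", "2", "3".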

score 1 · Answer 3 · answered Jun 23 '15 at 21:31

Here's an approach using lapply() and rep():

df <- data.frame(Area = c('US', 'NE', 'US'), Animal = c('Cat', 'Cat', 'Dog'), Observations = c(4, 9, 2))
as.data.frame(lapply(df[-3], rep, df[, 3]))
##    Area Animal
## 1    US    Cat
## 2    US    Cat
## 3    US    Cat
## 4    US    Cat
## 5    NE    Cat
## 6    NE    Cat
## 7    NE    Cat
## 8    NE    Cat
## 9    NE    Cat
## 10   NE    Cat
## 11   NE    Cat
## 12   NE    Cat
## 13   NE    Cat
## 14   US    Dog
## 15   US    Dog
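
One way to sanity-check any of these expansions is to count the expanded rows back up and compare with the original table. A rough sketch in base R, assuming the same example data; `expanded`, `counts`, `merged`, and the helper column `n` are names introduced here for illustration:

```r
# Example data from the question (assumed to be the full input).
df <- data.frame(
  Area = c("US", "NE", "US"),
  Animal = c("Cat", "Cat", "Dog"),
  Observations = c(4, 9, 2)
)

# Expand: repeat every non-count column according to the counts.
expanded <- as.data.frame(lapply(df[-3], rep, times = df[[3]]))

# Collapse again: add a column of ones and sum it per Area/Animal pair.
counts <- aggregate(n ~ Area + Animal, data = transform(expanded, n = 1L), FUN = sum)

# Join the recomputed counts to the original table on Area and Animal.
merged <- merge(df, counts)

# Every original count should be recovered exactly.
stopifnot(all(merged$Observations == merged$n))
```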
