I have a data frame that looks like this:

index   ID   date              Amount
2       1001 2010-06-08         0
21      1001 2010-10-08        10
6       1002 2010-08-16        30
5       1002 2010-11-25        20
9       1003 2010-01-01         0
8       1003 2011-03-06        10
12      1004 2012-03-12        10
11      1004 2012-06-21        10
15      1005 2010-01-01        30
13      1005 2010-04-06        20

I want to subset this data so that I get a new data frame for each ID, like this:

index   ID   date              Amount
2       1001 2010-06-08         0
21      1001 2010-10-08        10

And

6       1002 2010-08-16        30
5       1002 2010-11-25        20

And so on.

I don't need to save the new data frames; I only need them to perform some basic calculations. I also want to do this on my entire table, which has more than 10000 IDs, hence the need for a loop. I tried this:

    temp <- data.frame(Numb=c(),Dt=c(),Amt=c())
    for (i in seq_along(stNew$ID)){
       temp[i,] <- subset(stNew, stNew[i,]==stNew$ID[i])
    }

But that didn't work. Any suggestions?

Bala Deshpande
  • Hi and welcome to SO! My spontaneous suggestion is that you should try to search SO (and elsewhere) for an answer. To perform something per group in a dataframe is one of the most commonly asked questions on SO, and you will surely find some nice answers you can adapt to your own data. [This](http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega) and [this](http://lamages.blogspot.se/2012/01/say-it-in-r-with-by-apply-and-friends.html) may get you started. Cheers. – Henrik Nov 09 '13 at 20:00
  • Henrik - thanks. As a matter of fact I did search and found a couple of answers which were extremely useful. Thanks for your links as well. – Bala Deshpande Nov 10 '13 at 22:42
  • Great! Thus, no need for splitting or subsetting your data frame. – Henrik Nov 10 '13 at 22:45
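
As a rough illustration of the point made in the comments (per-group calculations without splitting first), here is a minimal sketch using base R's aggregate and tapply. The rows are copied from the question and the name stNew follows the asker's code; the choice of sum and row count as the "basic calculations" is only an assumption.

```r
# Sample rows copied from the question (first three IDs only).
stNew <- data.frame(
  ID     = c(1001, 1001, 1002, 1002, 1003, 1003),
  date   = as.Date(c("2010-06-08", "2010-10-08", "2010-08-16",
                     "2010-11-25", "2010-01-01", "2011-03-06")),
  Amount = c(0, 10, 30, 20, 0, 10)
)

# Total Amount per ID, computed directly on the full data frame
# (the sum is an assumed example calculation).
totals <- aggregate(Amount ~ ID, data = stNew, FUN = sum)

# Number of rows per ID, using tapply on a single column.
n_rows <- tapply(stNew$Amount, stNew$ID, length)
```

Both calls return one value per ID, so no intermediate data frames need to be created or stored.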

3 Answers

13

Take a look at the list2env and split functions. Here are some examples using the iris dataset.

In two steps:

list_df <- split(iris, iris$Species) # split the dataset into a list of datasets based on the value of iris$Species
list2env(list_df, envir = .GlobalEnv) # turn each list element into a separate dataset in the global environment

In one step:

list2env(split(iris, iris$Species), envir = .GlobalEnv)

Or you can assign custom names for the new datasets with a for loop:

iris_split <- split(iris, iris$Species)
new_names <- c("one", "two", "three")
for (i in seq_along(iris_split)) {
  assign(new_names[i], iris_split[[i]])
}
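
Since the question says the per-ID data frames do not need to be saved, it may be simpler to keep the result of split as a list and apply a function to each element instead of creating one global variable per ID. The following is a minimal sketch under that assumption; stNew and its rows come from the question, while summarise_one and the two calculations it performs are hypothetical.

```r
# Sample rows copied from the question (first two IDs only).
stNew <- data.frame(
  ID     = c(1001, 1001, 1002, 1002),
  date   = as.Date(c("2010-06-08", "2010-10-08",
                     "2010-08-16", "2010-11-25")),
  Amount = c(0, 10, 30, 20)
)

# One data frame per ID, kept together in a named list.
by_id <- split(stNew, stNew$ID)

# Hypothetical per-group calculation: days between the first and
# last date, and the total Amount.
summarise_one <- function(d) {
  data.frame(
    ID    = d$ID[1],
    days  = as.numeric(diff(range(d$date)), units = "days"),
    total = sum(d$Amount)
  )
}

# Apply the function to every group and stack the one-row results.
result <- do.call(rbind, lapply(by_id, summarise_one))
```

Individual groups remain accessible by name, for example by_id[["1001"]], without anything being written to the global environment.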

OB83
8

Maybe something like this:

    IDs <- unique(df$ID)
    for (i in seq_along(IDs)) {
      temp <- df[df$ID == IDs[i], ]
      # more things to do with temp
    }
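
To keep the results of such a loop, one option is to preallocate a vector and fill it on each iteration. This is only a sketch: the small df below reuses a few values from the question, and taking the mean of Amount is an assumed stand-in for the "more things to do with temp".

```r
# Small example data; ID and Amount values taken from the question.
df <- data.frame(
  ID     = c(1001, 1001, 1002, 1002),
  Amount = c(0, 10, 30, 20)
)

IDs <- unique(df$ID)

# Preallocate one result slot per ID and label each slot with its ID.
mean_amount <- numeric(length(IDs))
names(mean_amount) <- IDs

for (i in seq_along(IDs)) {
  temp <- df[df$ID == IDs[i], ]         # rows for the current ID
  mean_amount[i] <- mean(temp$Amount)   # assumed example calculation
}
```

Preallocating avoids growing the result inside the loop, which tends to matter once there are thousands of IDs.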
Ananta
6

    iris_split <- split(iris, iris$Species)

You can also assign the data frame names dynamically:

    # names(iris_split) holds the group labels in the same order as the list
    new_names <- names(iris_split)

    for (i in seq_along(iris_split)) {
      assign(new_names[i], iris_split[[i]])
    }