
I searched but couldn't find a similar question, so apologies in advance if this is a duplicate. I am trying to generate data frames from within a for loop in R.

What I want to do:

  1. Define the columns of each data frame by a function,
  2. Generate n data frames (n being the length of my sequence of data frames) using a loop.

As an example I will use n = 100:

    n<-100 
    k<-8 
    d1 <- data.frame()
    for(i in 1:(k)) {d1 <- rbind(d1,c(a="i+1",b="i-1",c="i/1"))}
    d2 <- data.frame()
    for(i in 1:(k+2)) {d2 <- rbind(d2,c(a="i+2",b="i-2",c="i/2"))}
    ...

    d100 <- data.frame()
    for(i in 1:(k+100)) {d100 <- rbind(d100,c(i+100, i-100, i/100))}

Clearly it would be tedious to construct each data frame one by one. I tried this:

    d<-list()
    for(j in 1:100) {
        d[j] <- data.frame()
        for(i in 1:(k+j)) {d[j] <- rbind(d[j] ,c(i+j, i-j, i/j))}
    }

But I can't really do anything with it; I run into an error:

    Error in d[j] <- data.frame() : replacement has length zero
    In addition: Warning messages:
    1: In d[j] <- rbind(d[j], c(i + j, i - j, i/j)) :
      number of items to replace is not a multiple of replacement length

A few more remarks about my example:

  1. the number of rows is not the same in each data frame: d1 has 8 rows, d2 has 10 rows, and d100 has 8+100 rows,
  2. the algorithm should give us: D=(d1,d2, ... ,d100).

It would be great to get one answer using the same approach (rbind) and one using a more idiomatic base R approach. Both will aid my understanding. Of course, please point out where I'm going wrong if it's obvious.

shadow
Laura Esly

1 Answer


Here's how to create an empty data.frame (and it's not what you are trying): see the question "Create an empty data.frame".
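As for the error itself: `d[j] <- data.frame()` uses single brackets, which try to replace a sub-list with a zero-length value, hence "replacement has length zero". Storing a whole data frame as one list element needs `d[[j]]`. Here is a minimal sketch of the rbind approach with that changed. It assumes every data frame j should have k + j rows (as in the loop attempt), and it uses a small n so it runs quickly; the variable `dj` and the names `d1`, `d2`, ... are only illustrative.

```r
k <- 8
n <- 5  # assumption: small n for illustration; the question uses 100

d <- vector("list", n)  # pre-allocate a list with n slots
for (j in seq_len(n)) {
  # typed empty data frame: zero rows, three numeric columns
  dj <- data.frame(a = numeric(0), b = numeric(0), c = numeric(0))
  for (i in seq_len(k + j)) {
    # append one row at a time (simple, but slow for large loops)
    dj <- rbind(dj, data.frame(a = i + j, b = i - j, c = i / j))
  }
  d[[j]] <- dj  # double brackets store the whole data frame as element j
}
names(d) <- paste0("d", seq_len(n))

sapply(d, nrow)  # row count of each data frame
```

Growing a data frame row by row copies it on every iteration, which is why this gets slow as the loops get longer.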

And you should not be creating 100 separate dataframes but rather a list of dataframes. I would not do it with rbind, since that would be very slow. Instead I would create them with a function that returns a dataframe of the required structure:

    make_df <- function(n,var) {data.frame( a=(1:n)+var,b=(1:n)-var,c=(1:n)/var) }

    mylist <- setNames(
                lapply(1:100, function(n) make_df(n,n)) ,  # the dataframes
                paste0("d_", 1:100))   # the names for access

    head(mylist,3)
    #---------------
    $d_1
      a b c
    1 2 0 1

    $d_2
      a  b   c
    1 3 -1 0.5
    2 4  0 1.0

    $d_3
      a  b         c
    1 4 -2 0.3333333
    2 5 -1 0.6666667
    3 6  0 1.0000000

Then if you want the "d_40" dataframe it's just:

    mylist[[ "d_40" ]]

Or

    mylist$d_40

If you want to perform the same operation on all of them, or get a result from all of them at once, just use lapply:

    lapply(mylist, nrow)  # will be a list

Or:

    sapply(mylist, nrow)  # will be a vector because each result has the same length
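For comparison, here is a sketch of the same construction written as an explicit for loop. It should give the same list as the `lapply` call; the names `mylist_loop` and `row_counts` are just illustrative.

```r
make_df <- function(n, var) {
  data.frame(a = (1:n) + var, b = (1:n) - var, c = (1:n) / var)
}

mylist_loop <- vector("list", 100)   # pre-allocate 100 slots
for (j in 1:100) {
  mylist_loop[[j]] <- make_df(j, j)  # note the double brackets
}
names(mylist_loop) <- paste0("d_", 1:100)

# vapply is like sapply, but the type of each result is stated up front
row_counts <- vapply(mylist_loop, nrow, integer(1))
```

The loop fills a pre-allocated list rather than growing anything with rbind, so it avoids the repeated copying.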
IRTFM
  • Nice! For a beginner (as the OP very likely is), also including the loop version of the `lapply` might make it clearer. Also, `make_df` could predefine the `1:n` thing. – Frank Feb 20 '15 at 23:43
  • I try to offer the "R-way". It is admittedly no faster than for-loops but whenever we can teach people methods to avoid "Fortran coding" I see it as an opportunity. – IRTFM Feb 20 '15 at 23:51