Here is a simple made-up data set:

df1 <- data.frame(x = c(1,2,3),
            y = c(4,6,8),
            z= c(1, 6, 7))

df2 <- data.frame(x = c(3,5,6),
              y = c(3,4,9),
              z= c(6, 7, 7))

What I want to do is create a new variable "a" that is just the sum of the three variables (x, y, z).

Instead of doing this separately for each data frame, I thought it would be more efficient to write a loop. Here is the code I wrote:

my.list<- list(df1, df2)

for (i in 1:2) {
my.list[i]$a<- my.list[i]$x +my.list[i]$y + my.list[i]$z

}

or alternatively

for (i in 1:2) {
my.list[i]<- transform(my.list[i], a= x+ y+ z)

}

In both cases it does not work and the error "number of items to replace is not a multiple of replacement length" is returned.

What would be the best way to write a loop that goes through several data frames?

w.kye

2 Answers

See ?Extract:

Recursive (list-like) objects

Indexing by [ is similar to atomic vectors and selects a list of the specified element(s).

Both [[ and $ select a single element of the list.

In short, my.list[i] returns a list of length 1, and you are trying to assign it a data.frame, so that doesn't work; whereas my.list[[i]] returns the data.frame #i in your list, which you can replace with a data.frame.
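
A quick way to see the difference, as a minimal sketch that reuses the question's data:

```r
df1 <- data.frame(x = c(1, 2, 3), y = c(4, 6, 8), z = c(1, 6, 7))
df2 <- data.frame(x = c(3, 5, 6), y = c(3, 4, 9), z = c(6, 7, 7))
my.list <- list(df1, df2)

class(my.list[1])    # "list": a sub-list of length 1
class(my.list[[1]])  # "data.frame": the element itself
my.list[1]$x         # NULL: the sub-list has no element named x
my.list[[1]]$x       # 1 2 3
```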

So you can use either:

for (i in 1:2) {
  my.list[[i]]$a<- my.list[[i]]$x +my.list[[i]]$y + my.list[[i]]$z

}

or

for (i in 1:2) {
  my.list[[i]]<- transform(my.list[[i]], a= x+ y+ z)

}

But it would be even simpler to use lapply, where you don't need [[:

# the function must end with df so that the whole data.frame is returned
my.list <- lapply(my.list, function(df) { df$a <- df$x + df$y + df$z; df })
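
One thing to check with this approach: lapply collects whatever each call returns, and an assignment such as df$a <- ... evaluates to the assigned value (the new column), not to the data frame. So the data frame should be the last expression in the function. A minimal sketch with the question's data, where add_a is just a name made up for this illustration:

```r
df1 <- data.frame(x = c(1, 2, 3), y = c(4, 6, 8), z = c(1, 6, 7))
df2 <- data.frame(x = c(3, 5, 6), y = c(3, 4, 9), z = c(6, 7, 7))
my.list <- list(df1, df2)

# add_a is a hypothetical helper name, not part of any package
add_a <- function(df) {
  df$a <- df$x + df$y + df$z
  df  # return the modified data frame, not just the new column
}

my.list <- lapply(my.list, add_a)
my.list[[1]]$a  # 6 14 18
my.list[[2]]$a  # 12 16 22
```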
scoa

Rather than using an explicit loop to extract the data.frames from the list, just use lapply. It takes a list of data.frames (or any object) and a function, applies the function to every element of the list, and returns a list with the results.

# Sample data
df1 <- data.frame(x = c(1,2,3), y = c(4,6,8), z = c(1, 6, 7))
df2 <- data.frame(x = c(3,5,6), y = c(3,4,9), z = c(6, 7, 7))

# Put them in a list
df_list <- list(df1, df2)

# Use lapply to iterate. FUN takes the function you want, and
# then its arguments (a = x + y + z) are just listed after it.
result_list <- lapply(df_list, FUN = transform, a = x + y + z)
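
If you want to pull individual data frames back out afterwards, naming the list elements may help, since lapply keeps the names of its input. A small sketch under that assumption (the element names df1 and df2 are chosen here for illustration, and transform is wrapped in an anonymous function only to make the call explicit):

```r
df1 <- data.frame(x = c(1, 2, 3), y = c(4, 6, 8), z = c(1, 6, 7))
df2 <- data.frame(x = c(3, 5, 6), y = c(3, 4, 9), z = c(6, 7, 7))

# A named list: the names carry over to the result of lapply
df_list <- list(df1 = df1, df2 = df2)
result_list <- lapply(df_list, function(df) transform(df, a = x + y + z))

names(result_list)  # "df1" "df2"
result_list$df2$a   # 12 16 22
```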
Matt Parker