
I have multiple data frames, and for each one I want to calculate the sum of specific columns and store it in a new column of that data frame. I'm not sure how to go about it. So far, I can run a for loop for a single data frame:

for (i in nrow(df1)){df1$newcolumn <-(df1$a + df1$b + df1$c)}

But if I have multiple data frames (df1,df2,df3,...), how do I do this? The column names are the same for each dataframe.

Thank you!

Becky
  • Use a [list-of-frames](https://stackoverflow.com/a/24376207/3358272), and then do `updated_list_of_frames <- lapply(list_of_frames, function(single_frame) ...)` – r2evans Jun 07 '20 at 04:00

3 Answers


If your data frames are named df1, df2, etc., you can match that name pattern with ls, collect the data frames into a list with mget, and add a new column to each one with transform.

new_data <- lapply(mget(ls(pattern = 'df\\d+')), function(df) 
                   transform(df, newcolumn = a + b + c))

This returns a list of data frames. If you want them as individual data frames again, use list2env.

list2env(new_data, .GlobalEnv)
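As a minimal self-contained sketch of the same pattern: the sample values and the scratch environment `e` below are assumptions for illustration (they are not from the question), and `e` is used only so that nothing in the global workspace is overwritten.

```r
# Hypothetical sample data with the columns a, b, c from the question,
# stored in a scratch environment instead of the global one
e <- new.env()
e$df1 <- data.frame(a = 1:3, b = 4:6, c = 7:9)
e$df2 <- data.frame(a = c(10, 20), b = c(1, 2), c = c(0.5, 0.5))

# Collect every object in `e` named df<digits> into a named list
frames <- mget(ls(envir = e, pattern = "^df\\d+$"), envir = e)

# Add the row-wise sum of a, b and c to each data frame
new_data <- lapply(frames, function(df) transform(df, newcolumn = a + b + c))

# Write the updated data frames back under their original names
invisible(list2env(new_data, envir = e))

e$df1$newcolumn
```

Anchoring the pattern with `^` and `$` avoids picking up unrelated objects whose names merely contain `df` followed by a digit.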
Ronak Shah

Two other approaches.

# create example data
df1 <- df2 <- data.frame(x=1:4, y=1:4)

# put into a list
l <- list(df1, df2)

# iterate over the list with a for loop
for(i in seq_along(l)){
  l[[i]]$new_column <- l[[i]]$x + l[[i]]$y
}

# same result as above, but using `lapply()` and an anonymous function
# this requires the `dplyr` package
lapply(l, function(j) dplyr::mutate(j, new_column = x + y))

Both approaches produce:

[[1]]
  x y new_column
1 1 1          2
2 2 2          4
3 3 3          6
4 4 4          8

[[2]]
  x y new_column
1 1 1          2
2 2 2          4
3 3 3          6
4 4 4          8

As shown above, to access an individual list element (a data.frame in this example), use double bracket notation ([[):

> l[[1]]
  x y new_column
1 1 1          2
2 2 2          4
3 3 3          6
4 4 4          8
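If the list is named, elements can also be retrieved by name rather than by position. The following is a rough sketch under that assumption; the names `df1` and `df2` given to the list elements, and the use of `rowSums()` in place of `x + y`, are illustrative choices rather than part of the approaches above.

```r
# Same example data as above
df1 <- df2 <- data.frame(x = 1:4, y = 1:4)

# Name the list elements so they can be looked up by name later
l <- list(df1 = df1, df2 = df2)

# Add the new column to every data frame; rowSums() sums the chosen columns
l <- lapply(l, function(d) {
  d$new_column <- rowSums(d[, c("x", "y")])
  d
})

names(l)
l[["df2"]]$new_column
l$df1$new_column
```

With `rowSums()`, the columns to add up can be supplied as a character vector, which may be convenient when there are many of them.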
Rich Pauloo

With the tidyverse, we can do:

library(dplyr)
library(purrr)
new_data <- mget(ls(pattern = '^df\\d+$')) %>%
        map(~ .x %>%
                  mutate(newcolumn = a + b + c))

If we need individual datasets again, we can write them back to the global environment:

list2env(new_data, .GlobalEnv)
akrun