I feel like this should have a really simple/elegant solution, but I just can't find it. (I'm relatively new to R, so that's no surprise.)
I have a (large) nested list containing data.frames that I'm trying to add together. Here is code to create some sample data:
# Create data frames nested in a list
for (i in 1:6) {
  for (j in 1:4) {
    assign(paste0("v", j), sample.int(100, 4))
  }
  assign(paste0("df", i), list(cbind(v1, v2, v3, v4)))
}
inner1 <- list(data1 = df1, data2 = df2)
inner2 <- list(data1 = df3, data2 = df4)
inner3 <- list(data1 = df5, data2 = df6)
outer <- list(group1 = inner1, group2 = inner2, group3 = inner3)
I need to add all the data frames labeled data1 together and all the data2's together. If they weren't in this nested list format, I'd do this:
data1.tot <- df1 + df3 + df5
data2.tot <- df2 + df4 + df6
Because they are in a list, I thought there might be an lapply solution and tried:
grp <- c("group1", "group2", "group3") #vector of groups to sum across
datas <- lapply(outer, "[[", "data1") #select "data1" from all groups
tot.datas <- lapply(datas[grp], "+") #to sum across selected data
#I know these last two steps can be combined into one but it helps me keep everything straight to separate them
But it returns Error in FUN(left): invalid argument to unary operator because I'm passing the list of datas as x.
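To illustrate where the error comes from: lapply calls its function with a single argument per element, so "+" is applied as a unary operator, and each element here is a list wrapping a matrix rather than something numeric. A minimal sketch (the names `m`, `wrapped`, `msg`, and `total` are made up for illustration and are not part of my real data):

```r
m <- cbind(v1 = 1:2, v2 = 3:4)
wrapped <- list(m)   # same shape as df1..df6 in the sample data: a list holding one matrix

# lapply(X, "+") calls `+`(X[[i]]) with one argument, i.e. unary plus.
# Unary plus on a list is not defined, so this raises the error.
msg <- tryCatch(
  lapply(list(wrapped), "+"),
  error = function(e) conditionMessage(e)
)
print(msg)

# Binary `+` needs two operands and works once the matrices are unwrapped.
total <- wrapped[[1]] + wrapped[[1]]
print(total)
```

So the addition itself is fine; the problem is that lapply never hands `+` a second operand.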
I've also looked at other solutions like this one: Adding selected data frames together, from a list of data frames
But the nested structure of my data makes me unsure of how to translate that solution to my problem.
And just to note, the data I'm working with are GHCN Daily data, so the structure is not my design. Any help would be greatly appreciated.
UPDATE:
I've partially figured out a fix using the suggestion of Reduce by @Parfait, but now I need to automate it. I'm working on a solution using a for loop because that gives me more control over the elements I'm accessing, but I'm open to other ideas. Here is the manual solution that works:
get.df <- function(x, y, z) {
  # function to pull out the desired data.frame from the list
  # x included as argument to make function applicable to my real data
  output <- x[[y]][[z]]
  output[[1]]
}
output1 <- get.df(x = outer, y = "group1", z = "data1")
output2 <- get.df(x = outer, y = "group2", z = "data1")
data1 <- list(output1, output2)
data1.tot <- Reduce(`+`, data1)
Using my sample data, I'd like to loop this over 2 data types ("data1" and "data2") and 3 groups ("group1", "group2", "group3"). I'm working on a for loop solution, but struggling with how to save output1 and output2 in a list. My loop looks like this right now:
dat <- c("data1", "data2")
grp <- c("group1", "group2", "group3")
for (i in 1:length(dat)) {
  for (j in 1:length(grp)) {
    assign(paste0("out", j), get.df(x = outer, y = grp[j], z = dat[i]))
  }
  list(??? #clearly this is where I'm stuck!
}
Any suggestions, either on the for loop problem or for a better method?
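For reference, here is a rough sketch of one possible direction, not a definitive answer: instead of assign(), fill a pre-allocated list by index inside the loop and then pass that list to Reduce. The helper `make_leaf` and the result names `totals`, `totals2`, and `pieces` are hypothetical; `make_leaf` only rebuilds sample data shaped like the list(cbind(v1, v2, v3, v4)) objects above so the snippet runs on its own.

```r
# Hypothetical helper: one leaf shaped like list(cbind(v1, v2, v3, v4))
make_leaf <- function() {
  list(matrix(sample.int(100, 16), nrow = 4,
              dimnames = list(NULL, paste0("v", 1:4))))
}

dat <- c("data1", "data2")
grp <- c("group1", "group2", "group3")

# Rebuild the nested sample structure: outer$groupN$dataM[[1]] is a matrix
outer <- lapply(setNames(grp, grp), function(g) {
  list(data1 = make_leaf(), data2 = make_leaf())
})

get.df <- function(x, y, z) {
  output <- x[[y]][[z]]
  output[[1]]
}

# for-loop version: store each piece in a list slot instead of using assign()
totals <- vector("list", length(dat))
names(totals) <- dat
for (i in seq_along(dat)) {
  pieces <- vector("list", length(grp))
  for (j in seq_along(grp)) {
    pieces[[j]] <- get.df(x = outer, y = grp[j], z = dat[i])
  }
  totals[[i]] <- Reduce(`+`, pieces)   # element-wise sum across groups
}

# The same idea with nested lapply calls
totals2 <- lapply(setNames(dat, dat), function(d) {
  Reduce(`+`, lapply(grp, function(g) get.df(outer, g, d)))
})
```

With this, totals$data1 would hold the sum across the three groups for "data1", and changing `grp` restricts the sum to a subset of groups. Whether this scales to the real data depends on every selected matrix having the same dimensions, which is an assumption here.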