I previously asked how to create all possible combinations of a set of data frames (the "power set" of data frames) in this question: Creating Dataframes of all Possible Combinations without Repetition of Columns with cbind
I was able to create the list of possible data frames by first creating all possible combinations of the names of the data frames and storing them in Ccols, a section of which looks like this:
Using Reduce and lapply, I then fetched each data frame by its name and stored them in lists, then stored all those lists in a list of lists to calculate the means and covariances:
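ll_cov <- list()
ll_ER <- list()
for (ii in 2:length(Ccols)) {
  l_cov <- list()
  l_ER <- list()
  # Each column of Ccols[[ii]] holds one combination of data frame names
  for (index in 1:ncol(Ccols[[ii]])) {
    ls <- list()
    # Fetch each data frame in this combination by its name
    for (i in 1:length(Ccols[[ii]][, index])) {
      KK <- get(Ccols[[ii]][i, index])
      ls[[i]] <- KK
    }
    # Merge on row names, then restore them as the row names of the result
    DAT <- transform(Reduce(merge, lapply(ls, function(x) data.frame(x, rn = row.names(x)))), row.names = rn, rn = NULL)
    l_cov[[index]] <- cov(DAT)
    l_ER[[index]] <- colMeans(DAT)
  }
  ll_cov[[ii]] <- l_cov
  ll_ER[[ii]] <- l_ER
}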
However, the loop is becoming too time-consuming because of the large number of data frames being processed and the cov and colMeans calculations. I searched and came across this example (Looping over a list of data frames and calculate the correlation coefficient), which suggests putting the data frames in a list and then applying cov as a function, but it is still running far too slowly. I tried removing one of the loops by replacing the outermost loop with lapply:
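Power_f <- function(X) {
  l_D <- list()
  # Note: this starts at column 2, whereas the loop above starts at column 1
  for (index in 2:ncol(X)) {
    ls <- list()
    # Fetch each data frame in this combination by its name
    for (i in 1:length(X[, index])) {
      KK <- get(X[i, index])
      ls[[i]] <- KK
    }
    # Merge on row names, then restore them as the row names of the result
    DAT <- transform(Reduce(merge, lapply(ls, function(x) data.frame(x, rn = row.names(x)))), row.names = rn, rn = NULL)
    l_D[[index]] <- DAT
  }
  return(l_D)
}
lapply(seq(from = 2, to = length(Ccols)), function(i) Power_f(Ccols[[i]]))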
But it is still taking too long to run (I am not getting any results). Is there a way to replace all the for loops with lapply and make this computationally efficient?
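To make the structure concrete, here is a minimal, self-contained sketch of the same computation written with nested lapply calls on toy data. It has not been benchmarked and is not claimed to be faster. Everything named here is an assumption for illustration: the toy data frames A, B and C, the helper merge_by_rownames, and the guess that Ccols[[k]] is the character matrix returned by combn(names, k). It also swaps get() for a lookup in a named list (dfs_all), which avoids searching environments by name.

```r
# Toy data frames sharing the same row names (hypothetical stand-ins)
set.seed(1)
rn <- paste0("r", 1:5)
dfs_all <- list(
  A = data.frame(a1 = rnorm(5), a2 = rnorm(5), row.names = rn),
  B = data.frame(b1 = rnorm(5), row.names = rn),
  C = data.frame(c1 = rnorm(5), c2 = rnorm(5), row.names = rn)
)

# Assumption: Ccols[[k]] is a k-row matrix of names, one combination per column
Ccols <- lapply(seq_along(dfs_all), function(k) combn(names(dfs_all), k))

# Hypothetical helper: merge the data frames named in `nms` on their row names
merge_by_rownames <- function(nms) {
  with_rn <- lapply(dfs_all[nms], function(x) data.frame(x, rn = row.names(x)))
  merged <- Reduce(function(x, y) merge(x, y, by = "rn"), with_rn)
  row.names(merged) <- as.character(merged$rn)
  merged[, names(merged) != "rn", drop = FALSE]
}

# Nested lapply mirroring the two outer loops; Ccols[-1] skips size-1 combinations
ll_stats <- lapply(Ccols[-1], function(m) {
  lapply(seq_len(ncol(m)), function(j) {
    DAT <- merge_by_rownames(m[, j])
    list(cov = cov(DAT), ER = colMeans(DAT))
  })
})
```

Here ll_stats[[1]] holds the results for all pairs and ll_stats[[2]] the result for the single triple, each element being a list with the covariance matrix (cov) and the column means (ER). If the row names of all data frames are already identical and in the same order, the merge step could in principle be replaced by a plain cbind, but that depends on the actual data.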