Let me start by providing a synthetic data set that shows the issue:
Do <- rep(c(0,2,4,6,8,10,15,20,30,40,45,50,55,60,65,70,80,85,90,92,94,96,98,100), each=16,times=16)
Cl <- rep(c("K", "Y","M","C"), each= 384, times=4)
In <- rep(c("A", "S"), each=3072)
Sa <- rep(c(1,2), each=1536, times=2)
Data <- rnorm(6144)
DataFrame <- cbind.data.frame(Do,Cl,In,Sa,Data); head(DataFrame)
rm(Do,Cl,In,Sa,Data)
attach(DataFrame)
Next, I split the 'DataFrame' object into a list of subsets to avoid unpredictable recycling. Basically, I am placing each data subset in a separate list element so that recycling is predictable; this produced the correct output in my simulator.
DFSplit <- split(DataFrame[ , "Data"], list(Do, Cl, In, Sa))
The 'DFSplit' object has 384 elements, one per combination of Do, Cl, In and Sa:
length(names(DFSplit))
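As a quick sanity check on the split, the sketch below rebuilds the example data without attach() and confirms the number of groups and their sizes. The seed and the name `doses` are assumptions added here for reproducibility; they are not part of the original code.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Rebuild the example data so the sketch runs on its own.
doses <- c(0, 2, 4, 6, 8, 10, 15, 20, 30, 40, 45, 50,
           55, 60, 65, 70, 80, 85, 90, 92, 94, 96, 98, 100)
DataFrame <- data.frame(
  Do   = rep(doses, each = 16, times = 16),
  Cl   = rep(c("K", "Y", "M", "C"), each = 384, times = 4),
  In   = rep(c("A", "S"), each = 3072),
  Sa   = rep(c(1, 2), each = 1536, times = 2),
  Data = rnorm(6144)
)

# Split Data by the four grouping columns, referenced through the data frame.
DFSplit <- split(DataFrame$Data, DataFrame[c("Do", "Cl", "In", "Sa")])

length(DFSplit)          # number of groups
table(lengths(DFSplit))  # how many groups there are of each size
head(names(DFSplit), 3)  # names have the form Do.Cl.In.Sa
```

With this data every group should hold 16 values, which is what the later steps rely on.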
Then I created the function 'ids' to look up the names of the list elements:
ids <- function(Do, Cl, In, Sa) {
  grep(paste("^", Do, "\\.", Cl, "\\.", In, "\\.", Sa, sep = ""),
       names(DFSplit), value = TRUE)
}
mapply(ids, Do, Cl, In, Sa, SIMPLIFY = FALSE)
I understand that each of the arguments to 'ids' has length 6144, so mapply returns 6144 results: the 384 names, each repeated 16 times. How can I change the ids function so that mapply doesn't repeat the same name 16 times? As an ugly and highly costly solution I used unique; I need a better, more fundamental solution.
unique(mapply(ids, Do, Cl, In, Sa, SIMPLIFY = FALSE))
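One possible way to avoid the repetition, sketched here under assumptions, is to reduce the four grouping columns to their 384 distinct combinations first and build the names from those, rather than calling the lookup 6144 times. The names `doses`, `keys` and `key_names` are hypothetical, and the sketch assumes the names produced by split() are the four values joined with ".".

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Rebuild the example data so the sketch runs on its own.
doses <- c(0, 2, 4, 6, 8, 10, 15, 20, 30, 40, 45, 50,
           55, 60, 65, 70, 80, 85, 90, 92, 94, 96, 98, 100)
DataFrame <- data.frame(
  Do   = rep(doses, each = 16, times = 16),
  Cl   = rep(c("K", "Y", "M", "C"), each = 384, times = 4),
  In   = rep(c("A", "S"), each = 3072),
  Sa   = rep(c(1, 2), each = 1536, times = 2),
  Data = rnorm(6144)
)
DFSplit <- split(DataFrame$Data, DataFrame[c("Do", "Cl", "In", "Sa")])

# One row per distinct (Do, Cl, In, Sa) combination: 384 rows, not 6144.
keys <- unique(DataFrame[c("Do", "Cl", "In", "Sa")])

# Build each name directly instead of searching for it with grep().
key_names <- paste(keys$Do, keys$Cl, keys$In, keys$Sa, sep = ".")

length(key_names)                      # one name per group
setequal(key_names, names(DFSplit))    # same set of names as the split
```

If only the names are needed, names(DFSplit) already holds each of them exactly once; the `keys` table is mainly useful when the individual Do, Cl, In and Sa values are needed alongside the name.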
I also created a function to operate on the 'DFSplit' list elements. It has the same issue as the previous function, and it relies on the previous function to look up its inputs.
dG <- function(Do, Cl, In, Sa) {
  dg <- 100 *
    (1 - 10^-(DFSplit[[ids(Do, Cl, In, Sa)]] - DFSplit[[ids(0, Cl, In, Sa)]])) /
    (1 - 10^-(DFSplit[[ids(100, Cl, In, Sa)]] - DFSplit[[ids(0, Cl, In, Sa)]])) - Do
  dg
}
mapply(dG, Do, Cl, In, Sa, SIMPLIFY = FALSE)
What I am trying to do, unsuccessfully so far, is to apply the dG function within each of the 384 list elements. I acknowledge that the dG function also needs to be modified, and I don't know how. I want the input to the dG function to be the names of the 384 list elements, each containing 16 numbers, and the output to be a list of 384 elements with dG applied.
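A minimal sketch of that idea, under assumptions: a variant of dG that takes a single group name, recovers Do and the Cl.In.Sa part from it, and looks up the Do = 0 and Do = 100 reference groups by name. The helper `dG_by_name` and its `groups` argument are hypothetical, and the name parsing assumes that no label contains a "." (true for this example data).

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Rebuild the example data so the sketch runs on its own.
doses <- c(0, 2, 4, 6, 8, 10, 15, 20, 30, 40, 45, 50,
           55, 60, 65, 70, 80, 85, 90, 92, 94, 96, 98, 100)
DataFrame <- data.frame(
  Do   = rep(doses, each = 16, times = 16),
  Cl   = rep(c("K", "Y", "M", "C"), each = 384, times = 4),
  In   = rep(c("A", "S"), each = 3072),
  Sa   = rep(c(1, 2), each = 1536, times = 2),
  Data = rnorm(6144)
)
DFSplit <- split(DataFrame$Data, DataFrame[c("Do", "Cl", "In", "Sa")])

# Hypothetical helper: takes ONE group name such as "40.K.A.1".
dG_by_name <- function(name, groups) {
  parts <- strsplit(name, ".", fixed = TRUE)[[1]]  # Do, Cl, In, Sa
  rest  <- paste(parts[-1], collapse = ".")        # "Cl.In.Sa"
  d     <- groups[[name]]                          # the 16 values of this group
  d0    <- groups[[paste("0", rest, sep = ".")]]   # reference group at Do = 0
  d100  <- groups[[paste("100", rest, sep = ".")]] # reference group at Do = 100
  # Same formula as dG, applied to the three 16-value vectors.
  100 * (1 - 10^-(d - d0)) / (1 - 10^-(d100 - d0)) - as.numeric(parts[1])
}

# One call per group name; simplify = FALSE keeps a named list of 384 vectors.
dG_list <- sapply(names(DFSplit), dG_by_name, groups = DFSplit, simplify = FALSE)

length(dG_list)          # one result per group
dG_list[["40.K.A.1"]]    # the 16 dG values for one group
```

Because the function is called once per name rather than once per row, no result is repeated and neither mapply over 6144 rows nor unique is needed.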
Please feel free to suggest a different solution altogether. The important thing is that I need to apply the 'dG' function to the data set.