I am creating a new data frame from a large three-dimensional array using a nested R loop. I have tried running the code, but the job fails after ~48 hours. I would really like to vectorize the loop to make it more efficient, but I am unsure how to do that, or whether it is even possible, over a multi-dimensional array. Any suggestions for improving the efficiency of the job would be very much appreciated. For reference, my_array is a small piece of my array with two slices. The values in the array are probabilities, and the loop finds the founder with the maximum probability for each mouse and marker. The final output is a data frame with mice names as rows, markers as columns, and the founder as the data. The current code and example data are below.
# Dimension names of the array: mice x founders x markers
founder_names <- dimnames(my_array)[[2]]
mice_names <- dimnames(my_array)[[1]]
marker_names <- dimnames(my_array)[[3]]
# Create empty data frame
probs.df <- data.frame()
## Nested loop: for each marker and mouse, store the founder with the highest probability
for (marker in marker_names) {
  for (mouse in mice_names) {
    probs.df[mouse, marker] <- names(which.max(my_array[mouse, , marker]))
  }
}
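For comparison, here is a minimal sketch of one way the same result might be computed without filling the data frame cell by cell. It assumes the array is laid out as mice x founders x markers, as in the example data, and that every mouse/marker slice has at least one non-missing value. The `toy` array and its values are made up purely for illustration. `apply()` still loops internally, but it avoids the repeated data frame assignments, which are likely the main cost of the nested loop.

```r
# Toy 3-D array with the same layout as my_array: mice x founders x markers.
# The values are random and only serve to illustrate the approach.
set.seed(1)
toy <- array(runif(3 * 4 * 2), dim = c(3, 4, 2),
             dimnames = list(c("m1", "m2", "m3"),
                             c("AI", "BI", "CI", "DI"),
                             c("mk1", "mk2")))

# For every (mouse, marker) pair, find the index of the largest founder
# probability. Applying over margins 1 and 3 gives a mice x markers matrix.
best_idx <- apply(toy, c(1, 3), which.max)

# Map the indices back to founder names, keeping the shape and dimnames.
founders <- dimnames(toy)[[2]]
best <- array(founders[best_idx], dim = dim(best_idx),
              dimnames = dimnames(best_idx))

# Mice as rows, markers as columns, founder names as the data.
probs.df <- as.data.frame(best, stringsAsFactors = FALSE)
```

On the real data, `toy` would be replaced by the full array. If any slice can be entirely NA, `which.max` returns a zero-length result and this sketch would need adjusting.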
Example data from dput(my_array):
structure(c(1.86334813592728e-08, 2.02070595143633e-10, 2.1558577630356e-08,
2.1558577630356e-08, 2.04388477395613e-10, 2.04388477395593e-10,
2.04388477395613e-10, 2.031707697502e-10, 2.04388477395593e-10,
2.0317076975018e-10, 0.999999939150967, 1.19701878645413e-10,
2.94522644878888e-08, 2.94522644878888e-08, 1.20988752710968e-10,
1.20988752710968e-10, 1.20988752710968e-10, 1.20313358746148e-10,
1.20988752710968e-10, 1.20313358746148e-10, 2.41632503275453e-12,
2.53195197455819e-08, 2.89630046322804e-12, 2.89630046322804e-12,
2.46380958026699e-08, 2.46380958026699e-08, 2.46380958026724e-08,
2.44127737551662e-08, 2.46380958026699e-08, 2.44127737551638e-08,
1.08633475857376e-12, 0.999999925628544, 1.30167423493078e-12,
1.30167423493078e-12, 2.49445205965502e-08, 2.49445205965502e-08,
2.49445205965527e-08, 2.47171256696929e-08, 2.49445205965502e-08,
2.47171256696904e-08, 1.84322523200704e-08, 6.29795050516582e-11,
2.13175870442828e-08, 2.13175870442849e-08, 6.40871335417646e-11,
6.40871335417646e-11, 6.40871335417646e-11, 6.35035199711943e-11,
6.40871335417646e-11, 6.3503519971188e-11, 0.999999939821495,
2.75475678555388e-11, 2.91247770927105e-08, 2.91247770927134e-08,
2.80325925630150e-11, 2.80325925630123e-11, 2.80325925630150e-11,
2.77773153893157e-11, 2.80325925630123e-11, 2.77773153893129e-11,
6.56947829427486e-13, 2.50477863870057e-08, 7.89281798086196e-13,
7.89281798086277e-13, 2.43639980473783e-08, 2.43639980473783e-08,
2.43639980473783e-08, 2.41399147887054e-08, 2.43639980473783e-08,
2.4139914788703e-08, 1.7742262257411e-13, 0.999999926913761,
2.13166988220277e-13, 2.13166988220277e-13, 2.46686866862984e-08,
2.46686866862984e-08, 2.46686866863009e-08, 2.44425383948499e-08,
2.46686866862984e-08, 2.44425383948499e-08), .Dim = c(10L, 4L,
2L), .Dimnames = list(c("B6HER2", "X100", "X1002", "X1005", "X1006",
"X1007", "X1010", "X1011", "X1012", "X1014"), c("AI", "BI", "CI",
"DI"), c("UNC6", "JAX00000010")))