I have a list of data.frames which hold the data for each of the stages of a chemical process. Each of the data.frames has the same number of columns in the same order but the number of rows can vary for each of the data.frames.
The example data below has the same structure as my real data, except that fruits stand in for the chemical substances and reagents.
I've written a function to scale up the raw data and add the data to columns in the original data frames.
I have two problems. First, when I apply a scale factor, it really only applies to the last element of the last data.frame; that scale factor is then applied to the whole of the last data.frame. I can generate the scale factor for the next-to-last data.frame by taking the weights of the fruit (chemical) that the two data frames have in common (always the last row of one and the first row of the next) and dividing the Wts in a similar manner to how we got the first scale factor, then multiplying throughout that data.frame, and repeating until I reach the first data.frame. Second, when I use lapply to apply the scale_up function over the list, how can I feed it these scale factors so that each one is applied only to its own data frame?
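On the second problem, one possible approach is `Map()`, which walks two lists in parallel so that each data frame is paired only with its own factor. A minimal sketch, where `toy_scale()`, `stages` and `factors` are made-up stand-ins for `scale_up()`, the real list and the real scale factors:

```r
# Hypothetical stand-in for scale_up(): multiplies Wt by one scale factor.
toy_scale <- function(inventory, scale_factor, vessel_volume_L) {
  inventory$ScaledWt <- inventory$Wt * scale_factor
  inventory
}

# Made-up two-stage list and one made-up factor per stage.
stages  <- list(stage1 = data.frame(Wt = c(1, 2, 3)),
                stage2 = data.frame(Wt = c(45, 23)))
factors <- c(stage1 = 10, stage2 = 2)

# Map() calls the function once per position: stages[[i]] with factors[[i]].
# Arguments shared by every stage (the vessel volume) are fixed in the wrapper.
scaled <- Map(function(df, sf) toy_scale(df, sf, vessel_volume_L = 454),
              stages, factors)
```

`scaled` is a list of the same length as `stages`, so it could stand in for the `lapply()` result. A `for` loop over `seq_along(stages)` would do the same job if an explicit index is preferred.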
example.data <- list(
  ## use = (not <-) inside list() so that the elements are named
  stage1 = data.frame(code = c("aaa", "ooo", "bbb"),
                      stuff = c("Apples", "Oranges", "Bananas"),
                      Mw = c(1, 2, 3),
                      Density = c(3, 2, 1),
                      Assay = c(8, 9, 1),
                      Wt = c(1, 2, 3), stringsAsFactors = FALSE),
  stage2 = data.frame(code = c("bbb", "mmm", "ccc", "qqq", "ggg"),
                      stuff = c("Bananas", "Mango", "Cherry", "Quince", "Gooseberry"),
                      Mw = c(8, 9, 10, 1, 2),
                      Density = c(23, 32, 55, 5, 4),
                      Assay = c(0.1, 0.3, 0.4, 0.4, 0.9),
                      Wt = c(45, 23, 56, 99, 2), stringsAsFactors = FALSE),
  stage3 = data.frame(code = c("ggg", "bbb", "ggg", "bbb"),
                      stuff = c("Gooseberry", "Bread", "Grapes", "Butter"),
                      Mw = c(9, 8, 9, 10),
                      Density = c(34, 45, 67, 88),
                      Assay = c(10, 10, 46, 52),
                      Wt = c(24, 56, 31, 84), stringsAsFactors = FALSE)
)
scale_up <- function(inventory, scale_factor, vessel_volume_L, NoBatches = 1) {
  ## Accepts a data.frame whose first column is an identifier (code in the
  ## example), followed by stuff, Mw, Density, Assay and Wt columns.
  ## It takes a scale factor and vessel volume and returns the data.frame
  ## with the input charges and fill volumes added.
  inventory <- inventory[, -1]  ## drop the identifier column
  ## volumes, moles and equivalents are calculated for the given data
  inventory$Vol <- round(inventory$Wt / inventory$Density, 3)
  inventory$Moles <- round(inventory$Wt / inventory$Mw, 3)
  inventory$Equivs <- round(inventory$Moles / inventory$Moles[1], 3)
  ## scaled weights and volumes per batch, in columns named after the factor
  wt_col <- paste0(scale_factor, "xWt_kg")
  vol_col <- paste0(scale_factor, "xVol_L")
  inventory[, wt_col] <- round(inventory$Wt * scale_factor / 1000 / NoBatches, 3)
  inventory[, vol_col] <- round(inventory$Vol * scale_factor / 1000 / NoBatches, 3)
  ## cumulative fill of the vessel, as a percentage of its volume
  inventory$PerCentFill <- round(100 * cumsum(inventory[, vol_col]) / vessel_volume_L, 2)
  inventory
}
new.example.data <- lapply(example.data, scale_up, scale_factor = 20e3, vessel_volume_L = 454)
> new.example.data[[1]]
    stuff Mw Density Assay Wt   Vol Moles Equivs 20000xWt_kg 20000xVol_L PerCentFill
1  Apples  1       3     8  1 0.333     1      1          20        6.66        1.47
2 Oranges  2       2     9  2 1.000     1      1          40       20.00        5.87
3 Bananas  3       1     1  3 3.000     1      1          60       60.00       19.09
So, I've scaled my original data (laboratory scale, grams) to see if it will fit in a 100-gallon plant vessel (454 L), but the only stage that is scaled properly is the last one. The other two need those 'fiddle factors', and I need to apply the right 'fiddle factor' to each stage as I loop (presumably with a for loop rather than lapply) through the list.
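To make the 'fiddle factor' arithmetic concrete, here is a rough sketch of how the factors could be chained backwards from the last stage. It rests on an assumption about the direction of the ratio: stage i has to supply what stage i + 1 consumes, so its factor is the next stage's factor times (Wt in the first row of stage i + 1) / (Wt in the last row of stage i). `stage_factors()` is a hypothetical helper, and only the `code` and `Wt` columns of the example data are reproduced.

```r
# Hypothetical helper: one scale factor per stage, working backwards from
# the factor chosen for the final stage.
# Assumption: the last row of stage i and the first row of stage i + 1
# are the same material, and stage i must make what stage i + 1 needs.
stage_factors <- function(stages, final_factor) {
  n <- length(stages)
  factors <- numeric(n)
  factors[n] <- final_factor
  for (i in rev(seq_len(n - 1))) {
    made   <- stages[[i]]$Wt[nrow(stages[[i]])]  # product of stage i
    needed <- stages[[i + 1]]$Wt[1]              # input to stage i + 1
    factors[i] <- factors[i + 1] * needed / made
  }
  factors
}

# code and Wt columns copied from the example data (other columns omitted).
stages <- list(
  data.frame(code = c("aaa", "ooo", "bbb"), Wt = c(1, 2, 3)),
  data.frame(code = c("bbb", "mmm", "ccc", "qqq", "ggg"),
             Wt = c(45, 23, 56, 99, 2)),
  data.frame(code = c("ggg", "bbb", "ggg", "bbb"), Wt = c(24, 56, 31, 84))
)

factors <- stage_factors(stages, final_factor = 20e3)
# stage 3: 20000
# stage 2: 20000 * 24 / 2 = 240000
# stage 1: 240000 * 45 / 3 = 3600000
```

Each element of `factors` would then go to `scale_up()` together with the data frame at the same position in the list. If the ratio should run the other way round, swapping `needed` and `made` is the only change required.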
(P.S. I tried to ask this earlier, but I disguised my example too much and just confused the Stack Overflow users.)