I won't pretend that this code is even remotely optimal, but here is the problem I have. I have a list of files with multiple columns read in with sapply(), such that if I call file.list[[1]] I get a summary of that data.frame, and summary(file.list) is a list of files.
I am fitting curves to the data using the mgcv package as follows:
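library(mgcv)

# Fit a GAM of column 15 on a smooth of column 23 and return the plot data.
gam_data <- function(curves)
{
  out <- gam(curves[, 15] ~ s(curves[, 23]))
  pd <- plot(out)  # plot.gam() invisibly returns the data used to draw each smooth
  return(pd)
}
out <- lapply(file.list, gam_data)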
# Extract the fitted values, x values, and standard errors of the first smooth term.
split_curves <- function(splitting)
{
  pd_2 <- c(splitting[[1]]$fit)
  pd_3 <- c(splitting[[1]]$x)
  pd_4 <- c(splitting[[1]]$se)
  curveg <- cbind(pd_2, pd_3, pd_4)
  colnames(curveg) <- c("fitted", "sphro", "se")
  return(curveg)
}
out2 <- lapply(out, split_curves)
The first block fits the GAM and the second extracts the fitted curve. However, after all of that, the original information in file.list, such as replicate, genotype, etc., is lost, and the data.frames are not the same length anymore. This is probably a trivial question, but how does one retain that information through processing? I'm applying this to hundreds of data frames, so I cannot just recreate the columns manually.
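To make the goal concrete, here is a minimal sketch of one way the metadata could be carried along. It is not a definitive solution and rests on several assumptions: the data are simulated, the column names (sphro, response, genotype, replicate) are hypothetical stand-ins for the real columns 23 and 15 and the metadata columns, and fit_and_keep is an illustrative helper name. Instead of taking the values returned by plot(), which are evaluated on a plotting grid rather than at the original rows, it calls predict() on the fitted model so that there is one fitted value per original row.

```r
library(mgcv)

# Hypothetical stand-in for file.list: simulated data frames with made-up
# column names. The real data would come from the files read in earlier.
set.seed(1)
make_df <- function(genotype, replicate, n = 40) {
  x <- seq(0, 1, length.out = n)
  data.frame(sphro = x,
             response = sin(2 * pi * x) + rnorm(n, sd = 0.1),
             genotype = genotype,
             replicate = replicate)
}
file.list <- list(a = make_df("wt", 1), b = make_df("mut", 2, n = 50))

# Fit the GAM, then append fitted values and standard errors to the original
# data frame so that every metadata column stays aligned with its fit.
# Assumes no missing values in the model columns; rows with NA would be
# dropped by gam() and the lengths would no longer match.
fit_and_keep <- function(curves) {
  model <- gam(response ~ s(sphro), data = curves)
  pred <- predict(model, se.fit = TRUE)  # predictions at the original rows
  curves$fitted <- as.numeric(pred$fit)
  curves$se <- as.numeric(pred$se.fit)
  curves
}

out2 <- lapply(file.list, fit_and_keep)
```

Each element of out2 is then the original data frame with two extra columns, so replicate, genotype, and the rest remain attached. If the smooth evaluated on the plotting grid is what is actually needed, the same idea would apply by adding the constant metadata values as columns to the extracted curve inside the function.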