The goal of the code below is to perform a recursive and iterative analysis on a data set with 400 columns and 6000 rows. It takes two columns at a time, analyses them, and then moves on through all the possible combinations.
A small subset of the large data set being used:
data1 data2 data3 data4
-0.710003 -0.714271 -0.709946 -0.713645
-0.710458 -0.715011 -0.710117 -0.714157
-0.71071 -0.714048 -0.710235 -0.713515
-0.710255 -0.713991 -0.709722 -0.71397
-0.710585 -0.714491 -0.710223 -0.713885
-0.710414 -0.714092 -0.710166 -0.71434
-0.711255 -0.714116 -0.70945 -0.714173
-0.71097 -0.714059 -0.70928 -0.714059
-0.710343 -0.714576 -0.709338 -0.713644
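To give a sense of scale, here is a rough sketch of how many column pairs this involves and how large the combined residual output would be. The variable names (n_rows, n_cols, and so on) are only for illustration; the dimensions come from the description above, and the size estimate assumes the residuals are stored as 8-byte doubles.

```r
# Dimensions taken from the description above
n_rows <- 6000
n_cols <- 400

# Number of unordered column pairs (i < j)
n_pairs <- choose(n_cols, 2)

# Each pair contributes one residual vector of length n_rows,
# stored as doubles (8 bytes each)
result_bytes <- n_pairs * n_rows * 8

# Enumerate the pairs for the four-column sample: one pair per column,
# in the order (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)
sample_pairs <- combn(4, 2)

print(n_pairs)
print(result_bytes / 1e9)  # size in GB
print(sample_pairs)
```

That is 79,800 model fits and roughly 3.8 GB of residuals if everything is kept in one matrix.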
Code using apply():
# Function
analysisFunc <- function() {
    # Fetch next data to be compared
    nextColumn <<- currentColumn + 1
    while (nextColumn <= ncol(Data)) {
        # Fetch the two columns on which to perform analysis
        c1 <- Data[, currentColumn]
        c2 <- Data[, nextColumn]
        # Create linear model
        linearModel <- lm(c1 ~ c2)
        # Capture model data from summary
        modelData <- summary(linearModel)
        # Residuals
        residualData <- t(t(modelData$residuals))
        # Keep on appending data
        linearData <<- cbind(linearData, residualData)
        # Fetch next column
        nextColumn <<- nextColumn + 1
    }
    # Increment the counter
    currentColumn <<- currentColumn + 1
}
# Apply on function
apply(Data, 2, function(x) analysisFunc())
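As far as I can tell, the column that apply() passes in as x is never used, so the call above just runs analysisFunc() once per column. That would make it equivalent to an explicit double loop over the pairs i < j. Here is a minimal sketch of that equivalent form, with two changes that are my own assumptions rather than part of the original code: the result matrix is preallocated instead of grown with cbind(), and residuals() is called on the fit directly instead of going through summary(). The function name pairwise_residuals and the small random Data are hypothetical stand-ins, and the sketch assumes there are no missing values.

```r
# Hypothetical stand-in for Data: 50 rows, 5 columns of random numbers
set.seed(1)
Data <- as.data.frame(matrix(rnorm(50 * 5), nrow = 50))

pairwise_residuals <- function(Data) {
  n_cols <- ncol(Data)
  # Preallocate one result column per pair instead of growing with cbind()
  out <- matrix(NA_real_, nrow = nrow(Data), ncol = choose(n_cols, 2))
  k <- 1
  for (i in seq_len(n_cols - 1)) {
    for (j in (i + 1):n_cols) {
      # Fetch the two columns on which to perform analysis
      c1 <- Data[, i]
      c2 <- Data[, j]
      # Take the residuals straight from the fit; summary() is not needed
      out[, k] <- residuals(lm(c1 ~ c2))
      k <- k + 1
    }
  }
  out
}

res <- pairwise_residuals(Data)
print(dim(res))
```

The pairs come out in the same order as in the while() version: column 1 against 2, 3, ..., then column 2 against 3, 4, ..., and so on.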
I thought that using apply() instead of loops would help me optimize the code. However, it seems to have no major effect: the run time is still more than two hours.
Does anyone think I am going wrong in how apply() has been used? Is having while() within the apply() call not a good idea? Is there any other way I can improve this code?
This is the first time I am working with functional programming. Please let me know your suggestions, thanks.