I am having trouble casting a rather large data frame and am running into memory limits. Alternatively, there may well be a better way to do this entirely; I am open to suggestions on either front. The issue is as follows:
library(reshape)
dataf <- data.frame(gridID = rep(c(1, 2, 3, 4), 1000), montecarlo = rep(1:1000, each = 4), number = runif(4000, 0, 1))
castData <- cast(dataf, gridID ~ montecarlo, value='number')
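For reference, with the toy data above the result is small (assuming cast keeps gridID as its own first column):

dim(castData)
# [1]    4 1001   (the gridID column plus one column per montecarlo iteration)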
This takes an incredibly long time for some of my data sets. Think a data frame with 500,000 unique gridID values and 1000 montecarlo simulations for each, i.e. 500,000,000 rows of data.
If cast finds more than one value per gridID ~ montecarlo combination, it falls back to aggregation and warns: Aggregation requires fun.aggregate: length used as default. My actual script runs with no errors or warnings; it just takes a long time on my larger data frames. I want to avoid passing an aggregation function (sum, mean, etc.) because there is only ever one value per gridID ~ montecarlo combination, and I figured the aggregation itself would be a large waste of computation.
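A quick base-R check for stray duplicates, which is what triggers that aggregation fallback:

# TRUE means every gridID/montecarlo pair occurs exactly once,
# so cast should never need fun.aggregate
!any(duplicated(dataf[, c("gridID", "montecarlo")]))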
The newly cast data frame is then multiplied elementwise by another data frame in the same format (500,000 rows by 1000 columns, one column per monte carlo iteration) and goes through some further processing.
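Roughly, that step looks like this (otherData is a stand-in name for the second table, in the same layout as castData):

# drop the gridID column from both tables and multiply elementwise;
# rows are assumed to already be aligned on gridID in both objects
result <- as.matrix(castData[, -1]) * as.matrix(otherData[, -1])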
Any suggestions for dealing with these large data frames or speeding things up?
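For example, would something along these lines be saner at this scale? It is only a rough, untested sketch that skips cast entirely, and it assumes every gridID/montecarlo pair is present exactly once:

ids <- sort(unique(dataf$gridID))
mcs <- sort(unique(dataf$montecarlo))
# order the rows so each montecarlo iteration forms one contiguous block,
# then let matrix() fill the result column by column
o <- order(dataf$montecarlo, dataf$gridID)
wide <- matrix(dataf$number[o], nrow = length(ids), ncol = length(mcs),
               dimnames = list(ids, mcs))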