I have a 20000 * 5 data set. Currently it is being processed in an iterative manner and the data set gets updated continuously on every iteration.
The cells in the data.frame gets updated for every iteration and looking for some help in running these things faster. Since this is a small data.frame I'm not sure if data.table would work fine.
Here are the benchmarks for data.frame subassignment:
sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server >= 2012 x64 (build 9200)
set.seed(1234)
test <- data.frame(A = rep(LETTERS , 800), B = rep(1:26, 800), C=runif(20800), D=runif(20800) , E =rnorm(20800))
microbenchmark::microbenchmark(test[765,"C"] <- test[765,"C"] + 25)
Unit: microseconds
expr min lq mean median uq max neval
test[765, "C"] <- test[765, "C"] + 25 112.306 130.8485 979.4584 186.3025 197.7565 44556.15 100}
Is there a way to achieve the above function faster than what I have posted?