I know that there is no += operator in R (see "R: += (plus equals) and ++ (plus plus) equivalent from c++/c#/java, etc.?").
The following function, which iterates over a number of items and accumulates a difference, is slower than expected. The largest runtime component is a simple addition, and Rprof shows that ~20% of the runtime is garbage collection. It feels as though many copies of the ~1e7 records in myVector are being constructed and discarded, whereas an in-place operation shouldn't have this overhead.
testCase <- function(f) {
  difference <- rep(0, f)
  # replicate approximate code structure of original test case
  for (i in 1:10) {
    if (runif(1, 0, 1) > .1) {
      partialDiff <- dd(f)
    } else {
      partialDiff <- dd2(f)
    }
    difference <- difference + partialDiff
  }
  return(difference)
}
dd <- function(f) {
  result <- (1:f)
  return(result)
}
dd2 <- function(f) {
  result <- sqrt(1:f)
  return(result)
}
# Now for the actual profiling: build d 50 times, just to get a statistical profile of what is going on
# interval must be numeric (seconds between samples), not a string
Rprof(tmp <- tempfile(), gc.profiling = TRUE, interval = 0.01)
for (i in 1:50) {
  d <- testCase(10000000)
}
# Rprof(NULL) stops profiling; a bare Rprof() would start a new profile in "Rprof.out"
Rprof(NULL)
library(proftools)
summaryRprof(tmp)
pd <- readProfileData(tmp)
# funSummary(pd); callSummary(pd); pathSummary(pd); hotPaths(pd)
plot(pd)
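As a cross-check on the sampling profile, here is a minimal sketch that times the accumulation step on its own. The size n = 1e6 and the variable name timing are assumptions chosen to keep the run short; they are not the values profiled above.

```r
# Time only the accumulation step, with no dd()/dd2() calls inside the loop.
n <- 1e6                       # assumed size, smaller than the profiled case
difference  <- rep(0, n)       # same initialisation as in testCase
partialDiff <- sqrt(1:n)       # computed once, outside the timed region

timing <- system.time(
  for (i in 1:10) {
    difference <- difference + partialDiff
  }
)
print(timing)                  # user/system/elapsed seconds for the 10 additions
```

If the addition is the bottleneck, this timing should account for a large share of a single testCase(n) call at the same n.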
Using Rprof to detail the runtime, I get a breakdown in which dd is dominated by sequence creation (as expected) and calc2 by sqrt (as expected), but the largest cost in f is the addition "difference<-difference+partialDiff". This is not expected.
Can anyone explain why, or offer an efficient workaround?
- Testing the length of difference shows that it is unchanged from its initialization.
- class(partialDiff) and class(difference) are both "numeric".
- length(partialDiff) == length(difference), about 17M in the test case profiled here.
- Note that f returns an f-by-1 vector, not an f-by-10 matrix.
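The checks listed above can be reproduced on a small input roughly as follows. This is only a sketch: n = 1000 and the name updated are assumptions for illustration. It also prints typeof(), since class() reports "numeric" for doubles while a sequence such as 1:n (what dd returns in the test case) is stored as integer.

```r
# Small-scale version of the checks listed above (n is deliberately tiny).
n <- 1000L                        # assumed size, for illustration only
difference  <- rep(0, n)          # double vector, as in testCase
partialDiff <- sqrt(1:n)          # same shape as the result of dd2

updated <- difference + partialDiff

stopifnot(length(updated) == length(difference))  # length is unchanged
stopifnot(is.null(dim(updated)))                  # plain vector, not a matrix

# Classes of the accumulator, the increment, and the sum
print(c(class(difference), class(partialDiff), class(updated)))

# Storage type of a plain sequence, as returned by dd
print(typeof(1:n))
```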
Edit: in response to comments, I have replaced my simple code structure with a complete test case program.