I am using the blackboost function from the mboost package to estimate a model on an approximately 500 MB dataset, on a Windows 7 64-bit machine with 8 GB of RAM. During estimation R uses up virtually all available memory, and after the calculation is done over 4.5 GB remains allocated to R, even after I call the garbage collector with gc() or save the workspace and reload it into a fresh R session. Using .ls.objects (from SO question 1358003) I found that all visible objects together only take up about 550 MB.
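For reference, the size check and the cleanup attempts look roughly like this (a simplified stand-in for the .ls.objects approach; the file and variable names are just illustrative):

# simplified version of the object-size check (.ls.objects does more)
obj_sizes <- sapply(ls(), function(x) object.size(get(x, envir = .GlobalEnv)))
sort(obj_sizes, decreasing = TRUE)[1:5]   # largest visible objects
sum(obj_sizes) / 1024^2                   # roughly 550 MB in total

# cleanup attempts that do not bring the footprint down
gc()                                      # explicit garbage collection
save.image("workspace.RData")             # reloading this in a fresh session
                                          # still shows > 4.5 GB in use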
The output of gc() tells me that the bulk of the memory sits in vector cells, although I'm not sure what that means:
            used   (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells   2856967  152.6    4418719  236.0   3933533  210.1
Vcells 526859527 4019.7  610311178 4656.4 558577920 4261.7
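If I understand the gc() columns correctly, Vcells are 8-byte vector cells and Ncells are 56-byte cons cells on a 64-bit build, so the counts do match the Mb figures:

# sanity check on the gc() output above (64-bit build)
526859527 * 8  / 1024^2   # ~4019.7 MB of vector heap currently in use
2856967  * 56 / 1024^2    # ~152.6 MB of cons cells currently in use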
This is what I'm doing:
> memory.size()
[1] 1443.99
> model <- blackboost(formula, data = mydata[mydata$var == 1, c(dv, ivs)], tree_control = ctree_control(maxdepth = 4))
...a bunch of packages are loaded...
> memory.size()
[1] 4431.85
> print(object.size(model),units="Mb")
25.7 Mb
> memory.profile()
NULL symbol pairlist closure environment promise language
   1  15895   826659   20395        4234   13694   248423
special builtin    char logical integer double complex
    174    1572 1197774   34286   84631  42071      28
character ... any  list expression bytecode externalptr
   228592   1   0 79877          1    51276        2182
weakref raw   S4
    413 417 4385
mydata[mydata$var == 1, c(dv, ivs)] has 139,593 rows and 75 columns, mostly factor variables plus a few logical and numeric ones. formula is a formula object of the form dv ~ var2 + var3 + .... + var73, where dv is a string with the name of the dependent variable and ivs is a character vector holding all independent variables var2 ... var74.
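For completeness, the formula is built from dv and ivs roughly along these lines (simplified):

# dv holds the name of the dependent variable, ivs the independent variables
formula <- as.formula(paste(dv, "~", paste(ivs, collapse = " + ")))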
Why is so much memory being allocated to R? How can I make R free up the extra memory? Any thoughts appreciated!