My problem involves simple calculations over a big data set (around 25 million rows and 10 columns, i.e. around 1 GB of data). My system is:
32-bit Windows 7 / 4 GB RAM / RStudio 0.96, R 2.15.2
I can reference my database using the bigmemory package and run functions over it. I am also able to do the same with the ff package, filehash, etc.
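For reference, this is roughly how I attach the data (the file names and column type are just placeholders, not my real ones):

    library(bigmemory)

    # read the ~1 GB csv once into a file-backed big.matrix,
    # so the data itself never has to live in RAM
    db <- read.big.matrix("mydata.csv", header = TRUE, type = "double",
                          backingfile = "mydata.bin",
                          descriptorfile = "mydata.desc")

    # later sessions just re-attach the backing file
    db <- attach.big.matrix("mydata.desc")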
The problem is that while computing simple calculations (unique values, means, etc.) I hit the typical
"cannot allocate vector of size n Mb"
error, where n can be as small as 70-95 MB.
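To give a concrete idea, something as small as this is enough to trigger it (the column indices are just an example):

    # pulling a column out of the big.matrix materialises it as an
    # ordinary in-memory vector, and that is where the allocation fails
    u <- unique(db[, 1])
    m <- mean(db[, 2])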
I know about all (I think) the solutions suggested so far for this:
increase RAM.
launch R with the command-line option "--max-mem-size=XXXX",
use the memory.limit() and memory.size() commands,
use rm() and gc(),
work on 64bit,
close other programs, free memory, reboot,
use packages bigmemory, ff, filehash, sql, etc etc.
improve your data, use integers, shorts, etc. ...
check the memory usage of intermediate calculations (see the sketch after this list), ...
etc.
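For completeness, this is the kind of checking and cleaning I do between steps (memory.limit() and memory.size() are the Windows-only functions; tmp is just a stand-in for an intermediate result):

    # Windows-only: inspect the allocation limit and peak usage
    memory.limit()            # current limit in MB
    memory.size(max = TRUE)   # maximum memory obtained from the OS so far

    # check the size of an intermediate object, then drop it and collect
    tmp <- db[, 1]
    print(object.size(tmp), units = "Mb")
    rm(tmp)
    gc()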
All of this has been tested and done (except moving to another system/machine, obviously).
But I still get those "cannot allocate vector of size n Mb" errors, where n is around 90 MB for example, with almost no memory in use by R or by other programs, everything rebooted and fresh. I am aware of the difference between free memory and what Windows will actually let R allocate, but still:
It makes no sense, because the available memory is more than 3 GB. I suspect the cause lies somewhere in the 32-bit Windows / R memory management, but it seems almost a joke to buy 4 GB of RAM, or to switch the whole system to 64-bit, just to allocate 70 MB.
Is there something I am missing?