In R, data is usually loaded into RAM. Are there any packages that keep data on disk rather than loading it into RAM?
2 Answers
Check out the bigmemory package, along with related packages like bigtabulate, bigalgebra, biganalytics, and more. There's also ff, though I don't find it as user-friendly as the bigmemory suite. The bigmemory suite was reportedly partially motivated by the difficulty of using ff. I like it because it required very few changes to my code to be able to access a big.matrix object: it can be manipulated in almost exactly the same ways as a standard matrix, so my code is very reusable.
There's also support for HDF5 via NetCDF4, in packages like RNetCDF and ncdf. This is a popular, multi-platform, multi-language method for efficient storage and access of large data sets.
If you want basic memory-mapping functionality, look at the mmap package.
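To illustrate the point about minimal code changes, here is a rough sketch of a file-backed big.matrix. The file names, dimensions, and values are made up for illustration; it assumes the bigmemory package is installed.

```r
library(bigmemory)

# Create a file-backed matrix: the data live on disk, not in RAM.
# (backingfile/descriptorfile names here are illustrative.)
x <- filebacked.big.matrix(
  nrow = 1e6, ncol = 3, type = "double",
  backingfile = "x.bin", descriptorfile = "x.desc"
)

# Manipulate it almost exactly like a standard matrix:
x[1, ] <- c(1, 2, 3)
x[1, 1]

# Later, or from another R session, re-attach without reloading:
y <- attach.big.matrix("x.desc")
```

Because ordinary `[` indexing works, code written for regular matrices often runs unchanged against the on-disk object.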

Bigmemory started as just an external pointer to objects in RAM outside of R, plus proper semantics. The file-based stuff came in response to ff, but that didn't start bigmemory. Your pointers to HDF5 and NetCDF are good and correct too, as is the hint to mmap. – Dirk Eddelbuettel Feb 24 '12 at 14:28
Yes, the ff package can do that.
You may want to look at the Task View on High-Performance Computing for more details.
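As a rough sketch of what ff looks like in practice (file names and sizes are illustrative; assumes the ff package is installed):

```r
library(ff)

# An on-disk vector of a million doubles; only small chunks
# are paged into RAM as they are accessed.
v <- ff(vmode = "double", length = 1e6)
v[1:5] <- 1:5

# ff also offers on-disk data frames (ffdf), e.g. reading a
# large CSV chunk-wise instead of all at once:
# df <- read.csv.ffdf(file = "big.csv")
```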
