5

I am interested in exploring how R can handle out-of-memory data. I've found the bigmemory package and its companions (bigtabulate and biganalytics), but was hoping that someone could point me to a worked example that uses file backing with these packages. Any other out-of-memory tips would also be appreciated.

asked by Charlie, edited by cdeterman

3 Answers

8

Charlie, just email Mike and Jay; they have a number of worked examples built around the ASA 'flights' database from a year or two ago.

Edit: In fact, the Documentation tab has what I had in mind; the scripts are also on the site.
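
For flavour, a minimal sketch of the kind of file-backed bigmemory/biganalytics workflow those scripts cover could look like this (this is not one of their scripts; the file names are placeholders):

    library(bigmemory)
    library(biganalytics)

    ## Parse a large CSV once into a file-backed big.matrix; the data
    ## live on disk in airline.bin, with metadata in airline.desc.
    x <- read.big.matrix("airline.csv", header = TRUE, type = "integer",
                         backingfile = "airline.bin",
                         descriptorfile = "airline.desc")

    ## In a later R session, re-attach instantly without re-reading the CSV.
    x <- attach.big.matrix("airline.desc")

    ## biganalytics functions operate directly on the big.matrix.
    colmean(x, na.rm = TRUE)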

answered by Dirk Eddelbuettel, edited by JD Long
    Chill: they are the authors of the project in question. OP knows. Moreover, I provided a *link* to the project site which has *further links* for Jay and Mike. Did you bother to check? – Dirk Eddelbuettel Feb 15 '11 at 13:50
  • That's great, Dirk. I hadn't seen their website, only the CRAN materials. Thanks. – Charlie Feb 15 '11 at 16:19
3

Take a look at the "CRAN Task View: High-Performance and Parallel Computing with R". It has a section "Large memory and out-of-memory data" where several solutions are mentioned, for example the ff package.
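
As an illustration, a minimal ff sketch (assuming a CSV called flights.csv, which is just a placeholder) might look like this:

    library(ff)

    ## read.csv.ffdf reads the file in chunks into an ffdf, a data-frame-like
    ## object whose columns are memory-mapped files on disk rather than RAM.
    flights <- read.csv.ffdf(file = "flights.csv")

    dim(flights)     # dimensions are available without loading the data
    flights[1:5, ]   # only the requested rows are pulled into RAM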

answered by djhurio
2

Any other out-of-memory tips would also be appreciated.

I frequently work with large datasets. Even though my code has been optimized, I still launch Amazon EC2 instances from time to time because they give me access to far more resources than I have on my desk. For example, an instance with 26 ECUs, 8 cores, and 68 GB of RAM only costs about $0.80-1.00 per hour (spot instance pricing).

If that seems reasonable, you can launch a public machine image that already has R installed and do the job in no time.

answered by Maiasaura