I'm learning R, and I'm a big fan of the data.table package - I like its database-like syntax and performance.
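For example, the kind of one-liner I mean (with made-up data, since I'm only illustrating the syntax):

library(data.table)

# Toy data standing in for a real dataset; the column names here are made up.
dt <- data.table(Agency = c("NYPD", "NYPD", "DOT"),
                 ComplaintType = c("Noise", "Noise", "Pothole"))

# Filter rows, count them, and group by a column, all in one expression.
dt[Agency == "NYPD", .N, by = ComplaintType]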
When I was reading web pages and blogs on data analysis, I found this post:
A Large Data Workflow with Pandas: Data Analysis of 8.2 Million Rows with Python and SQLite
https://plot.ly/ipython-notebooks/big-data-analytics-with-pandas-and-sqlite/
I would like to practice this data analysis with data.table; however, my laptop has only 4 GB of RAM:
➜ ~ free -m
total used free shared buff/cache available
Mem: 3686 966 1976 130 743 2359
Swap: 8551 0 8551
➜ ~
The dataset is a 3.9 GB CSV file, and my available memory is not enough to read the whole file into a data.table. But I'm not willing to give up the data.table package.
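For reference, this is roughly the read that runs out of memory (the file name is a placeholder for the downloaded CSV):

library(data.table)

# Trying to load the whole 3.9 GB CSV at once; with only ~4 GB of RAM this
# exhausts memory (or starts swapping) before the data.table is built.
# "311_service_requests.csv" is a placeholder for the actual file name.
dt <- fread("311_service_requests.csv")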
Question:
Is there a database interface for the data.table package? I searched its documentation and had no luck. If data.table is not the right tool for this task, which approach would you recommend: (1) sqldf, (2) SQLite + dplyr, or (3) the ff/bigmemory packages?
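For example, as far as I can tell, option (2) would look roughly like the sketch below (the database file, table, and column names are all placeholders, and a tiny made-up table stands in for the imported CSV):

library(DBI)
library(RSQLite)
library(dplyr)
library(dbplyr)

# On-disk SQLite database; "requests.sqlite" is a placeholder file name.
con <- dbConnect(RSQLite::SQLite(), "requests.sqlite")

# In the real workflow the 3.9 GB CSV would be imported into this table
# (for example in chunks); here a tiny made-up table stands in for it.
dbWriteTable(con, "requests",
             data.frame(Agency = c("NYPD", "NYPD", "DOT")),
             overwrite = TRUE)

# dplyr builds the SQL lazily and SQLite does the work on disk,
# so only the aggregated result has to fit in RAM.
tbl(con, "requests") %>%
  group_by(Agency) %>%
  summarise(n = n()) %>%
  collect()

dbDisconnect(con)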
I've noticed that each of the above packages has its own distinctive syntax. The pandas workflow in the linked post does almost all of these tasks with one set of tools. Is there a similar approach in R?