
I have 16 million customer records with more than 100 columns, and I want to load the complete dataset into R so I can run my R code on it.

I have used the following to load the data in R:

dat <- read.table("D:/data.txt", header = TRUE, sep = "þ",
                  skipNul = TRUE, strip.white = TRUE,
                  fill = TRUE, check.names = TRUE,
                  na.strings = "NA", quote = "")

However, my system hung.

Is there an efficient and effective way to read big data into R?

Jeromy Anglim
user3642360
  • How "big" is your device? – IRTFM Jul 25 '14 at 05:54
  • This call is checking so many things that it should be quite slow. Have you tried reducing the argument list and possibly reading in chunks (see the sketch after these comments)? Ironically, you leave out the arguments that are recommended for maximum efficiency with `read.table`. – Rich Scriven Jul 25 '14 at 05:54
  • If the damned thing won't fit into your available memory, you will have to read the [High performance task view](http://cran.r-project.org/web/views/HighPerformanceComputing.html). – Roman Luštrik Jul 25 '14 at 06:08
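
A minimal sketch of the chunked approach suggested above, assuming a 100,000-row chunk size and that every field can safely be read as character (the file path and delimiter come from the question; everything else is an assumption). Supplying `colClasses` explicitly is one of the arguments the `read.table` help page recommends for speed:

# Read the file in fixed-size chunks so the whole table never has to
# fit in memory at once. Assumes the first line holds the header.
con <- file("D:/data.txt", open = "r")
header <- strsplit(readLines(con, n = 1), "þ", fixed = TRUE)[[1]]
repeat {
  lines <- readLines(con, n = 100000, skipNul = TRUE)
  if (length(lines) == 0) break        # end of file reached
  chunk <- read.table(text = lines, sep = "þ", header = FALSE,
                      col.names = header, colClasses = "character",
                      quote = "", fill = TRUE)
  # ... process or aggregate `chunk` here, keeping only what you need ...
}
close(con)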

1 Answer

library(data.table)

DT <- fread("D:/data.txt")

If you are dealing with data of that size, you will probably want to be using data.table anyway ;)
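
A hedged extension of the answer, not something it shows itself: since the file uses þ as a delimiter and has 100+ columns, you can pass the separator explicitly and use `fread`'s `select` argument to load only the columns you actually need, which cuts memory use considerably. The column names below are hypothetical:

library(data.table)

# Load only two (hypothetical) columns; sep and na.strings match the question.
DT <- fread("D:/data.txt", sep = "þ", na.strings = "NA",
            select = c("customer_id", "balance"))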

Ricardo Saporta
  • I have used the following: `d <- fread("E:/big.txt", sep = "þ", nrows = -1L, header = TRUE, na.strings = "NA", stringsAsFactors = FALSE, verbose = FALSE, autostart = 30L, skip = -1L, select = NULL, drop = NULL, colClasses = NULL, integer64 = getOption("datatable.integer64"), showProgress = getOption("datatable.showProgress"))`, but it gives the following error: `Error in fread("E:/big.txt", sep = "þ", nrows = -1L, header = TRUE, na.strings = "NA", : embedded nul in string: '\0\n\07\09\07\03\00\09\01\07\05\0'` – user3642360 Jul 25 '14 at 06:06
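
One possible workaround for that "embedded nul" error (an assumption, not a fix confirmed in this thread): strip the NUL bytes out of the file first, then hand the cleaned copy to `fread`. `readLines(skipNul = TRUE)` drops embedded nuls, though it holds all of the raw text in memory, so it only helps if the machine has RAM to spare; on a Unix-like shell, `tr -d '\000' < big.txt > clean.txt` does the same job without loading the file into R.

library(data.table)

# Drop embedded NUL bytes, write a clean copy, then fread the copy.
lines <- readLines("E:/big.txt", skipNul = TRUE)
clean <- tempfile(fileext = ".txt")
writeLines(lines, clean)
DT <- fread(clean, sep = "þ", na.strings = "NA")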