I am downloading a 120 MB CSV file from a web server using read.csv(textConnection(binarydata1)), and this is painfully slow. I tried pipe() instead, as in read.csv(pipe(binarydata1)), but I get the error: Error in pipe(binarydata1) : invalid 'description' argument. Any help with this issue is much appreciated.

@jeremycg, @hrbrmstr

Suggestion

Use fread from the data.table package.

Save the file to local storage with download.file (or functions in curl or httr), then read it with data.table::fread, as @jeremycg suggested, or with readr::read_csv.
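A minimal sketch of the download-then-read suggestion, assuming the file can be fetched without extra authentication. The helper name read_remote_csv and the example URL are hypothetical and not from the original thread.

```r
library(data.table)

# Hypothetical helper: download a CSV to a temporary file, then parse it
# from disk with fread instead of going through textConnection().
read_remote_csv <- function(url, dest = tempfile(fileext = ".csv")) {
  # mode = "wb" writes the bytes unchanged, which matters on Windows
  download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  fread(dest)
}

# Usage (placeholder URL, replace with the real one):
# dt <- read_remote_csv("https://example.com/data.csv")
```

If the server requires credentials, the download step would need curl or httr instead of download.file; the fread call on the local file stays the same.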

Response

The CSV file I am dealing with is in binary format, so I convert it to standard format using these functions:

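library(RCurl)  # provides getURLContent

# fetch the file as raw bytes (url and userpwd are defined elsewhere)
t1 = getURLContent(url, userpwd, httpauth = 1L, binary = TRUE)
# reinterpret the raw bytes as character data
t2 = readBin(t1, what = 'character', n = length(t1) / 4)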

When I call fread(t2) after converting the binary data to standard format, I get an error:

Error in fread(t61) :
  'input' must be a single character string containing a file name, a command, full path to a file, a URL starting 'http://' or 'file://', or the input data itself

If I call fread directly, without converting the binary data to standard format, it works without problems; once I convert it, it does not work.
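The error message says fread wants a single character string, so one possible cause is that the readBin result is not a length-one character vector. The sketch below shows one way around that, assuming the downloaded object is a raw vector: turn it into one string with rawToChar and pass it through fread's text argument. The raw_bytes object is built locally for illustration and only stands in for the result of getURLContent(..., binary = TRUE).

```r
library(data.table)

# Stand-in for the raw vector returned by getURLContent(..., binary = TRUE);
# constructed here so the example runs without a network connection.
raw_bytes <- charToRaw("id,value\n1,a\n2,b\n")

# rawToChar yields exactly one string, which is what fread expects as text input
csv_text <- rawToChar(raw_bytes)
dt <- fread(text = csv_text)
```

An alternative under the same assumption is to write the raw vector to a temporary file with writeBin and call fread on that file path.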

Ezra Polson

1 Answer

Even though the question is 4 years old, it helped me with my current problem, where I also have a 300 MB connection for which read.csv took ages.

I found the vroom function from the vroom package helpful here. It worked like a charm: it read my data in about one minute, whereas I don't even know whether read.csv(textConnection(...)) would ever have returned a result (I usually terminated R after 30 minutes with no result).
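A minimal sketch of a vroom call, using a small temporary file in place of the real data. The file contents and column names are made up for illustration.

```r
library(vroom)

# Small stand-in file so the example is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,a", "2,b"), tmp)

# Giving the delimiter explicitly skips vroom's delimiter guessing
df <- vroom(tmp, delim = ",")
```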

deschen