I am getting an error from fread:

Internal error: ch>eof when detecting eol

when trying to read a CSV file downloaded from an https server, using R 3.2.0. I found something related on GitHub, https://github.com/Rdatatable/data.table/blob/master/src/fread.c, but I don't know how I could use it, if at all. Thanks for any help.

Added info: the data was downloaded from here:

fileURL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv"

then I used

download.file(fileURL, "Idaho2006.csv", method = "Internal") 
mts
Jane Quigley
  • Can you show your code and possibly data to reproduce the error? See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610 for how to provide a reproducible example. – mts Jul 10 '15 at 18:25
  • please also show your code, especially how you call `fread` which seems to give you the error – mts Jul 10 '15 at 18:47
  • mts, this time it worked correctly, so I think maybe the problem was how I was using fread (maybe because I stupidly didn't assign it to an object, e.g. DT <- fread("Idaho.csv")). Thanks for your patience with a beginner. – Jane Quigley Jul 10 '15 at 19:01

3 Answers

The problem is that download.file with method = "internal" doesn't work with https unless you're on Windows and set an option first. fread calls download.file when you pass it a URL rather than a local file, so it fails too. You have to download the file yourself and then read it from the local copy.

If you're on Linux, or already have wget or curl installed, use method = "wget" or method = "curl" instead.

If you're on Windows, have neither, and don't want to install them, call setInternet2(use = TRUE) before download.file.

http://www.inside-r.org/r-doc/utils/setInternet2

For example:

fileURL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv"
tempf <- tempfile()
download.file(fileURL, tempf, method = "curl")
DT <- fread(tempf)
unlink(tempf)

Or

fileURL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv"
tempf <- tempfile()
setInternet2(use = TRUE)  # Windows only; call it before download.file
download.file(fileURL, tempf)
DT <- fread(tempf)
unlink(tempf)
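
If you don't know in advance which of these is available on a given machine, the choice can be sketched as a small helper. This is only an illustration: pick_https_method is a made-up name, not part of base R or data.table, and it assumes that the "libcurl" method (offered by download.file since R 3.2.0 on builds that have it) or a curl/wget binary on the PATH can fetch https URLs.

```r
# Hypothetical helper (not part of base R or data.table): return a
# download.file() method that is assumed to be able to handle https.
pick_https_method <- function() {
  # capabilities("libcurl") is TRUE only on builds with libcurl support;
  # unname() + isTRUE() also copes with an NA or a missing entry.
  if (isTRUE(unname(capabilities("libcurl")))) return("libcurl")
  # Sys.which() returns "" when the program is not on the PATH.
  if (nzchar(Sys.which("curl"))) return("curl")
  if (nzchar(Sys.which("wget"))) return("wget")
  stop("no https-capable download method found")
}
```

It would then be used as download.file(fileURL, tempf, method = pick_https_method()), followed by DT <- fread(tempf) as in the examples above.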
Dean MacGregor

fread() now uses the curl package to download files, and this seems to work fine at the moment:

require(data.table) # v1.9.6+
fread(fileURL, showProgress = FALSE)
MichaelChirico
Arun

In my experience, the easiest way to fix this problem is to just remove the s from https. Also remove the method argument; you don't need it. My OS is Windows, and I have tried the following code and it works.

fileURL <- "http://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv"
download.file(fileURL, "Idaho2006.csv") 
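
Whichever way the file is downloaded, it may be worth checking that something actually arrived before handing it to fread. The sketch below assumes (this is not confirmed anywhere in the thread) that an empty or missing download is what triggers the internal error; downloaded_ok is a made-up helper name.

```r
# Hypothetical helper: TRUE only if `path` exists and is not empty.
downloaded_ok <- function(path) {
  size <- file.info(path)$size  # NA when the file does not exist
  !is.na(size) && size > 0
}

# Self-contained demonstration using temporary files, no network needed.
good <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2"), good)
empty <- tempfile(fileext = ".csv")
invisible(file.create(empty))

downloaded_ok(good)        # file with content
downloaded_ok(empty)       # zero-length file
downloaded_ok(tempfile())  # path that was never created
```

With the code above, the call would be if (downloaded_ok("Idaho2006.csv")) DT <- fread("Idaho2006.csv").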
ksaittis