
Many examples of downloading binary files with RCurl look like this:

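library("RCurl")
curl = getCurlHandle()
# getBinaryURL() returns the whole file as a raw vector held in memory
bfile = getBinaryURL(
        "http://www.example.com/bfile.zip",
        curl = curl,
        progressfunction = function(down, up) {print(down)}, noprogress = FALSE
)
# the data reaches the disk only after the download has finished
writeBin(bfile, "bfile.zip")
rm(curl, bfile)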

If the download is very large, I suppose it would be better to write it to disk as it arrives, instead of holding the whole file in memory.

The RCurl documentation has some examples that fetch files in chunks and process them as they are downloaded, but they all seem to deal with text chunks.

Can you give a working example?

UPDATE

A user suggests using R's native `download.file` with the `mode = 'wb'` option for binary files.

In many cases the native function is a viable alternative, but there are a number of use cases where it does not fit (https, cookies, forms, etc.), and this is the reason why RCurl exists.

antonio
  • `download.file` doesn't read into RAM.. can you provide an example file to download that `download.file` doesn't work on? :) – Anthony Damico Jan 20 '13 at 16:16

2 Answers


Here is a working example:

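library(RCurl)
# open a C-level file handle in binary write mode
f = CFILE("bfile.zip", mode="wb")
# pass the handle's reference so libcurl writes the data directly to the file
curlPerform(url = "http://www.example.com/bfile.zip", writedata = f@ref)
# close the handle to flush and release the file
close(f)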

This writes the download straight to the file. The return value is the status of the request (0 if no errors occur) rather than the downloaded data.
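The status mentioned above can be checked together with the file on disk. The following is only a sketch under stated assumptions: a local file reached through a `file://` URL stands in for the remote server so that it runs offline (this assumes Unix-style paths and a libcurl build with the `file` protocol enabled), and the names `src` and `dest` are made up for illustration. `followlocation` is a standard curl option that matters only for remote URLs that redirect.

```r
library(RCurl)

# Assumption: a local file served through a file:// URL stands in for a
# remote server, so the sketch runs offline (Unix-style paths).
src  <- tempfile(fileext = ".bin")
dest <- tempfile(fileext = ".bin")
writeBin(as.raw(sample(0:255, 1e5, replace = TRUE)), src)

f <- CFILE(dest, mode = "wb")
status <- curlPerform(
    url = paste0("file://", src),
    writedata = f@ref,
    followlocation = TRUE   # follow HTTP redirects when the URL is remote
)
close(f)                    # flush and release the file before inspecting it

# A status of 0 plus matching checksums suggests the copy is complete.
as.integer(status) == 0
unname(tools::md5sum(src)) == unname(tools::md5sum(dest))
```

Comparing checksums is possible here only because the source is local; for a real remote file, comparing against a published checksum or the expected size would play the same role.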

The RCurl manual's coverage of CFILE is a bit terse. Hopefully a future version will include more details and examples.

For convenience, here is the same code packaged as a function (with a progress bar):

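bdown = function(url, file){
    library('RCurl')
    # open the destination in binary write mode
    f = CFILE(file, mode="wb")
    # noprogress=FALSE turns on curl's progress meter
    a = curlPerform(url = url, writedata = f@ref, noprogress=FALSE)
    close(f)
    # a is the request status (0 if no errors occur)
    return(a)
}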

## ...and now just give remote and local paths     
ret = bdown("http://www.example.com/bfile.zip", "path/to/bfile.zip")
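One thing `bdown` does not cover is a failed transfer: `curlPerform` signals an R error, so `close(f)` is never reached and a partial file may be left behind. A possible variant is sketched below. The name `bdown_safe` and the cleanup policy (delete the partial file, return `FALSE`) are assumptions for illustration, not something RCurl provides; `failonerror` is a standard curl option that turns HTTP error responses into failures.

```r
library(RCurl)

# Hypothetical variant of bdown(); the name and the cleanup policy are
# assumptions for illustration.
bdown_safe <- function(url, file, ...) {
    f <- CFILE(file, mode = "wb")
    status <- tryCatch(
        curlPerform(url = url, writedata = f@ref, failonerror = TRUE, ...),
        error = function(e) {
            message("download failed: ", conditionMessage(e))
            NULL
        }
    )
    close(f)                  # always release the C-level file handle
    if (is.null(status)) {
        unlink(file)          # drop the partial (possibly empty) file
        return(invisible(FALSE))
    }
    invisible(TRUE)
}

# Offline demonstration with file:// URLs (assumes Unix-style paths).
src <- tempfile()
writeBin(as.raw(1:100), src)
ok  <- bdown_safe(paste0("file://", src), tempfile())
bad <- bdown_safe(paste0("file://", src, ".missing"), tempfile())
```

Extra curl options such as `noprogress = FALSE` can be passed through `...`.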
antonio
  • Is there any reason why this would *not* work when running R in BATCH? – MikeTP Oct 28 '14 at 01:01
  • If using this solution in a package you probably need to change the `close(f)` to `RCurl::close(f)` otherwise you may run into errors that it can't find the close method for CFILE. – Dan Tenenbaum Aug 20 '15 at 19:50

um.. use mode = 'wb' :) ..run this and follow along w/ my comments.

# create a temporary file and a temporary directory on your local disk
tf <- tempfile()
td <- tempdir()

# run the download file function, download as binary..  save the result to the temporary file
download.file(
    "http://sourceforge.net/projects/peazip/files/4.8/peazip_portable-4.8.WINDOWS.zip/download",
    tf ,
    mode = 'wb' 
)

# unzip the files to the temporary directory
files <- unzip( tf , exdir = td )

# here are your files
files
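`download.file` also returns an integer status invisibly (0 for success), so the result can be checked in a script. The sketch below is an illustration only: it copies a local file through a `file://` URL so that it runs without network access (this assumes Unix-style paths), and the names `src` and `dest` are made up.

```r
# Assumption: a local file:// URL stands in for a remote binary file.
src  <- tempfile(fileext = ".bin")
dest <- tempfile(fileext = ".bin")
writeBin(as.raw(0:255), src)

# mode = 'wb' writes the bytes unchanged; the return code is 0 on success
rc <- download.file(paste0("file://", src), dest, mode = "wb", quiet = TRUE)

# compare the bytes of the copy with the original
same <- identical(
    readBin(src,  "raw", n = file.info(src)$size),
    readBin(dest, "raw", n = file.info(dest)$size)
)
rc
same
```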
Anthony Damico
  • It works, thanks. I didn't check 'wb', because with other sites it works without. E.g. `download.file("http://www.nirsoft.net/utils/gdiview.zip", "gdiview.zip")`, so I attributed it to the redirection method of sf.net. Now, since I use EMACS ESS, I have to solve the problem of showing the progress bar, which is not on the console, as RCurl, but uses a GUI widget. – antonio Jan 20 '13 at 20:42
  • @antonio mark it as accepted and also edit the title, since the answer doesn't involve RCurl :P – Anthony Damico Jan 20 '13 at 21:35
  • While yours is a good alternative, I think investigating RCurl binary downloads (`R_curl_write_binary_data` etc.) is still interesting. – antonio Jan 22 '13 at 09:29
  • it may be interesting but it's not the answer to the question you posed ;) – Anthony Damico Jan 22 '13 at 15:12
  • I wrote: “In RCurl documentation there are some examples to get files by chunks [...], but they seem all referred to _text chunks_. Can you give a working example [for _binary ones_]?” You asked me to give you an example with `download.file` and I did. You asked me to move it to the question and I did. I thank you again for your "wb" hint, but I was, and still am, interested in RCurl. – antonio Jan 22 '13 at 16:23