I am calling an API to download files using the curl package in R. For some unknown reason, the connection sometimes breaks with this error: `Error in curl_download(url = i, handle = h, df) : HTTP error 400`.

I need to download about 100,000 files at a time and the process is very slow, so I would like R to retry when this error occurs instead of throwing an error and stopping the script. Any thoughts?

My code (simplified version):

library(curl)

# h: a curl handle set up earlier (not shown)
for (url in allUrl) {
  df <- tempfile()
  tryCatch(
    curl_download(url = url,
                  destfile = df,
                  handle = h)
  )
}
Arthur

1 Answer

Hadley Wickham's httr package has a function designed exactly for this, `RETRY`:

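library(httr)

maxTimes <- 10                  # maximum number of attempts per URL
testFilename <- "testfile.txt"  # destination file used in the GET example below

for (url in allUrl) {
  # terminate_on = NULL: keep retrying on any failed request until maxTimes is reached
  RETRY(verb = "GET", url = url, times = maxTimes,
        quiet = FALSE, terminate_on = NULL)
}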

Specifically for file downloads with authentication, you can replace the RETRY command with:

GET(url, write_disk(path=testFilename, overwrite=TRUE), authenticate("user", "passwd"))
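
Note that a bare `GET` makes a single attempt. `RETRY` forwards unnamed extra arguments to the request, so the `write_disk()` and `authenticate()` pieces can also be handed to it directly. Below is a minimal sketch under a few assumptions: each URL is saved to its own file, and the helper names `dest_for` and `download_with_retry`, the `downloads` directory, and the credentials are placeholders for illustration, not part of the original code.

```r
library(httr)

# Hypothetical helper: derive a local file name from the URL's last path segment,
# ignoring any query string or fragment.
dest_for <- function(url, dir = "downloads") {
  file.path(dir, basename(sub("[?#].*$", "", url)))
}

# Hypothetical helper: download one URL with retries; returns TRUE on success.
download_with_retry <- function(url, dest, times = 10) {
  resp <- tryCatch(
    RETRY(
      verb = "GET", url = url, times = times, quiet = FALSE,
      write_disk(path = dest, overwrite = TRUE),
      authenticate("user", "passwd")  # placeholder credentials
    ),
    error = function(e) NULL  # connection was still failing on the last attempt
  )
  ok <- !is.null(resp) && !http_error(resp)
  if (!ok && file.exists(dest)) unlink(dest)  # drop partial or error-page files
  ok
}

# Usage, assuming `allUrl` is the character vector of URLs from the question:
# dir.create("downloads", showWarnings = FALSE)
# ok <- vapply(allUrl, function(u) download_with_retry(u, dest_for(u)), logical(1))
# failed <- allUrl[!ok]
```

Returning a logical per URL instead of stopping means one bad file does not abort a long batch, and the `failed` vector can be fed back in for another pass.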
lilster
  • Thanks, I tried the `GET` function, but it said `Error in write_disk(filename = df, overwrite = TRUE) : unused argument (filename = df)` – Arthur Jan 02 '18 at 21:54
  • BTW, if I stick with `RETRY`, how can I put `df` in it? Also, I'm not familiar with handles in `httr`; would you mind giving me some instructions on how to set the username and password with `httr`? Thank you! – Arthur Jan 02 '18 at 21:56
  • @Arthur, I've fixed the code for `GET`. The package has since been updated, and the parameter is now named `path`. – lilster Jan 02 '18 at 22:01
  • I've added the username/password to the solution. For future reference, it would be better to include that component of your code in the original question. – lilster Jan 02 '18 at 22:02
  • Thank you very much, so `GET` alone has the ability to retry? – Arthur Jan 02 '18 at 22:05
  • @Arthur, actually, I think you should instead supply those arguments to `RETRY` to be safe. It's been a while. That is, use `RETRY(verb = "GET", url = url, times = maxTimes, quiet = FALSE, terminate_on = NULL, write_disk(path=testFilename, overwrite=TRUE), authenticate("user", "passwd"))` – lilster Jan 03 '18 at 15:15
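
For anyone who would rather keep the `curl_download` call from the question, the retry can also be written by hand with `tryCatch`. This is a rough sketch, not a tested solution for the API in question: `with_retry` is a hypothetical helper, and the attempt count and pause length are arbitrary assumptions.

```r
# Hypothetical helper: call `fn()` until it succeeds or `times` attempts are used up.
# Returns the value of `fn()` on success, or NULL (with a warning) if every attempt failed.
with_retry <- function(fn, times = 5, pause = 1) {
  for (attempt in seq_len(times)) {
    # Wrap the value in a list so a successful result is never mistaken for an error
    result <- tryCatch(
      list(value = fn()),
      error = function(e) e
    )
    if (!inherits(result, "error")) return(result$value)
    message(sprintf("Attempt %d of %d failed: %s",
                    attempt, times, conditionMessage(result)))
    if (attempt < times) Sys.sleep(pause)  # wait before the next attempt
  }
  warning("All attempts failed; giving up on this item", call. = FALSE)
  NULL
}

# Usage with the loop from the question (`allUrl` and the handle `h` come from there):
# for (url in allUrl) {
#   df <- tempfile()
#   with_retry(function() curl::curl_download(url = url, destfile = df, handle = h))
# }
```

Because the helper warns and returns `NULL` rather than stopping, the loop moves on to the next URL after the final failed attempt.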