This code attempts to download a page that does not exist:
url <- "https://en.wikipedia.org/asdfasdfasdf"
status_code <- download.file(url, destfile = "output.html", method = "libcurl")
This returns a 404 error:
trying URL 'https://en.wikipedia.org/asdfasdfasdf'
Error in download.file(url, destfile = "output.html", method = "libcurl") :
cannot open URL 'https://en.wikipedia.org/asdfasdfasdf'
In addition: Warning message:
In download.file(url, destfile = "output.html", method = "libcurl") :
cannot open URL 'https://en.wikipedia.org/asdfasdfasdf': HTTP status was '404 Not Found'
but the status_code variable still contains a 0, even though the documentation for download.file states that the returned value is:
An (invisible) integer code, 0 for success and non-zero for failure. For the "wget" and "curl" methods this is the status code returned by the external program. The "internal" method can return 1, but will in most cases throw an error.
The results are the same if I use curl or wget as the download method. What am I missing here? Is the only option to call warnings() and parse the output?
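For reference, here is a rough sketch of the warning-parsing workaround I would like to avoid. It assumes the warning text keeps the "HTTP status was '404 Not Found'" wording shown above, which may differ between R versions and download methods. The names extract_http_status and download_with_status are hypothetical helpers of my own, not part of base R.

```r
# Hypothetical helper: pull a three-digit HTTP status out of a warning message
# such as "... HTTP status was '404 Not Found'". Returns NA if none is found.
extract_http_status <- function(msg) {
  m <- regmatches(msg, regexpr("HTTP status was '[0-9]{3}", msg))
  if (length(m) == 0) return(NA_integer_)
  as.integer(sub("HTTP status was '", "", m, fixed = TRUE))
}

# Hypothetical wrapper: run download.file(), remember the text of the last
# warning it raises, and turn an error into a return value instead of stopping.
download_with_status <- function(url, destfile, ...) {
  warning_text <- NA_character_
  result <- tryCatch(
    withCallingHandlers(
      download.file(url, destfile = destfile, ...),
      warning = function(w) {
        # Keep the message so it can be parsed after the call finishes.
        warning_text <<- conditionMessage(w)
        invokeRestart("muffleWarning")
      }
    ),
    # Assumption: a failed download is reported as NA rather than an error.
    error = function(e) NA_integer_
  )
  list(code = result, http_status = extract_http_status(warning_text))
}
```

Calling download_with_status(url, "output.html", method = "libcurl") would then give a list whose http_status element holds the parsed code if the warning matched, or NA otherwise. This depends entirely on the message format, which is why I am hoping there is a cleaner way.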
I've seen other questions about using download.file, but none (that I can find) that actually retrieve the HTTP status code.