I have the following URL objects and need to check whether they are reachable before downloading and processing the CSV files. I can't hard-code the URLs because they keep changing based on previous steps.

My requirement: read the link if it is reachable; otherwise report an error and move on to the next link.

url1 <- "https://s3.mydata.csv"
url2 <- "https://s4.mydata.csv"
url3 <- "https://s5.mydata.csv"

(The code below will be repeated for the other two URLs as well.)

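readUrl <- function(url) {
  out <- tryCatch(
    {
      # Probe the URL first; readLines() signals an error if it cannot be opened
      readLines(con = url, n = 1, warn = FALSE)
      # Reachable: download and parse the CSV
      data.table::fread(url, sep = ",", header = TRUE, verbose = TRUE,
                        fill = TRUE, skip = 2)
    },
    error = function(cond) {
      # Report the problem and return NA so the caller can continue with the next URL
      message(cond)
      return(NA)
    }
  )
  return(out)
}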

urls <- c(url1, url2, url3)
y <- lapply(urls, readUrl)

MSM

1 Answer

Why not use the function url.exists directly from the RCurl package?

From documentation:

This function is analogous to file.exists and determines whether a request for a specific URL responds without error.

Function doc LINK

Using the boolean result of this function, you can easily adapt your starting code without tryCatch.
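
As a rough illustration of that idea, here is a minimal sketch. It assumes RCurl and data.table are installed; read_if_reachable, exists_fn and reader are hypothetical names introduced only for this example, and the fread arguments are copied from the question.

```r
# Sketch: read a CSV only when its URL responds, otherwise warn and skip it.
# read_if_reachable, exists_fn and reader are hypothetical names, not part of RCurl.
read_if_reachable <- function(url, ...,
                              exists_fn = RCurl::url.exists,
                              reader = data.table::fread) {
  # url.exists() returns TRUE/FALSE, so a plain if() replaces tryCatch()
  if (!isTRUE(exists_fn(url))) {
    warning("URL not reachable, skipping: ", url, call. = FALSE)
    return(NULL)
  }
  # Extra arguments (sep, header, skip, ...) are forwarded to the reader
  reader(url, ...)
}

# The URLs are ordinary variables, so they can be assigned at run time.
urls <- c(url1 = "https://s3.mydata.csv",
          url2 = "https://s4.mydata.csv",
          url3 = "https://s5.mydata.csv")

# Needs network access, so it is left commented out here:
# results <- lapply(urls, read_if_reachable,
#                   sep = ",", header = TRUE, fill = TRUE, skip = 2)
# results <- Filter(Negate(is.null), results)  # keep only the files that were read
```

Unreachable links produce a warning and a NULL entry, so lapply simply continues with the next URL. Passing the checker and the reader as arguments is only a convenience for trying the function without a network connection.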

Terru_theTerror
  • But I can't give the URLs explicitly, as they change dynamically. Can an object that holds a URL also be passed to the url.exists function? – MSM Feb 18 '21 at 11:07
  • Yes, you can pass a variable that has been dynamically assigned a URL to the function – Terru_theTerror Feb 18 '21 at 11:19
  • I tried the code below, but it returns FALSE for both URLs (i.e. the URL is not reachable). url_exists <- function(x) url.exists(as.character(x)) abc="https://s9.filedata.com/inventory/2020_feb.csv" xyz="https://s9.filedata.com/inventory/2020_jan.csv" df=data.frame('urls', c("abc","xyz")) df_exist <- mutate(df, exist = sapply(urls, url_exists)) – MSM Feb 18 '21 at 12:57
  • Maybe it is a proxy problem, like here: https://stackoverflow.com/questions/40391047/why-url-exists-returns-false-when-the-url-does-exists-using-rcurl – Terru_theTerror Feb 18 '21 at 15:14
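
Separately from any proxy issue, one thing that may be worth checking in the snippet from the comment above: data.frame('urls', c("abc","xyz")) stores the literal strings "abc" and "xyz" rather than the URLs held in those variables, and it does not create a column named urls. A minimal sketch of the difference, assuming the goal is one row per URL (the names wrong and df are used only for illustration):

```r
abc <- "https://s9.filedata.com/inventory/2020_feb.csv"
xyz <- "https://s9.filedata.com/inventory/2020_jan.csv"

# As written in the comment: the quoted names are plain strings, not the URLs,
# and neither column is called "urls".
wrong <- data.frame("urls", c("abc", "xyz"))

# Name the column explicitly and pass the variables themselves.
df <- data.frame(urls = c(abc, xyz), stringsAsFactors = FALSE)

# Needs network access and RCurl, so it is left commented out here:
# df$exist <- vapply(df$urls, RCurl::url.exists, logical(1), USE.NAMES = FALSE)
```

If the check still returns FALSE with the real URLs in the column, the proxy explanation from the linked question becomes the more likely cause.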