
I am downloading a series of URLs that return JSON into a list of lists, to be analyzed later.

    baseurl <- "http://zoeken.kvk.nl/Address.ashx?site=handelsregister&partialfields=&q=010"
    pages <- list()

    for(i in 1:99999){
      if(i < 10000){
        message("ignoring page ", i)
      }
      if(i >= 10000){
        message("Retrieving page ", i)
        mydata <- RJSONIO::fromJSON(paste0(baseurl,i), flatten=TRUE) 

        pages[[i+1]] <- mydata$resultatenHR
        # adding adjustment 1
        options(timeout = 4000000)
        # adding adjustment 2
        if(i %% 100 == 0){Sys.sleep(2)}
        if(i %% 1000 == 0){Sys.sleep(10)}
      }

    }

However, at irregular moments, I get one of the following errors:

error in open.connection(con, "rb") : Recv failure: Connection was reset. 

or

Error in file(con, "r") : cannot open the connection

I first tried adjustment 1 above, and then adjustment 2, but the problem keeps occurring. If I restart the loop at the point of the error, it works again until the next error, which again comes at an irregular moment.

How can I make R automatically resume the loop at the point of the error?

NB: I have seen the other topics on the open.connection error, but I either did not understand the answers given or they did not seem applicable to my type of code.

NB2: I have also tried using the jsonlite package instead of RJSONIO, but it gave the same errors at irregular moments. Thanks for your input.

zx8754
RobertHaa

1 Answer


I have almost exactly the same problem. It happens especially when I am trying to download larger datasets. I get an error message like this: "Error in open.connection(con, "rb") : Send failure: Connection was reset"

    final_results <- list()

    while(i < number){
        query <- paste0(url_start, i)
        json_result <- fromJSON(query)
        final_results[[i]] <- as.data.frame(json_result$records)
        i <- i+1
    }

Does anyone have an idea what I am doing wrong here?
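
One common way to cope with intermittent connection resets like the ones described in this thread is to wrap each request in `tryCatch` and retry it a few times before giving up, so that a single failed request does not abort the whole loop. The following is only a minimal sketch, not a tested fix for this particular server: `with_retry` and `download_pages` are hypothetical helper names, the retry count and waiting times are arbitrary assumptions, and it assumes `jsonlite` is installed.

```r
# Hypothetical helper: call `fetch()` up to `max_tries` times,
# pausing a little longer after each failed attempt.
with_retry <- function(fetch, max_tries = 5, wait = 2) {
  for (attempt in seq_len(max_tries)) {
    # Capture the error object instead of letting it stop the script
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait * attempt)
  }
  stop("Giving up after ", max_tries, " attempts: ", conditionMessage(result))
}

# Hypothetical wrapper around the loop from the question.
# Defined here but not called, because calling it needs network access.
download_pages <- function(baseurl, ids) {
  pages <- list()
  for (i in ids) {
    message("Retrieving page ", i)
    mydata <- with_retry(function() {
      jsonlite::fromJSON(paste0(baseurl, i), flatten = TRUE)
    })
    # Key by page number so a partial run is easy to inspect and resume
    pages[[as.character(i)]] <- mydata$resultatenHR
  }
  pages
}
```

With this, something like `pages <- download_pages(baseurl, 10000:99999)` would retry each failing page in place instead of needing a manual restart. If a page still fails after all attempts, the loop stops with an error that includes the last message, which at least shows where to resume.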