I am writing a program to collect all of the daily .csv files from this page. However, for some of the files, I get the error message:
Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") :
cannot open URL 'https://www.eride.ri.gov/eride2K5/AggregateAttendance/Data/05042016_DailyAbsenceData.csv': HTTP status was '404 Not Found'
Here is an example call, using the May 12, 2016 file (the error above came from the May 4 file):
read.csv(url("https://www.eride.ri.gov/eride2K5/AggregateAttendance/Data/05122016_DailyAbsenceData.csv"))
The bizarre thing is that if you go to the website, find the link to that file, and click it, R no longer gives the error and reads the file correctly. What is going on here, and how can I read those files without having to click each one manually? (Note: only the first person to try this will be able to reproduce the problem, because clicking the link fixes it for everyone afterwards.)
Ultimately, I want to use the following loop to collect all the files:
# Create a vector of dates. This is the interval the data is collected from.
dates = seq(as.Date("2016-05-01"), as.Date("2016-05-30"), by="days")
# Format to match the filename prefixes
dates = strftime(dates, '%m%d%Y')
# Create the vector of file names I want to read.
file.names = paste(dates,"_DailyAbsenceData.csv", sep = "")
# A loop that reads the .csv files into a list of data frames
daily.truancy = list()
for (i in seq_along(dates)) {
  tryCatch({ # keeps the loop running when read.csv cannot access a file
    daily.truancy[[i]] = read.csv(url(paste("https://www.eride.ri.gov/eride2K5/AggregateAttendance/Data/", file.names[i], sep = "")), sep = ",")
    message("School day") # reached only if the file was successfully read into the list
  }, error = function(e) {cat("ERROR :", conditionMessage(e), "\n")})
}
# Bind the daily data frames into one large panel
daily.truancy.2016 <- do.call("rbind", daily.truancy)
Note that the same error message is given for days when there is, in fact, no file (weekends). This is not a problem.
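Since the weekend misses are expected, one option would be to drop Saturdays and Sundays from the file list before looping, so that any 404 that remains is one of the puzzling ones. The following is only a minimal sketch of that idea: the names all.days, is.weekday and weekday.files are made up for illustration, and it assumes files are only ever posted Monday through Friday. Holidays would still come back as 404.

```r
# All calendar days in the collection window (same window as the loop above).
all.days <- seq(as.Date("2016-05-01"), as.Date("2016-05-30"), by = "days")

# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday and does not
# depend on the locale, unlike weekdays().
is.weekday <- as.POSIXlt(all.days)$wday %in% 1:5

# File names for Monday through Friday only (hypothetical helper vector).
weekday.files <- paste0(format(all.days[is.weekday], "%m%d%Y"),
                        "_DailyAbsenceData.csv")

# Inspect how many files are left to request and what the first few look like.
print(length(weekday.files))
print(head(weekday.files, 3))
```

Looping over weekday.files instead of file.names would leave the rest of the code unchanged.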