I am trying to check periodically for the date of the latest downloadable file added to the page https://github.com/mrc-ide/global-lmic-reports/tree/master/data, where the file names look like 2021-05-22_v8.csv.zip.
A code snippet from "Using R to scrape the link address of a downloadable file from a web page?" can be used with a tweak. It identifies the first (earliest) downloadable file on the page, as shown below.
library(rvest)
library(stringr)
library(xml2)
page <- read_html("https://github.com/mrc-ide/global-lmic-reports/tree/master/data")
page %>%
  html_nodes("a") %>%               # find all links
  html_attr("href") %>%             # get the URL of each link
  str_subset("\\.csv\\.zip$") %>%   # keep those that end in .csv.zip
  .[[1]]                            # look at the first one
Returns: [1] "/mrc-ide/global-lmic-reports/blob/master/data/2020-04-28_v1.csv.zip"
The question is: what code would identify the date of the latest .csv.zip file? For example, 2021-05-22_v8.csv.zip was the latest file when checked on 2021-06-01.
The purpose is this: if that date (i.e., 2021-05-22) is later than the latest update I have created in https://github.com/pourmalek/covir2 (e.g., IMPE 20210522 in https://github.com/pourmalek/covir2/tree/main/20210528), then a new update needs to be created.
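To make the intended check concrete, here is a minimal sketch of the comparison in base R. It assumes the scraped links are already available as a character vector (called `hrefs` here, a hypothetical name) and that every file name of interest starts with a `YYYY-MM-DD` date. The helper `latest_file_date()` and the variable `last_local` are illustrative names only, not part of rvest or stringr.

```r
# Hypothetical helper: return the newest date found among links that
# end in .csv.zip and whose file name starts with YYYY-MM-DD.
latest_file_date <- function(hrefs) {
  # Keep only links ending in .csv.zip and drop the directory part
  files <- basename(hrefs[grepl("\\.csv\\.zip$", hrefs)])
  # Keep only file names that begin with a date-like prefix
  files <- files[grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}", files)]
  # Parse the first 10 characters as a date and take the most recent
  dates <- as.Date(substr(files, 1, 10), format = "%Y-%m-%d")
  max(dates, na.rm = TRUE)
}

# Example input; in practice this would be the output of html_attr("href")
hrefs <- c(
  "/mrc-ide/global-lmic-reports/blob/master/data/2020-04-28_v1.csv.zip",
  "/mrc-ide/global-lmic-reports/blob/master/data/2021-05-22_v8.csv.zip",
  "/mrc-ide/global-lmic-reports/tree/master/data"
)

latest <- latest_file_date(hrefs)                         # 2021-05-22
# Date of my latest update (assumed to be stored as YYYYMMDD)
last_local <- as.Date("20210522", format = "%Y%m%d")
needs_update <- latest > last_local                       # FALSE: same date
```

Comparing parsed `Date` values rather than position in the list means the result does not depend on the order in which the page lists the files.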