I am trying to scrape the weather table on this website (https://www.timeanddate.com/weather/canada/vancouver/historic?month=10&year=2017) for every day in October. I was able to scrape the table for the first day of October with the following code:

library("rvest")
content<-read_html("https://www.timeanddate.com/weather/canada/vancouver/historic?month=10&year=2017") 
tables <- content %>% html_table(fill = TRUE)
tables[[2]]

I can also get the values of the drop-down menu, which have to be changed each time to generate a new table for October 2, 3, ...:

content %>%
html_nodes("#wt-his-select option")%>% html_attrs()

From similar questions, I understand that I need to use httr::POST or submit a form, but I have no clue how to get from here to the tables for October 2, 3, 4, ....

I tried this as well, but it seems the drop-down menu I am trying to select options from is not a form, as it does not show up here:

html_form(content)

Furthermore, I cannot use RSelenium: rsDriver fails with a "connection refused" error, and resolving that requires installing Docker, which I cannot do for now because of Windows problems. Any help would be greatly appreciated!

  • Duplicate? https://stackoverflow.com/questions/64471950/how-do-i-scrape-information-in-this-table-using-r/64474401#64474401 – QHarr Dec 26 '20 at 01:37
  • Thanks @QHarr! I had missed this, but it also does not have an answer for the remaining days of the month. – curiousmind Dec 28 '20 at 19:40

1 Answer


If you watch the Network tab in the browser's developer tools while selecting a day from the drop-down menu, you can see that the page sends a request to a URL similar to this: https://www.timeanddate.com/scripts/cityajax.php?n=canada/vancouver&mode=historic&hd=20171011&month=10&year=2017&json=1

You can use jsonlite to extract the data from it.
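
As a rough sketch of that idea, the code below builds one such URL per day of October 2017 by varying the hd parameter, and defines a helper to fetch and parse a single day. The function names build_day_url and fetch_day are made up for illustration. It is an untested assumption that the endpoint accepts every date in this form and that the response body is strictly valid JSON, so the helper falls back to returning the raw text when parsing fails.

```r
# Build the request URL(s) for one or more dates.
# The query parameters mirror the example URL seen in the Network tab.
build_day_url <- function(date, place = "canada/vancouver") {
  date <- as.Date(date)
  sprintf(
    "https://www.timeanddate.com/scripts/cityajax.php?n=%s&mode=historic&hd=%s&month=%d&year=%d&json=1",
    place,
    format(date, "%Y%m%d"),
    as.integer(format(date, "%m")),
    as.integer(format(date, "%Y"))
  )
}

# Fetch one day's data (requires the httr and jsonlite packages).
# Assumption: the body parses as JSON; if it does not, the raw text is
# returned so it can be inspected and cleaned up by hand.
fetch_day <- function(url) {
  resp <- httr::GET(url)
  httr::stop_for_status(resp)
  txt <- httr::content(resp, as = "text", encoding = "UTF-8")
  tryCatch(jsonlite::fromJSON(txt), error = function(e) txt)
}

# One URL per day in October 2017
days <- seq(as.Date("2017-10-01"), as.Date("2017-10-31"), by = "day")
urls <- build_day_url(days)

# Not run: request each day, pausing between calls to be polite to the server
# results <- lapply(urls, function(u) { Sys.sleep(1); fetch_day(u) })
```

Inspect one result with str() before looping over the whole month, since the structure of the returned data is not documented here.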

xwhitelight