
I'm trying to write R code that loops over a few months of dates and extracts the Daily Observations table for each day from this website: https://www.wunderground.com/history/daily/KMESTEUB5/date/2020-6-1. However, I'm struggling to get the data from the web page. When I read the page's HTML in R with code such as

library(glue)   # glue() for string interpolation
library(rvest)  # read_html()
url <- glue("https://www.wunderground.com/history/daily/KMESTEUB5/date/{year}-{month}-{day}")
html <- read_html(url)

the actual data is nowhere to be found among the HTML nodes. I used "Inspect" on the webpage in Chrome and found where the table is in the HTML (shown here). However, in the HTML that R loads, the nodes stop before the table; the last one is the node for 'observation-title'. I don't understand why R does not load all of the HTML. I have tried multiple functions to read the webpage, and all gave the same result. Any help would be greatly appreciated.
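
For reference, the loop over dates could be sketched roughly as follows. This only builds the URLs and does not fetch anything. The date range (June through August 2020) and the helper name `build_url` are assumptions made for illustration; the station ID and the unpadded `2020-6-1` date style come from the URL above.

```r
# Station ID taken from the URL in the question.
station <- "KMESTEUB5"

# Assumed date range for illustration: one entry per day, June-August 2020.
dates <- seq(as.Date("2020-06-01"), as.Date("2020-08-31"), by = "day")

# Hypothetical helper: the example URL uses unpadded month and day
# ("2020-6-1"), so each date part is converted to an integer first.
build_url <- function(d, station) {
  sprintf(
    "https://www.wunderground.com/history/daily/%s/date/%d-%d-%d",
    station,
    as.integer(format(d, "%Y")),
    as.integer(format(d, "%m")),
    as.integer(format(d, "%d"))
  )
}

# sprintf() is vectorised, so this yields one URL per date.
urls <- build_url(dates, station)
```

Each element of `urls` can then be passed to whatever function ends up reading the page.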

  • Have you checked these examples? - https://stackoverflow.com/search?q=%5Br%5D+wunderground – Ronak Shah Aug 25 '20 at 02:26
  • For websites that require JavaScript to run, you'll want to use something like RSelenium to drive a headless browser you can interact with. See also: blog.brooke.science/posts/scraping-javascript-websites-in-r. Alternatively, open your browser's dev tools, check whether the data is being pulled from a different URL, and grab that directly. – MrFlick Aug 25 '20 at 02:51
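
The RSelenium suggestion in the comment above might look something like the sketch below. It is untested against this site and rests on several assumptions: Firefox and a matching driver are available to `rsDriver()`, a fixed wait of a few seconds is long enough for the page's JavaScript to fill in the table, and the observations appear in an ordinary `<table>` element. The function name `scrape_rendered_tables` and the `wait_seconds` parameter are hypothetical. As written it opens a regular browser window; running it headless would need extra browser capabilities that are not shown here.

```r
# Hypothetical helper: load a page in a real browser so its JavaScript runs,
# then hand the rendered HTML to rvest. Requires the RSelenium and rvest packages.
scrape_rendered_tables <- function(url, wait_seconds = 5) {
  # Start a Selenium server and open a browser session (assumes Firefox).
  driver <- RSelenium::rsDriver(browser = "firefox", verbose = FALSE)

  # Close the browser and stop the server even if something below fails.
  on.exit({
    driver$client$close()
    driver$server$stop()
  }, add = TRUE)

  driver$client$navigate(url)

  # Crude fixed wait for the JavaScript-rendered content (an assumption).
  Sys.sleep(wait_seconds)

  # getPageSource() returns a list; the first element is the HTML string.
  page_source <- driver$client$getPageSource()[[1]]
  page <- rvest::read_html(page_source)

  # Return every <table> on the rendered page as a list of data frames.
  rvest::html_table(rvest::html_elements(page, "table"))
}
```

Calling `scrape_rendered_tables(url)` for one day's URL would return a list of tables, and the Daily Observations table would still need to be picked out of that list by inspecting it.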
