I'm working on scripting some dataset downloads in R from the Center for Survey and Survey/Registrar data, a Nesstar-based data archive: http://cssr.surveybank.aau.dk/webview
Poking around, I've found there are bookmarkable links for each dataset in each format, e.g., http://cssr.surveybank.aau.dk/webview/velocity?format=STATA&includeDocumentation=on&execute=&ddiformat=pdf&study=http%3A%2F%2F172.18.36.233%3A80%2Fobj%2FfStudy%2FElectionStudy-1973&analysismode=table&v=2&mode=download
There's no username or password required to use the site, so that's one bullet dodged. But the next step is to click on the "Download" button, and that's where I'm stumped. The question Using R to "click" a download file button on a webpage sounds like it should be right on, but the webpage there isn't actually similar. Unlike that one, this button is not part of a form, so my efforts using html_form() and submit_form() predictably got nowhere. (And it's not a link, so of course follow_link() won't work either.) The following gets me to the right node, but doesn't actually click the button:
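library(magrittr)  # provides the %>% pipe
library(rvest)

# Bookmarkable download page for the STATA version of ElectionStudy-1973
url <- "http://cssr.surveybank.aau.dk/webview/velocity?format=STATA&includeDocumentation=on&execute=&ddiformat=pdf&study=http%3A%2F%2F172.18.36.233%3A80%2Fobj%2FfStudy%2FElectionStudy-1973&analysismode=table&v=2&mode=download"
s <- html_session(url)  # renamed session() in rvest >= 1.0.0
# Selects the "Download" button node, but does not click it
download_button <- s %>% html_node(".button")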
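To show what I mean by "not part of a form", here's a minimal offline sketch. The inline HTML and its onclick handler (startDownload()) are made up for illustration and are not the archive's actual page source; the only point is that html_form() finds nothing when a button sits outside any form, while the node itself can still be selected and inspected.

```r
library(rvest)

# Hypothetical markup (an assumption, NOT the real page): a button that is
# not wrapped in a <form> and is driven by a made-up JavaScript handler.
page <- read_html(
  '<html><body>
     <input class="button" type="button" value="Download"
            onclick="startDownload()">
   </body></html>'
)

# html_form() only collects <form> elements, so there is nothing to submit.
forms <- html_form(page)
length(forms)

# The button can still be selected by its class and its attributes listed,
# which is where any clue about what a click triggers would show up.
button <- html_node(page, ".button")
html_attrs(button)
html_attr(button, "onclick")
```

Running html_attrs() on the real download_button from the session above would be the analogous check, though I can't say in advance which attributes it carries.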
Now that RSelenium is back on CRAN (yay!), I suppose I could go in that direction instead, but I'd really prefer an rvest- or httr-based solution. If anyone could help, I'd really appreciate it.