I've been trying to batch-download PDFs from a list of URLs. Sadly, each of these URLs actually points to a visualisation page for its PDF, with a download button on it, and I can't figure out how to get at the files themselves.
When I did this for a different website, I used this code (now with some of the links I need):
urls <- c("https://dom-web.pbh.gov.br/visualizacao/edicao/2714",
"https://dom-web.pbh.gov.br/visualizacao/edicao/2714",
"https://dom-web.pbh.gov.br/visualizacao/edicao/2716",
"https://dom-web.pbh.gov.br/visualizacao/edicao/2718",
"https://dom-web.pbh.gov.br/visualizacao/edicao/2720",
"https://dom-web.pbh.gov.br/visualizacao/edicao/2721")
names <- c("DECRETO Nº 17.297.pdf",
"DECRETO Nº 17.298.pdf",
"DECRETO Nº 17.304.pdf",
"DECRETO Nº 17.308.pdf",
"DECRETO Nº 17.309.pdf",
"DECRETO Nº 17.313.pdf")
# download each file, using binary mode so the PDFs aren't corrupted
for (i in seq_along(urls)) {
  download.file(urls[i], destfile = names[i], mode = "wb")
}
For another website, this led to nice PDFs being downloaded to my working directory. This one just gives me empty files. I've tried the solutions from https://stackoverflow.com/questions/36359355/r-download-pdf-embedded-in-a-webpage and https://stackoverflow.com/questions/42468831/how-to-set-up-rselenium-for-r, but I continue to fail miserably.
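My suspicion is that these URLs return the viewer page's HTML rather than the PDF itself, so download.file saves a page (or nothing useful) instead of the document. A quick check along these lines should confirm what's actually coming back (this uses httr; the content types mentioned in the comments are just my guesses):

library(httr)

# Fetch one of the viewer URLs and see what the server actually returns.
resp <- GET(urls[1])

# If this prints "text/html" instead of "application/pdf", then
# download.file has been saving the viewer page, not the document.
http_type(resp)

# A valid PDF starts with the magic bytes "%PDF"; I'd expect the
# files the loop produced to fail this check.
rawToChar(readBin(names[1], "raw", n = 4))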
If anyone has a lightbulb moment and can help me out, that would be the bee's knees.
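In case it's useful, here's roughly what my RSelenium attempt (following the second link) looked like. The ".download-button" selector is a placeholder, since I haven't identified the real markup for the button on these pages, and a clicked download would land in the browser's download directory rather than my working directory:

library(RSelenium)

# Start a Selenium-driven browser, as set up in the linked question
# (browser and port may need adjusting on your machine).
driver <- rsDriver(browser = "firefox", port = 4445L)
remote <- driver$client

remote$navigate(urls[1])
Sys.sleep(5)  # crude wait for the viewer to finish rendering

# Placeholder selector: the real one has to come from inspecting the page.
button <- remote$findElement(using = "css selector", ".download-button")
button$clickElement()

remote$close()
driver$server$stop()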