How can I change this code so R can properly download the files from the website in the code?

Question

I have been trying to download some files from a website using web scraping in R with this code:

library(rvest)  # provides read_html(), html_nodes(), html_attr() and the %>% pipe

url <- "https://www.camara.leg.br/proposicoesWeb/prop_emendas?idProposicao=2261121&subst=0"

webpage <- read_html(url)

# Collect the relative download links and prepend the site's base path
link <- webpage %>% html_nodes('.linkDownloadTeor') %>% html_attr("href") %>% paste('https://www.camara.leg.br/proposicoesWeb/', ., sep = "")

# One output file name per link: Emenda_1.pdf, Emenda_2.pdf, ...
tot_links <- length(link)
vec <- data.frame(indice = seq_len(tot_links))
vec$nome_arquivos <- paste("Emenda_", vec$indice, ".pdf", sep = "")

# Attempt each download; try() lets the loop continue after a failure
for (n in seq_len(tot_links)) {
  try(download.file(link[n], destfile = vec$nome_arquivos[n], mode = "wb"))
}

However, when I execute the code above I get the following error message:

Error in download.file(link[n], destfile = vec$nome_arquivos[n], mode = "wb") :
  cannot open URL 'https://www.camara.leg.br/proposicoesWeb/prop_mostrarintegra?codteor=1948041&filename=EMP+1+%3D%3E+PL+4372/2020'
In addition: Warning message:
In download.file(link[n], destfile = vec$nome_arquivos[n], mode = "wb") :
  InternetOpenUrl failed: '`}/âý'

This code worked when I used it on another website, so I do not understand why it is not working here.

It might have to do with http vs https, e.g. see: https://stackoverflow.com/a/33372798/12957340 — jared_mamrot, Dec 09 '20 at 00:24
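The "InternetOpenUrl failed" warning appears to come from the Windows-only "wininet" download method, so one thing worth trying is asking download.file() for a different method explicitly. The following is only a sketch of that idea, not a confirmed fix for this site: download_with_fallback and its fetch argument are hypothetical names introduced here, and the "curl" and "wget" methods assume those command-line tools are installed and on the PATH.

```r
# Hypothetical helper: try several download.file() methods in turn and stop at
# the first one that succeeds. `fetch` defaults to download.file and is only a
# parameter so the logic can be exercised without network access.
download_with_fallback <- function(url, destfile,
                                   methods = c("libcurl", "curl", "wget"),
                                   fetch = download.file) {
  for (m in methods) {
    status <- tryCatch(
      fetch(url, destfile = destfile, method = m, mode = "wb", quiet = TRUE),
      error = function(e) NA,
      warning = function(w) NA
    )
    # download.file() returns 0 on success
    if (isTRUE(status == 0)) return(m)
  }
  NA_character_  # none of the methods worked
}
```

Calling download_with_fallback(link[1], vec$nome_arquivos[1]) would return the name of the first method that worked, or NA if all of them failed.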

score 0 · Accepted Answer · answered Dec 09 '20 at 03:31

This works for me:

mapply(download.file, link, vec$nome_arquivos)
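Note that this call does not pass the mode = "wb" argument used in the question, which is generally needed for binary files such as PDFs on Windows. A minimal sketch of how it could be kept, using mapply()'s MoreArgs argument; download_pairs, make_names and the fetch parameter are hypothetical names introduced for illustration.

```r
# Hypothetical wrapper around the mapply() call; `fetch` is a parameter only so
# the sketch can be run without network access.
download_pairs <- function(urls, destfiles, fetch = download.file) {
  mapply(fetch, urls, destfiles,
         MoreArgs = list(mode = "wb", quiet = TRUE),  # same for every call
         USE.NAMES = FALSE)
}

# File names following the question's naming scheme:
# Emenda_1.pdf, Emenda_2.pdf, ...
make_names <- function(n) sprintf("Emenda_%d.pdf", seq_len(n))
```

With the objects from the question this would be called as download_pairs(link, make_names(length(link))). Unlike the try() loop in the question, an error in any single download stops the whole mapply() call.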

Ronak Shah


score 0 · Answer 2 · answered Dec 09 '20 at 22:06

With the tidyverse, we can use map2() from purrr:

library(purrr)
map2(link, vec$nome_arquivos, download.file)
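If some of the links may fail, a variant that keeps going and records which downloads succeeded can be useful. This is a sketch under the assumption that failures surface as R errors (warnings are not caught); download_all and its fetch parameter are hypothetical names, while possibly() and map2_int() are purrr functions.

```r
library(purrr)

# Hypothetical helper: like map2(), but a download that raises an error yields
# NA instead of aborting the whole run. `fetch` is a parameter only so the
# sketch can be run without network access.
download_all <- function(urls, destfiles, fetch = download.file) {
  safe_fetch <- possibly(fetch, otherwise = NA_integer_)
  map2_int(urls, destfiles, function(u, f) {
    # download.file() returns 0 on success
    as.integer(safe_fetch(u, f, mode = "wb"))
  })
}
```

The result of download_all(link, vec$nome_arquivos) would be an integer vector with 0 for each successful download and NA for each one that raised an error, so link[is.na(result)] gives the URLs to retry.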

akrun

