I have been trying to download some files from a website using web scraping in R with this code:

library(rvest)  # provides read_html(), html_nodes(), html_attr() and %>%

url <- "https://www.camara.leg.br/proposicoesWeb/prop_emendas?idProposicao=2261121&subst=0"

webpage <- read_html(url)

link <- webpage %>%
  html_nodes('.linkDownloadTeor') %>%
  html_attr("href") %>%
  paste('https://www.camara.leg.br/proposicoesWeb/', ., sep = "")

tot_links <- as.numeric(length(link))
vec <- data.frame(seq(1,tot_links)) 
vec <- setNames(vec,"indice")
vec$nome_arquivos <- paste("Emenda_",
                           vec$indice,
                           ".pdf",sep = "")

n=1
while (n<=tot_links) {
  try(download.file(link[n],destfile = vec$nome_arquivos[n],mode = "wb"))
  n=n+1
}

However, when I execute the code above I get the following error message:

Error in download.file(link[n], destfile = vec$nome_arquivos[n], mode = "wb") :
  cannot open URL 'https://www.camara.leg.br/proposicoesWeb/prop_mostrarintegra?codteor=1948041&filename=EMP+1+%3D%3E+PL+4372/2020'
In addition: Warning message:
In download.file(link[n], destfile = vec$nome_arquivos[n], mode = "wb") :
  InternetOpenUrl failed: '`}/âý'

This code worked when I used it on another website, so I do not understand why it is not working here.

Machavity

2 Answers

This works for me:

mapply(download.file, link, vec$nome_arquivos)
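
If some of the links might still fail, a variant along the following lines could be useful. It is only a sketch: download_all is a hypothetical helper name, not part of any package. It assumes you want binary mode (mode = "wb", which matters for PDFs on Windows) and a TRUE/FALSE status per file instead of an error that stops the run.

```r
# Hypothetical helper: download each URL to its matching file name and
# report which downloads succeeded instead of stopping at the first error.
download_all <- function(urls, destfiles) {
  mapply(
    function(u, f) {
      tryCatch(
        # download.file() returns 0 on success; mode = "wb" writes in binary mode
        download.file(u, destfile = f, mode = "wb", quiet = TRUE) == 0,
        error = function(e) FALSE
      )
    },
    urls, destfiles,
    USE.NAMES = FALSE
  )
}

# Usage with the objects from the question:
# ok <- download_all(link, vec$nome_arquivos)
# vec$nome_arquivos[!ok]   # files that could not be downloaded
```

Returning a logical vector makes it easy to retry only the failed files afterwards.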
Ronak Shah

In the tidyverse, we can use map2() from purrr:

library(purrr)
map2(link, vec$nome_arquivos, download.file)
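
A possible refinement, sketched under the assumption that failed downloads should be skipped and reported: purrr's possibly() wraps a function so that an error yields a fallback value. The names safe_download and download_all_purrr are hypothetical, chosen only for illustration.

```r
library(purrr)

# Wrap download.file() so that an error yields FALSE instead of aborting.
# download.file() returns 0 on success; mode = "wb" writes in binary mode.
safe_download <- possibly(
  function(u, f) download.file(u, destfile = f, mode = "wb", quiet = TRUE) == 0,
  otherwise = FALSE
)

# Hypothetical helper: one TRUE/FALSE status per URL.
download_all_purrr <- function(urls, destfiles) {
  map2_lgl(urls, destfiles, safe_download)
}

# Usage with the objects from the question:
# ok <- download_all_purrr(link, vec$nome_arquivos)
# link[!ok]   # URLs that could not be downloaded
```

map2_lgl() is used here instead of map2() so the result is a plain logical vector that can index link directly.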
akrun