I'm trying to download a PDF file from this address: https://aisweb.decea.mil.br/inc/notam/gerar-boletim/reports/report-notam.cfm

I wrote some code that first fills out some information on this page (correctly): https://aisweb.decea.mil.br/?i=notam and then clicks a button that opens the generated PDF file in a new tab. The problem is that when it tries to save the PDF at the end, it downloads directly from the .cfm address, which results in an empty PDF template (you can see this by opening the first link).

How can I download the PDF that is currently being shown to me on the page, instead of accessing the first URL directly?

This is my code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver import ActionChains
from selenium.webdriver.common.print_page_options import PrintOptions
from urllib import request
from bs4 import BeautifulSoup
import re
import os
import urllib
import time
import requests
from urllib.parse import urljoin

aerodromos = "SBNT,SBJP,SBFZ,SBRF"                          #TEST

# Create a single driver; calling webdriver.Chrome() twice opens two browser windows.
options = webdriver.ChromeOptions()
driver = webdriver.Chrome(options=options)

driver.get("https://aisweb.decea.mil.br/?i=notam")
driver.maximize_window()
caixaTexto = driver.find_element(By.XPATH,'//*[@id="icaocode"]')
caixaTexto.send_keys(aerodromos)

botao = driver.find_element(By.XPATH, '//*[@id="a"]/form/div/div[3]/div/input[2]')
botao.click()

botao = driver.find_element(By.XPATH, '//*[@id="select-all"]')
botao.click()

botao = driver.find_element(By.XPATH, '/html/body/div/div/div/div/div[2]/div/div/form/input[3]')
botao.click()

response = urllib.request.urlretrieve('https://aisweb.decea.mil.br/inc/notam/gerar-boletim/reports/report-notam.cfm', filename='relatorio1.pdf')
  • If I understand your question, you want to download the file directly? Just do a search for 'download files with selenium' - it basically entails setting the browser to download the pdfs, not display them. – Barry the Platipus Sep 14 '22 at 19:14
  • @BarrythePlatipus I tried disabling the chrome PDF viewer and the download prompt so that when it clicked the button the file would download instead of open, but nothing changed, it still opened the file online. – Alexandre Viegas Sep 14 '22 at 19:24
  • Did you try with a firefox/geckodriver setup? Chrome can be temperamental with these aspects. Give it a go with Firefox. – Barry the Platipus Sep 14 '22 at 19:34
  • I did it, thanks! Firefox didn't work either, but I found a way to do it in Chrome. – Alexandre Viegas Sep 14 '22 at 22:25

1 Answer

I did it! When I tried to change the settings in Chrome to download PDFs instead of opening them, it made no difference, but I ended up finding a solution while searching for another way to do it.

The solution came from this question: "Unable to access the modal elements to download pdf with selenium"

I changed the Chrome experimental options profile in my code and it worked! Now the script opens the tab, downloads the file immediately, and closes the tab.
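The exact profile isn't shown above, so here is a minimal sketch of what such a change usually looks like. It assumes the settings are passed as Chrome `prefs` through `add_experimental_option`, and that `plugins.always_open_pdf_externally` is what makes Chrome save the PDF instead of displaying it. The helper names (`pdf_download_prefs`, `make_driver`, `wait_for_pdf`) and the download folder are hypothetical, not taken from the original code.

```python
import os
import time


def pdf_download_prefs(download_dir):
    """Chrome preferences that save PDFs to disk instead of opening the viewer."""
    return {
        # Chrome expects an absolute path for the download folder.
        "download.default_directory": os.path.abspath(download_dir),
        # Do not show the "Save as" dialog.
        "download.prompt_for_download": False,
        # Download PDFs rather than rendering them in the built-in viewer.
        "plugins.always_open_pdf_externally": True,
    }


def make_driver(download_dir):
    """Start Chrome with the preferences above applied (hypothetical helper)."""
    # Imported here so the other helpers can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", pdf_download_prefs(download_dir))
    return webdriver.Chrome(options=options)


def wait_for_pdf(download_dir, timeout=30.0, poll=0.5):
    """Return the path of the first finished PDF found in download_dir.

    Chrome writes partial downloads as *.crdownload, so the download is
    treated as complete once a .pdf exists and no .crdownload file remains.
    Assumes download_dir holds no other PDFs beforehand.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        names = sorted(os.listdir(download_dir))
        pdfs = [n for n in names if n.lower().endswith(".pdf")]
        if pdfs and not any(n.endswith(".crdownload") for n in names):
            return os.path.join(download_dir, pdfs[0])
        time.sleep(poll)
    raise TimeoutError(f"No finished PDF appeared in {download_dir!r}")
```

With this, the script in the question would create the browser with `driver = make_driver("relatorios")` (an empty folder of your choice), run the same form steps, and after the last `botao.click()` call `wait_for_pdf("relatorios")` instead of `urllib.request.urlretrieve(...)`. The file then comes from the browser session that generated the report, not from a fresh request to the .cfm URL.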