I have tried every method I can think of for getting the PDF from this link: http://apps.colorado.gov/dora/licensing/Lookup/LicenseLookup.aspx?docExternal=926241&docGuid=8DC9BB72-A921-45E7-9BCD-358846FCE54D
So far I have tried:
- Clicking the button for this link
- Opening the href manually in the webdriver
- Using WebDriverWait with various conditions to wait for URL changes or for certain URLs to appear
- Sleeping and re-getting the page_source
- Using a try statement to catch the TimeoutException and issuing more commands from there
Every attempt at opening this link results in a timeout exception, even though it works just fine manually.
It looks like it runs through 2(?) redirects before landing on the PDF file I'd like to grab. Is there anyone out there with Selenium experience who can point me in the right direction for getting this PDF? I'm running Selenium with ChromeDriver in a Python script.
ANSWER:
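import requests
from selenium.webdriver.common.by import By

# Inside the scraper class: collect every "External Document" link on the page.
# (Selenium 4 removed find_elements_by_link_text; find_elements(By.LINK_TEXT, ...) is the equivalent.)
download_buttons = self.browser.find_elements(By.LINK_TEXT, "External Document")
for button in download_buttons:
    new_file_path = f'{blah}.pdf'  # blah: placeholder for the output file name
    link = button.get_attribute("href")
    # Fetch the href with requests, which follows the redirects, instead of loading it in the browser.
    download_link = requests.get(link, allow_redirects=True)
    try:
        with open(new_file_path, 'wb') as new_file:
            new_file.write(download_link.content)
    except Exception as e:
        self.print_error(f"Failed to write file: {e}")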
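
Since the question mentions a couple of redirects, it may help to check what requests actually ended up with before writing it to disk. The following is only a rough sketch, not part of the original answer: redirect_chain and save_if_pdf are hypothetical helper names, and the code assumes the object passed in behaves like a requests Response (with status_code, content, history and url attributes).

```python
from pathlib import Path

# Every PDF file begins with these bytes; an HTML error or login page does not.
PDF_MAGIC = b"%PDF-"


def redirect_chain(response):
    """Return every URL visited, in order, ending with the final one.

    Hypothetical helper: relies on response.history holding the
    intermediate redirect responses, as requests provides.
    """
    return [hop.url for hop in response.history] + [response.url]


def save_if_pdf(response, path):
    """Write response.content to path only if it looks like a PDF.

    Returns True when the file was written, False otherwise.
    """
    if response.status_code != 200:
        return False
    if not response.content.startswith(PDF_MAGIC):
        return False
    Path(path).write_bytes(response.content)
    return True
```

In the loop above this could replace the plain write, e.g. save_if_pdf(download_link, new_file_path), with redirect_chain(download_link) printed for debugging. If the body turns out to be HTML rather than a PDF, one possible cause is that the site expects the browser's session cookies, which a bare requests.get does not send.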