I need to download an .xls file from a website using Chrome and Selenium. There are several websites I need to visit, so I need to open new tabs. However, when I open a second tab, I cannot download the file I need. Below is a simplified version of my code. Imagine that I have just downloaded a file from one tab and then opened a new one using window.open():

import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# SAVE_PATH (download folder) and DRIVE_PATH (chromedriver binary) are defined elsewhere
options = Options()
prefs = {"download.default_directory": SAVE_PATH, "download.prompt_for_download": False}
options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(executable_path=DRIVE_PATH, options=options)

driver.execute_script("window.open('https://www.fhfa.gov/DataTools/Downloads/Pages/House-Price-Index-Datasets.aspx#mpo');") 
time.sleep(5)
driver.switch_to.window(driver.window_handles[1]) 
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//*[@id='WebPartWPQ2']/div[1]/table[3]/tbody/tr[2]/td[2]/p/a"))).click()

Without opening a new tab, I can download the file successfully. But after opening a new tab, Chrome tells me "Fail - Download error". Is something wrong with my code?

wwj123

2 Answers

On macOS with Chrome 76.0.3809.100 and ChromeDriver 75.0.3770.140, the download succeeds both ways (with and without opening a new tab).
To locate the download link, it is better to use the selector below. You can find more information about locator strategies here

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[href*='HPI_PO_summary.xls']"))).click()

A faster way is to use requests to download the file directly from https://www.fhfa.gov/. Here's an example:

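import os
import requests

# SAVE_PATH is the target folder, defined elsewhere
file_name = "HPI_PO_summary.xls"
response = requests.get(f'https://www.fhfa.gov/DataTools/Downloads/Documents/HPI/{file_name}')
response.raise_for_status()  # stop here on an HTTP error instead of saving an error page

# Write the raw bytes of the response to disk
with open(os.path.join(SAVE_PATH, file_name), 'wb') as f:
    f.write(response.content)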
Sers
  • Thanks, Sers. The `requests` approach works well. With the Selenium approach, switching from XPATH to CSS_SELECTOR doesn't seem to solve it -- I still get the same error message. I am using the most up-to-date versions. Could certain Chrome settings be triggering the error message? – wwj123 Aug 12 '19 at 21:59

Answering my own question here:

It seems the issue was in SAVE_PATH. Initially my SAVE_PATH was:

r"C:\Users\hw\Desktop\myfile\"

For some reason, it works (based on the answer here) if I add one more backslash to the end of the path:

r"C:\Users\hw\Desktop\myfile\\"
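
To avoid hand-editing the trailing backslash, here is a minimal sketch of normalizing the folder before passing it as `download.default_directory`. The helper name `windows_download_dir` is hypothetical, and the assumption that Chrome wants backslash separators plus a trailing separator comes only from the observation above, not from Chrome's documentation. Note also that a Python raw string literal cannot end in a single backslash, so the separator has to be appended some other way.

```python
import ntpath  # Windows path rules, usable on any OS


def windows_download_dir(path: str) -> str:
    """Return `path` with backslash separators and one trailing backslash.

    Hypothetical helper: it only tidies the string, it does not check
    that the folder exists.
    """
    # normpath converts '/' to '\\', collapses repeated separators,
    # and drops any trailing separator
    normalized = ntpath.normpath(path)
    # joining with an empty component re-appends a single separator
    return ntpath.join(normalized, "")


# Both spellings of the folder end up as the same string
print(windows_download_dir("C:/Users/hw/Desktop/myfile"))
print(windows_download_dir(r"C:\Users\hw\Desktop\myfile\\"))
```

The result can then be used as SAVE_PATH in the `prefs` dictionary from the question.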
wwj123