I am trying to download a CSV file from a website using Selenium with Python, and I am having problems with the actual download. The file does download, but instead of a CSV file I get an incomplete .tmp file (the real CSV should have 50,000+ lines, whereas the .tmp file has fewer than 100). When I download the file from the site manually, the complete CSV file is downloaded. Here is the code:

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    # config is the asker's own helper module for reading settings
    chromeDriver = config.get_prop('CHROME_DRIVER_PATH')

    chromeOpts = Options()
    prefs = {"download.default_directory":
                 "DESTINATION DIRECTORY (THIS WORKS)",
             }

    chromeOpts.add_experimental_option("prefs", prefs)

    driver = webdriver.Chrome(executable_path=chromeDriver, options=chromeOpts)

    driver.get("https://oasishub.co/login/?next=/downloads/b2a11100-eac5-4d10-869a-87ba064ede2d")

    # Log in
    usernameInput = driver.find_element_by_name("name")
    passwordInput = driver.find_element_by_name("password")
    usernameInput.send_keys("PROPER USERNAME (LEFT OUT)")
    passwordInput.send_keys("PROPER PASSWORD (LEFT OUT)")
    driver.find_element_by_xpath('//button[normalize-space()="Login"]').click()

    # Accept the license, then trigger the download
    licenseAgreeButton = driver.find_element_by_name("agree")
    licenseAgreeButton.click()
    driver.find_element_by_xpath("//input[@value='Get the resource']").click()

Any help and/or ideas would be greatly appreciated! Thanks!

Alex88

2 Answers

1

Add a wait at the end of your code so the Selenium browser doesn't close before the download has finished:

import time
time.sleep(30)  # keep the browser open for 30 seconds

Or

Declare your chromedriver variable outside of the function scope, which will leave the browser open until you close it.

  • Hi - thanks for the input. It doesn't look like that works, as I think the root of the problem has to do with the download of the .tmp rather than the .csv in the first place. Any ideas on how to set Chrome preferences to download csv rather than tmp (I have seen other examples of people doing that for Firefox)? Thanks! – Alex88 Jun 24 '21 at 22:16
  • Found this answer, Alex. It may help: https://stackoverflow.com/questions/46937319/how-to-use-chrome-webdriver-in-selenium-to-download-files-in-python/46938237#46938237 – Trystian May Jun 24 '21 at 22:24
0

You can define a function that waits for the download to finish, as in this topic: "python selenium, find out when a download has completed?"

from pathlib import Path

def is_download_finished(temp_folder):
    firefox_temp_file = sorted(Path(temp_folder).glob('*.part'))
    chrome_temp_file = sorted(Path(temp_folder).glob('*.crdownload'))
    downloaded_files = sorted(Path(temp_folder).glob('*.*'))
    if (len(firefox_temp_file) == 0) and \
       (len(chrome_temp_file) == 0) and \
       (len(downloaded_files) >= 1):
        return True
    else:
        return False 
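
The check above only reports the state at one moment, so it has to be called repeatedly. A minimal sketch of such a polling loop follows. The function name wait_for_download and its timeout and poll_interval parameters are hypothetical, and treating .tmp as an in-progress marker is an assumption based on the file described in the question:

```python
import time
from pathlib import Path

# Suffixes assumed to mark a download in progress:
# .part (Firefox), .crdownload (Chrome), .tmp (assumption from the question)
IN_PROGRESS_SUFFIXES = {".part", ".crdownload", ".tmp"}

def wait_for_download(temp_folder, timeout=60.0, poll_interval=0.5):
    """Poll temp_folder until it holds at least one file and no temp files.

    Returns True if the download looks finished, False if the timeout expired.
    """
    folder = Path(temp_folder)
    deadline = time.monotonic() + timeout
    while True:
        files = [p for p in folder.iterdir() if p.is_file()]
        in_progress = [p for p in files if p.suffix in IN_PROGRESS_SUFFIXES]
        if files and not in_progress:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

You would call it right after the click that starts the download, for example wait_for_download(download_dir, timeout=120), and only close the browser once it returns True. It works best with a download folder that is empty beforehand, since any older file in the folder also counts as a finished download.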

If you know the name of the downloaded file, you can use os.listdir to check that it is inside the folder:

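import os
import time

# file, download_path and enough_time are placeholders you define yourself:
# the expected file name, the download folder, and the polling interval in seconds
while file not in os.listdir(download_path):
    time.sleep(enough_time)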

PS: enough_time should be short enough that you notice the end of the download promptly, but not so short that the loop checks the folder far more often than necessary.
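
The listdir loop above never ends if the file does not arrive under the expected name, which is what happens when only a .tmp file shows up. A minimal sketch with a timeout is shown here. The name wait_for_file and the parameters timeout and poll_interval are hypothetical, with poll_interval playing the role of enough_time:

```python
import os
import time

def wait_for_file(download_path, filename, timeout=120.0, poll_interval=1.0):
    """Block until filename appears in download_path and return its full path.

    Raises TimeoutError if the file has not appeared after timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while filename not in os.listdir(download_path):
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{filename} did not appear in {download_path} within {timeout} s"
            )
        time.sleep(poll_interval)
    return os.path.join(download_path, filename)
```

Catching the TimeoutError lets the script report the failed download instead of hanging.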

Dharman
Renato Alves