I am working on a project where I need to extract many files from a database that has no API. The only route is through a web page, by constructing URLs like this one:

https://bmsnet.cas.dtu.dk/Trendlogs/ExportCSV_TrendlogRecordData/1

The integer at the end of the URL (1 in the example above) ranges from 1 to 35000. Opening such a URL brings up a pop-up window for saving the file:

[Image: pop-up window for file download]

My question is how to automate this process with Python. I can generate the URLs and handle the data from the downloaded files (so far only after downloading them manually). Where I am stuck is writing the Python code that clicks the "Save As" button. Eventually I want code that does the following:

  • Construct the URL
  • Save the file arising from the pop-up window
  • Load/read and process the data
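
The first and last of these steps can be sketched roughly as follows. This is only an outline under stated assumptions: `build_export_url` and `load_export` are hypothetical helper names, the save step is left out because it depends on how the browser dialog is handled, and a comma-separated layout with a header row is a guess about the exported files.

```python
import csv

BASE_URL = "https://bmsnet.cas.dtu.dk/Trendlogs/ExportCSV_TrendlogRecordData/{0}"


def build_export_url(file_integer):
    """Step 1: construct the export URL for one integer ID."""
    return BASE_URL.format(file_integer)


def load_export(path, delimiter=","):
    """Step 3: read a saved export into a list of dicts, one per row.

    Assumes the file starts with a header row; the delimiter is a guess.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


# The IDs run from 1 to 35000 inclusive, hence the +1 on the upper bound.
urls = [build_export_url(i) for i in range(1, 35000 + 1)]
```

If the real files use a different separator (for example a semicolon), only the `delimiter` argument should need to change.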

EDIT :

I have now found a solution using Selenium.

import time

import pandas as pd  # for processing the downloaded files afterwards
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile


dl_path = "MY_LOCAL_DOWNLOAD_PATH"
profile = FirefoxProfile()
# 2 = save downloads to the custom folder set in browser.download.dir
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", dl_path)
# MIME types to save straight to disk without showing the download dialog
profile.set_preference("browser.helperApps.neverAsk.saveToDisk",
                       "text/plain,text/x-csv,text/csv,application/vnd.ms-excel,application/csv,application/x-csv,text/comma-separated-values,text/x-comma-separated-values,text/tab-separated-values,application/pdf")

driver = webdriver.Firefox(firefox_profile=profile)

URL = "https://bmsnet.cas.dtu.dk"

driver.get(URL)
# Let the page load
time.sleep(5)

username = driver.find_element_by_id("Email")
password = driver.find_element_by_id("Password")

username.send_keys("my_username")
password.send_keys("my_password")


elem = driver.find_element_by_xpath("/html/body/div[2]/div/div[1]/section/form/div[4]/div/input")
elem.click()

time.sleep(5)

start = 1
stop = 10

# range() excludes stop, so this requests files 1 through 9
for file_integer in range(start, stop):
    URL = "https://bmsnet.cas.dtu.dk/Trendlogs/ExportCSV_TrendlogRecordData/{0}".format(file_integer)
    driver.get(URL)
    time.sleep(5)
    print('Done downloading integer: {0}'.format(file_integer))

The above code works but only once. For some reason the for loop gets stuck after the first iteration. Any clue on what I am doing wrong there?
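
The cause of the stall is not established here. One thing that can be checked cheaply is whether the fixed `time.sleep(5)` is hiding what happens to each download. The sketch below polls the download folder until a new, finished file shows up. `wait_for_new_file` is a hypothetical helper, not part of Selenium, and it relies on Firefox keeping a `.part` file next to the target while a download is still in progress.

```python
import os
import time


def wait_for_new_file(directory, known_files, timeout=60.0, poll=0.5):
    """Wait until a finished file that is not in known_files appears.

    Returns the new file name, or None if the timeout expires first.
    """
    known = set(known_files)
    deadline = time.monotonic() + timeout
    while True:
        current = set(os.listdir(directory))
        # A ".part" file means Firefox is still writing a download
        in_progress = any(name.endswith(".part") for name in current)
        new_files = sorted(name for name in current - known
                           if not name.endswith(".part"))
        if new_files and not in_progress:
            return new_files[0]
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


# Possible use inside the loop, replacing the fixed sleep:
#     before = set(os.listdir(dl_path))
#     driver.get(URL)
#     saved = wait_for_new_file(dl_path, before)
#     if saved is None:
#         print("No file arrived for integer", file_integer)
```

A `None` result for the second integer would suggest the request itself never produced a file, which narrows the problem down to the `driver.get` call rather than the timing.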

Thank you for your time and help. Looking forward to hearing your ideas on that.

Thibaut
  • Does this answer your question? https://stackoverflow.com/questions/24346872/python-equivalent-of-a-given-wget-command You can use urllib to download a file from a URL. – Rahul Nov 02 '20 at 08:55
  • This does not seem to work. The URL does not point to the file. Instead the URL prompts a pop-up window from which I have to select "Save As" and then click OK. I have edited my original post with my latest progress using Selenium. But now I am stuck at the looping phase. I can do what I want but just once ... – Thibaut Nov 02 '20 at 11:17

0 Answers