1

I would like to use Selenium in Python to automate a download. I can navigate to the right URL (where the PDF file is located) using XPath, but I cannot download the file because of the OS dialog box. I found some solutions suggesting webdriver.FirefoxProfile().set_preference. However, since I need to click through the website several times with Selenium to reach the right page, I cannot set the URL directly with set_preference at the beginning of the program. Could you help me integrate set_preference into my existing program? Thank you very much!

PS. as you can see the website needs authentication.

Here is my current code:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
import unittest
import os


class LoginTest(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Firefox()
        self.driver.get("myinitialurl")

    def test_Login(self):
        driver = self.driver

        # Element IDs of the login form
        emailFieldID = "userNameInput"
        passFieldID = "passwordInput"
        loginButtonID = "submitButton"

        # XPath locators for the links to click through
        BBButton = "(//a[contains(@href,'blackboard')])"
        coursebutton = "(//a[contains(@href,'Course&id=_4572_1&url')])[1]"
        docbutton = "(//a[contains(@href,'content_id=_29867_1')])"
        conbutton = "(//a[contains(@href,'content_id=_29873_1')])"
        paperbutton = "(//a[contains(@href,'/xid-26243_1')])"

        emailFieldElement = WebDriverWait(driver, 10).until(lambda driver: driver.find_element_by_id(emailFieldID))
        passFieldElement = WebDriverWait(driver, 10).until(lambda driver: driver.find_element_by_id(passFieldID))
        loginButtonElement = WebDriverWait(driver, 10).until(lambda driver: driver.find_element_by_id(loginButtonID))

        # Log in
        emailFieldElement.clear()
        emailFieldElement.send_keys("username")
        passFieldElement.clear()
        passFieldElement.send_keys("password")
        loginButtonElement.click()

        BBElement = WebDriverWait(driver, 50).until(lambda driver: driver.find_element_by_xpath(BBButton))
        BBElement.click()

        # The link opens a second window; wait for it and switch to it
        WebDriverWait(driver, 50).until(lambda driver: len(driver.window_handles) == 2)
        window_after = driver.window_handles[1]
        driver.switch_to.window(window_after)

        courseElement = WebDriverWait(driver, 50).until(lambda driver: driver.find_element_by_xpath(coursebutton))
        courseElement.click()

After that, the last click should open a PDF file on the website along with a dialog box. I would like to download the file.

The set_preference code that I found is as follows:

fp = webdriver.FirefoxProfile()

fp.set_preference("browser.download.folderList",2)
fp.set_preference("browser.download.manager.showWhenStarting",False)
fp.set_preference("browser.download.dir", os.getcwd())
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")

browser = webdriver.Firefox(firefox_profile=fp)
browser.get("url")
browser.find_element_by_partial_link_text("button").click()

So my question is: how do I integrate the second snippet into the first so that the file triggered by the last click is downloaded?

Otherwise, is there an easier solution?

Thank you very much!

SXC88
  • What do you mean by `I cannot set the url directly with set.preference at the beginning of the program`? There is no need to specify any `URLs` in your preferences, only the default download folder and the file's `MIME` type. You can use `driver.get(URL)` wherever you want in your code and as many times as you need – Andersson Nov 10 '16 at 10:43
  • Thank you for your answer. Yes, I just realized it. But what I still don't understand is how to integrate the set.preference code to the first part so as to download the content displayed after the last click (a pdf file) – SXC88 Nov 10 '16 at 10:53
  • You can add `Profile` definition to `setUp` and use `self.driver=webdriver.Firefox(firefox_profile=fp)` instead of `self.driver=webdriver.Firefox()` – Andersson Nov 10 '16 at 10:55
  • Yes, I just tried it and combined the two parts, but an OS dialog box was still displayed and I was asked to choose whether to save the file... – SXC88 Nov 10 '16 at 12:31
  • I put my entire code below – SXC88 Nov 10 '16 at 12:31
  • Check the `MIME` type (`Content-type`) in the response to your `GET` request with `F12` -> `Network` while you're downloading the file manually. Is it actually `application/pdf`? There are a few `MIME` types for `PDF` files – Andersson Nov 10 '16 at 14:14
  • Yes, it shows type="application/pdf" when I inspect the element ;( – SXC88 Nov 10 '16 at 14:47

3 Answers

1
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
import unittest
import os


class LoginTest(unittest.TestCase):
    def setUp(self):
        # Configure the download preferences before the browser starts
        fp = webdriver.FirefoxProfile()

        fp.set_preference("browser.download.folderList", 2)
        fp.set_preference("browser.download.manager.showWhenStarting", False)
        fp.set_preference("browser.download.dir", "D://doc")
        fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")

        self.driver = webdriver.Firefox(firefox_profile=fp)
        self.driver.get("myurl")

    def test_Login(self):
        driver = self.driver

        # Element IDs of the login form
        emailFieldID = "userNameInput"
        passFieldID = "passwordInput"
        loginButtonID = "submitButton"

        # XPath locators for the links to click through
        BBButton = "(//a[contains(@href,'blackboard')])"
        coursebutton = "(//a[contains(@href,'Course&id=_4572_1&url')])[1]"
        docbutton = "(//a[contains(@href,'content_id=_29867_1')])"
        conbutton = "(//a[contains(@href,'content_id=_29873_1')])"
        paperbutton = "(//a[contains(@href,'/xid-26243_1')])"

        emailFieldElement = WebDriverWait(driver, 10).until(lambda driver: driver.find_element_by_id(emailFieldID))
        passFieldElement = WebDriverWait(driver, 10).until(lambda driver: driver.find_element_by_id(passFieldID))
        loginButtonElement = WebDriverWait(driver, 10).until(lambda driver: driver.find_element_by_id(loginButtonID))

        # Log in
        emailFieldElement.clear()
        emailFieldElement.send_keys("username")
        passFieldElement.clear()
        passFieldElement.send_keys("password")
        loginButtonElement.click()

        BBElement = WebDriverWait(driver, 50).until(lambda driver: driver.find_element_by_xpath(BBButton))
        BBElement.click()

        # The link opens a second window; wait for it and switch to it
        WebDriverWait(driver, 50).until(lambda driver: len(driver.window_handles) == 2)
        window_after = driver.window_handles[1]
        driver.switch_to.window(window_after)

        courseElement = WebDriverWait(driver, 50).until(lambda driver: driver.find_element_by_xpath(coursebutton))
        courseElement.click()
SXC88
1

Try adding two more preferences, which you might need in order to download a PDF file instead of having Firefox display it:

fp.set_preference("pdfjs.disabled", True)
fp.set_preference("plugin.disable_full_page_plugin_for_types", "application/pdf")
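
For reference, here is a minimal sketch that collects these two preferences together with the four from the question in one place. The helper names `pdf_download_preferences` and `apply_preferences` are made up for illustration and are not part of Selenium; the only thing assumed about the target object is that it has a `set_preference(name, value)` method, as `webdriver.FirefoxProfile()` does.

```python
import os


def pdf_download_preferences(download_dir):
    """Return the Firefox preferences from this thread as a plain dict.

    Hypothetical helper: it only gathers the preference names already
    mentioned in the question and in this answer.
    """
    return {
        # 2 = use the custom folder given in browser.download.dir
        "browser.download.folderList": 2,
        "browser.download.manager.showWhenStarting": False,
        "browser.download.dir": os.path.abspath(download_dir),
        # MIME types that are saved to disk without asking
        "browser.helperApps.neverAsk.saveToDisk": "application/pdf",
        # Turn off the built-in PDF viewer so the file is saved, not rendered
        "pdfjs.disabled": True,
        "plugin.disable_full_page_plugin_for_types": "application/pdf",
    }


def apply_preferences(target, prefs):
    """Call target.set_preference(name, value) for every entry in prefs."""
    for name, value in prefs.items():
        target.set_preference(name, value)
    return target
```

In `setUp` this could be used as `fp = apply_preferences(webdriver.FirefoxProfile(), pdf_download_preferences("D://doc"))` followed by `self.driver = webdriver.Firefox(firefox_profile=fp)`. Newer Selenium releases configure Firefox through an options object that also offers `set_preference`, so the same helper should work there too, though I have not checked this against every version.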
Andersson
  • That's awesome! It worked! Thanks a lot! But I still have another question about disabling the firefox OS popups, but I will ask it in a new post ;) – SXC88 Nov 10 '16 at 15:26
  • Here is the post if you want to have a look :)) – SXC88 Nov 10 '16 at 15:39
0
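import requests

# `driver` is the Selenium WebDriver that is already logged in
audio_src = driver.find_element_by_tag_name('audio').get_property('src')
# Reuse the browser's session cookies so the request is authenticated
response = requests.get(audio_src, cookies={i['name']: i['value'] for i in driver.get_cookies()})
with open('f.mp3', 'wb') as f:
    f.write(response.content)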
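
The same idea, bypassing the browser's download dialog by fetching the file with the browser's session cookies, could be wrapped in a small reusable function. This is only a sketch: `cookies_for_requests` and `download_with_browser_session` are hypothetical names, and it assumes the site authenticates purely through cookies.

```python
def cookies_for_requests(selenium_cookies):
    """Convert the list returned by driver.get_cookies() to {name: value}."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


def download_with_browser_session(driver, url, dest_path):
    """Fetch url with the browser's cookies and stream it to dest_path."""
    import requests  # third-party; imported here so the helper above has no dependency

    response = requests.get(
        url,
        cookies=cookies_for_requests(driver.get_cookies()),
        stream=True,
    )
    # Fail loudly on 401/403/404 instead of saving an error page
    response.raise_for_status()
    with open(dest_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    return dest_path
```

For the PDF in the question, the URL would presumably come from the link's `href`, for example `driver.find_element_by_xpath(paperbutton).get_attribute("href")`, but that depends on how the page serves the file.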
admin