I want to download a file automatically when I am given a link. Let's say the link is: http://test.com/somefile.avi

import os
import sys
from PyQt5.QtWidgets import QApplication, QVBoxLayout, QWidget, QWidgetAction
from PyQt5.QtCore import QUrl, QEventLoop
from PyQt5.QtWebEngineWidgets import QWebEngineView, QWebEngineProfile, QWebEngineDownloadItem, QWebEnginePage


class WebPage(QWebEngineView):
    def __init__(self):
        QWebEngineView.__init__(self)
        self.load(QUrl("http://test.com"))
        self.loadFinished.connect(self._on_load_finished)
        self.n = 0

    def _on_load_finished(self):
        print("Finished Loading")
        self.page().toHtml(self.Callable)

    def Callable(self, html_str):
        self.html = html_str
        self.load(QUrl(userInput))

if __name__ == "__main__":
    userInput = input()
    app = QApplication(sys.argv)
    web = WebPage()
    web.show()
    # Start the Qt event loop; without it no page is ever loaded
    sys.exit(app.exec_())

However, I only get the page 'test.com' and not the file 'somefile.avi'. Is it possible to make it download automatically after I enter 'http://test.com/somefile.avi' in the console?

Thanks

pilip h
  • I would say yes. However, what have you tried so far? – Pax Vobiscum Apr 30 '18 at 08:01
  • I tried reading the Qt docs at https://doc.qt.io/qt-5/qtwebenginewidgets-module.html. They mention something about downloads, but I don't know how to make it work. I'm a Python beginner, actually. – pilip h Apr 30 '18 at 08:05
  • You could use the `requests` library. Not Qt, but it works. – Pax Vobiscum Apr 30 '18 at 08:09
  • Thanks, could you please write some example code with requests? Note that my office uses an automatic proxy setting, so my workaround is to use Qt WebEngine. – pilip h Apr 30 '18 at 08:11
  • Hey, so I added some example code I had lying around from another answer I gave last week, and modified it some to account for your proxy set up. Also note the disclaimer about `PyQt` and `requests` – Pax Vobiscum Apr 30 '18 at 08:21

1 Answer

Below is a code snippet showing how to do this with the requests library.

DISCLAIMER

This example uses requests, a third-party Python library, and not PyQt as the asker originally intended.

import requests
import shutil

def download(url):

    # Gets the filename from the URL and builds the download path,
    # relative to the current working directory (the `downloads/`
    # directory must already exist)
    filename = url.split("/")[-1]
    path = "downloads/" + filename

    # Defines relevant proxies, see `requests` docs
    proxies = {
      'http': 'http://10.10.1.10:3128',
      'https': 'http://10.10.1.10:1080',
    }

    # Add proxies, and leave `stream=True` for file downloads
    r = requests.get(url, stream=True, proxies=proxies)
    if r.status_code == 200:
        with open(path, 'wb') as f:
            r.raw.decode_content = True
            shutil.copyfileobj(r.raw, f)
    else:
        # Raises requests.HTTPError for 4xx/5xx status codes
        r.raise_for_status()


download('http://test.com/somefile.avi')
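
Two details of the snippet above can trip it up: `url.split("/")[-1]` keeps any query string in the filename, and `open()` fails if the `downloads/` directory is missing. A minimal sketch of a more defensive path helper, using only the standard library, could look like this. The name `local_path_for` and the fallback filename `download.bin` are made up for illustration.

```python
import os
from urllib.parse import urlsplit, unquote


def local_path_for(url, directory="downloads"):
    """Return a local file path for `url` inside `directory`."""
    # Use only the path component, so "?token=..." or "#fragment" is ignored
    name = os.path.basename(unquote(urlsplit(url).path))
    if not name:
        # The URL ends with "/" or has no path; use a placeholder name
        name = "download.bin"
    # Create the target directory if it does not exist yet
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, name)
```

Inside `download`, the two lines that build `path` could then be replaced by `path = local_path_for(url)`.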

Edit:

PAC files do not work out of the box with any of the common Python web request libraries. However, SO user @CarsonLam provided an answer here that attempts to solve this issue.

The library pypac provides support for this, and since its session class inherits from requests.Session, it should magically work with our existing code. Some additional PAC examples can be found here.

With a PAC proxy file, I would guess something like this would be the way to go:

from pypac import PACSession, get_pac
import shutil

def download(url):

    # Gets the filename from the URL and builds the download path,
    # relative to the current working directory
    filename = url.split("/")[-1]
    path = "downloads/" + filename

    # Looks for a PAC file at the specified URL and creates a session;
    # this session inherits from requests.Session
    pac = get_pac(url='http://foo.corp.local/proxy.pac')
    session = PACSession(pac)

    # The session picks the proxy from the PAC file, so no `proxies`
    # argument is passed; leave `stream=True` for file downloads
    r = session.get(url, stream=True)
    if r.status_code == 200:
        with open(path, 'wb') as f:
            r.raw.decode_content = True
            shutil.copyfileobj(r.raw, f)
    else:
        # Raises requests.HTTPError for 4xx/5xx status codes
        r.raise_for_status()
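
If the proxy answers with 407 Proxy Authentication Required, as reported in the comments below, it expects credentials. For the fixed-proxy variant, requests accepts them embedded in the proxy URL using the `http://user:password@host:port` syntax. The sketch below only builds such a `proxies` dictionary. The helper name `proxies_with_auth` and the example credentials are hypothetical, and it assumes the proxy accepts basic authentication, which may not hold for NTLM or Kerberos setups.

```python
from urllib.parse import quote


def proxies_with_auth(host, port, user, password):
    """Build a `proxies` dict for requests with credentials in the URL."""
    # Percent-encode the credentials so characters such as "@" or ":"
    # do not break the structure of the proxy URL
    credentials = "{}:{}".format(quote(user, safe=""), quote(password, safe=""))
    proxy_url = "http://{}@{}:{}".format(credentials, host, port)
    # Use the same proxy for plain and TLS traffic (an assumption)
    return {"http": proxy_url, "https": proxy_url}
```

The result would be passed as `requests.get(url, stream=True, proxies=proxies_with_auth("10.10.1.10", 3128, "alice", "secret"))`. For the PAC variant, the pypac user guide describes its own proxy authentication option for `PACSession`.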
Pax Vobiscum
  • Thanks a lot. However, how can I set the automatic proxy file (my company uses a .pac file)? I got this error: ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))) – pilip h Apr 30 '18 at 08:37
  • Wow haha, now you really put me in the spot ;) hang on! – Pax Vobiscum Apr 30 '18 at 08:44
  • Okay, so I updated my answer, feel free to get back with how it went! Also big props to @CarsonLam for providing this. – Pax Vobiscum Apr 30 '18 at 08:49
  • So in this line: session = requests.get(url, stream=True, proxies=proxies), I tried setting proxies to pac and got an error. Could you please help me with this? – pilip h Apr 30 '18 at 09:11
  • Yeah sure, it shouldn't use `proxies=proxies`. I updated my answer, have a look. – Pax Vobiscum Apr 30 '18 at 09:13
  • 1
    it works on local net, cool, however not on world site : i got error requests.exceptions.HTTPError: 407 Client Error: Proxy Authentication Required for url: http://google.com/, thanks alot. it would be great if it can be bypassed. note that i already set auth following this: http://pypac.readthedocs.io/en/latest/user_guide.html#proxy-authentication – pilip h Apr 30 '18 at 09:30
  • Look at the documentation of `pypac`; there are some options for authenticating with a PAC proxy. You are welcome! – Pax Vobiscum Apr 30 '18 at 09:31
  • Ah, yes, I found it, but it does not work. Thanks anyway :) – pilip h Apr 30 '18 at 09:32