I've been encountering a problem with PyQt5's QWebEngineUrlRequestInterceptor on Python3 and, more importantly, the setHttpHeader function. Here's my code:

class WebEngineUrlRequestInterceptor(QWebEngineUrlRequestInterceptor):
    def __init__(self, parent=None):
        super().__init__(parent)

    def interceptRequest(self, info):
        info.setHttpHeader("X-Frame-Options", "ALLOWALL")
        print(info.requestUrl())

Unfortunately, I cannot find the proper way to use this function documented anywhere, so I have resorted to trying every approach I can think of, to no avail.

I have also tried surrounding the arguments of setHttpHeader with QByteArray, which caused QByteArray to give me this complaint...

Traceback (most recent call last):
  File "test.py", line 30, in interceptRequest
    info.setHttpHeader(QByteArray("X-Frame-Options"), QByteArray("ALLOWALL"))
TypeError: arguments did not match any overloaded call:
  QByteArray(): too many arguments
  QByteArray(int, str): argument 1 has unexpected type 'str'
  QByteArray(Union[QByteArray, bytes, bytearray]): argument 1 has unexpected type 'str'

I have also tried encoding the strings with .encode('ascii') and even .encode('utf-8'). While neither raised an error, the header also refused to change, which leads me to believe that the returned values are not compatible with the function.

UPDATE: Even QByteArray(b"X-Frame-Options") does not set the header. The error I get from WebEngine is:

js: Refused to display 'https://www.google.co.uk/?gfe_rd=cr&dcr=0&ei=rX2gWtDJL8aN8Qfv3am4Bw' in a frame because it set 'X-Frame-Options' to 'SAMEORIGIN'.

A note to add, I am 100% sure that interceptRequest is being called. I can see the output of the print call in the terminal.

Full MCVE code at [UPDATED LINK]: https://paste.ee/p/Y0mRs

  • Are you trying to display Google in an IFRAME? Have you made sure that the intercept happens for the Google request and not just the main page request? Also, your hastebin link doesn't seem to work – Tarun Lalwani Apr 22 '18 at 00:16
  • The link of your MCVE has been dropped. – eyllanesc Apr 22 '18 at 03:17
  • @TarunLalwani Yes, I am trying to display an external webpage in an iframe. I have indeed made sure that the intercept happens for the iframe content, as my full code prints the URL of any request it has intercepted. Also, I have updated the Hastebin link in my post. –  Apr 22 '18 at 13:07
  • So I know what the issue is, but I am not sure if a solution exists for `QWebEngineView`. Can you use a `QWebView` instead? – Tarun Lalwani Apr 23 '18 at 08:17
  • Sorry @TarunLalwani, but I cannot use QWebView to complete the project I am working on as it would require my application to be modified significantly. Somehow I need to get setHttpHeader working on QWebEngine. –  Apr 23 '18 at 15:31
  • Fine, let me post an answer with what I have. I am not sure if it is possible or not, but I do have a potential approach for you to explore – Tarun Lalwani Apr 23 '18 at 15:33

1 Answer

So first of all, the question is why is the existing code not working?

class WebEngineUrlRequestInterceptor(QWebEngineUrlRequestInterceptor):
    def __init__(self, parent=None):
        super().__init__(parent)

    def interceptRequest(self, info):
        info.setHttpHeader("X-Frame-Options", "ALLOWALL")
        print(info.requestUrl())

When you install a QWebEngineUrlRequestInterceptor, it is exactly what the name says: a request interceptor. Every request initiated by the QWebEngineView passes through it, and you can do a lot with it:

  • Change the URL altogether
  • Block the request (ad blocking etc.)
  • Add more headers to the request
  • Redirect to a different URL

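So when you call info.setHttpHeader("X-Frame-Options", "ALLOWALL"), the header is added to the request and not to the response. This can be verified by changing the URL to http://postman-echo.com/get, which echoes the request headers back. You will get the response below (captured with the value set to ALLOW, as in the updated interceptor further down):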

{
  "args": {
    
  },
  "headers": {
    "host": "postman-echo.com",
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "accept-encoding": "gzip, deflate",
    "cookie": "sails.sid=s%3AXNukTzCE5ucYNEv_NB8ULCf4esVES3cW.%2BmpA77H2%2F%2B6YcnypvZ7I8RQFvVJrdOFs8GD%2FPymF0Eo",
    "if-none-match": "W/\"1e1-rYSDjZun8qsI1ZojoxMuVg\"",
    "upgrade-insecure-requests": "1",
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) QtWebEngine/5.10.1 Chrome/61.0.3163.140 Safari/537.36",
    "x-frame-options": "ALLOW",
    "x-forwarded-port": "80",
    "x-forwarded-proto": "http"
  },
  "url": "http://postman-echo.com/get"
}

But nothing is changed on the response side. The page still receives whatever headers the server actually returned, including its own X-Frame-Options.
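The same request/response split can be reproduced without Qt. The following is a minimal sketch using only the Python standard library; the local server, the `EchoHandler` class and the `fetch_with_request_header` helper are hypothetical stand-ins for a real site and for the interceptor, not part of PyQt5. It sends X-Frame-Options as a request header and shows that the response header set by the server is unaffected.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a real site: always answers SAMEORIGIN."""

    seen_request_headers = {}

    def do_GET(self):
        # Remember what the client sent so it can be inspected afterwards.
        EchoHandler.seen_request_headers = {
            key.lower(): value for key, value in self.headers.items()
        }
        self.send_response(200)
        self.send_header("X-Frame-Options", "SAMEORIGIN")
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        # Keep the demo output quiet.
        pass


def fetch_with_request_header(name, value):
    """Send one GET with an extra request header; return (sent, received)."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        request = urllib.request.Request(url, headers={name: value})
        # Bypass any system proxy so the request reaches the local server.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request) as response:
            received = response.headers.get(name)
    finally:
        server.shutdown()
        server.server_close()
    return EchoHandler.seen_request_headers.get(name.lower()), received


sent, received = fetch_with_request_header("X-Frame-Options", "ALLOWALL")
print("server saw in request:  ", sent)
print("client got in response: ", received)
```

The server sees the value the client sent, while the client still receives the server's own value, which is the situation the interceptor is in.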

With QWebView it was possible to install a QNetworkAccessManager and return a QNetworkReply with a modified response, as shown in

How to write content to QNetworkReply (was: QWebview with generated content)

But if you read the Porting from Qt WebKit to Qt WebEngine guide, there is an important difference to note:

Qt WebEngine Does Not Interact with QNetworkAccessManager

Some classes of Qt Network such as QAuthenticator were reused for their interface but, unlike Qt WebKit, Qt WebEngine has its own HTTP implementation and cannot go through a QNetworkAccessManager.

The signals and methods of QNetworkAccessManager that are still supported were moved to the QWebEnginePage class.

I dug through a lot of threads asking for a way to modify the response. Unfortunately, all of them are unanswered:

Capture server response with QWebEngineView

QWebEngineView modify web content before render

https://forum.qt.io/topic/81450/capture-client-request-headers-with-qwebengineview

Intercept AJAX POST request and read data using QWebEngine?

So it is not easily possible. There is one workaround that I think would work, but I have not been able to validate it yet.

The approach is to install a handler for a new custom URL scheme:

self.conn_handler = AppSchemeHandler()
self.profile.installUrlSchemeHandler("conapp".encode(), self.conn_handler)
self.webpage = MyQWebEnginePage(self.profile, self.view)

Now we update the interceptor so that it redirects requests for the Google URL to our handler:

class WebEngineUrlRequestInterceptor(QWebEngineUrlRequestInterceptor):
    def __init__(self, parent=None):
        super().__init__(parent)

    def interceptRequest(self, info):
        info.setHttpHeader(b'x-frame-options', b'ALLOW')
        print(info.requestUrl())

        # host() must match exactly; "www.google.com" is a different host.
        if info.requestUrl().host() == "google.com":
            url = info.requestUrl().toString()

            # Hand the original URL to the custom scheme handler.
            info.redirect(QUrl("conapp://webresource?url=" + url))

And then in our scheme handler we recover the original URL:

class AppSchemeHandler(QWebEngineUrlSchemeHandler):
    def __init__(self, parent=None):
        super().__init__(parent)

    def requestStarted(self, request):
        url = request.requestUrl().toString().replace("conapp://webresource?url=", "")
        response = QWebEngineHttpRequest(QUrl(url))

        # Do something here which returns the response back to the url

The part where we read the response and send it back is something I have not yet found an example of anywhere.
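A side note on the redirect URL used above: the original URL is appended to "conapp://webresource?url=" as-is and recovered with a string replace. That works only as long as the string passes through unchanged; anything that parses the query will treat an inner `&` as a separator. Below is a minimal sketch of one way to make the round trip robust by percent-encoding, using only the standard library. The helper names `wrap_url` and `unwrap_url` are hypothetical, and whether QUrl leaves the encoded value untouched is an assumption that would need checking in the real application.

```python
from urllib.parse import parse_qs, quote, urlsplit

SCHEME_PREFIX = "conapp://webresource?url="


def wrap_url(original):
    """Build the custom-scheme URL with the original URL percent-encoded."""
    return SCHEME_PREFIX + quote(original, safe="")


def unwrap_url(wrapped):
    """Recover the original URL from the 'url' query parameter."""
    query = urlsplit(wrapped).query
    return parse_qs(query)["url"][0]


original = "https://www.google.co.uk/?gfe_rd=cr&dcr=0"

# Plain concatenation: a query parser cuts the inner URL at the first "&".
naive = SCHEME_PREFIX + original
print(unwrap_url(naive))

# Percent-encoded: the inner URL survives the round trip intact.
wrapped = wrap_url(original)
print(wrapped)
print(unwrap_url(wrapped))
```

This only addresses how the original URL is carried inside the redirect. It does not change how Chromium resolves relative URLs against the redirected address.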

Tarun Lalwani
  • For the reply part, this URL may help: https://fossies.org/linux/eric6/eric/WebBrowser/Network/QtHelpSchemeHandler.py – Tarun Lalwani Apr 23 '18 at 15:54
  • Thanks for all of your help. I'll just quickly see if I can throw together a solution for the reply and get back to you on it- I can't believe I didn't realize I was working with requests instead of responses! –  Apr 23 '18 at 16:38
  • Unfortunately, the solution has not worked for me. I discovered that there is a problem with using info.redirect for this task- whatever is added to the redirect URL will also be used to calculate the next relatively defined URL (e.g ./assets/my_image.png). This would be much easier if there was some way to use the RequestInterceptor to redirect without causing Chromium to change the way it converts relative URLs to absolute URLs. I guess I'll have to wait until the Qt developers actually add something like the NetworkAccessManager (except for QWebEngine). –  Apr 24 '18 at 18:03
  • @unknownA, I doubt that would happen, because they moved to the Chromium stack and a network access manager would have to sit inside Chromium. But you can ask on their forum and see if someone from the maintainer team could comment on it. As I suspected, this may not be possible with `QWebEngine` – Tarun Lalwani Apr 24 '18 at 18:06
  • @unknownA, I don't think there is a full answer to this. If the answer helped you out, it would be good if you could award the bounty points instead of accepting it – Tarun Lalwani Apr 28 '18 at 19:55