I am attempting to scrape web pages using the PyQt5 QWebEngineView. Here is the code that I got from another answer on Stack Overflow:

from PyQt5.QtWidgets import QApplication
from PyQt5.QtCore import QUrl, QEventLoop
from PyQt5.QtWebEngineWidgets import QWebEngineView
import sys

def render(url):
    class Render(QWebEngineView):
        def __init__(self, t_url):
            self.html = None
            self.app = QApplication(sys.argv)
            QWebEngineView.__init__(self)
            self.loadFinished.connect(self._loadfinished)
            self.load(QUrl(t_url))
            while self.html is None:
                self.app.processEvents(QEventLoop.ExcludeUserInputEvents | QEventLoop.ExcludeSocketNotifiers | QEventLoop.WaitForMoreEvents)
            self.app.quit()

        def _callable(self, data):
            self.html = data

        def _loadfinished(self, result):
            self.page().toHtml(self._callable)

    return Render(url).html

Then if I put the line:

print(render('http://quotes.toscrape.com/random'))

it works as expected. But if I add a second call, so that it reads:

print(render('http://quotes.toscrape.com/random'))
print(render('http://quotes.toscrape.com/tableful/'))

it gives me the error "Process finished with exit code -1073741819 (0xC0000005)" after printing out the first render correctly.

I have narrowed the error down to the line self.load(QUrl(t_url)).

eyllanesc
boymeetscode

1 Answer

You're initializing QApplication more than once. Only one instance should exist, globally. If you need the current instance and do not have a handle to it, you can use QApplication.instance(). QApplication.quit() is meant to be called right before sys.exit; in fact, you should almost never use one without the other.

In short, you're telling Qt you're exiting the application, and then trying to run more Qt code. It's an easy fix, however...

Solution

You can do one of three things:

Store the app in a global variable and reference it from there:

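import sys
from PyQt5.QtCore import QEventLoop
from PyQt5.QtWidgets import QApplication, QWidget

# Create the single, global QApplication once, at startup.
APP = QApplication(sys.argv)
# ... many lines elided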

class SomeClass(QWidget):
    def some_method(self):
        APP.processEvents(QEventLoop.ExcludeUserInputEvents | QEventLoop.ExcludeSocketNotifiers | QEventLoop.WaitForMoreEvents)

Pass the app as a handle to whatever function or class needs it:

def render(app, url):
    ...

Create a global instance, and retrieve it with QApplication.instance() wherever it is needed:

APP = QApplication(sys.argv)
# ... many lines elided

class SomeClass(QWidget):
    def some_method(self):
        app = QApplication.instance()
        app.processEvents(QEventLoop.ExcludeUserInputEvents | QEventLoop.ExcludeSocketNotifiers | QEventLoop.WaitForMoreEvents)

Do what's most convenient for you.
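
For reference, here is a minimal sketch of how the render() function from the question might look with the third option applied. It is an illustration rather than a drop-in replacement: the names _APP, store_html and on_load_finished are my own, the class from the question is replaced by two small callbacks, and the Qt imports are placed inside the function purely as a choice for this sketch. The key points are that the QApplication is created at most once, is kept alive by a module-level reference, and is never quit between renders.

```python
import sys

# Module-level reference that keeps the one QApplication alive between calls.
_APP = None


def render(url):
    """Load `url` in a QWebEngineView and return the rendered HTML."""
    # Imported inside the function so Qt is only loaded when a page is rendered.
    from PyQt5.QtCore import QEventLoop, QUrl
    from PyQt5.QtWebEngineWidgets import QWebEngineView
    from PyQt5.QtWidgets import QApplication

    global _APP
    app = QApplication.instance()
    if app is None:
        # First call only: create the application. It is not quit afterwards.
        app = _APP = QApplication(sys.argv)

    result = {"html": None}
    view = QWebEngineView()

    def store_html(html):
        # Receives the page markup once toHtml() has produced it.
        result["html"] = html

    def on_load_finished(ok):
        # toHtml() is asynchronous; it passes the markup to store_html later.
        view.page().toHtml(store_html)

    view.loadFinished.connect(on_load_finished)
    view.load(QUrl(url))

    # Same event-processing flags as in the question.
    flags = (QEventLoop.ExcludeUserInputEvents
             | QEventLoop.ExcludeSocketNotifiers
             | QEventLoop.WaitForMoreEvents)
    while result["html"] is None:
        app.processEvents(flags)
    return result["html"]
```

With this version, the two consecutive print(render(...)) calls from the question share the same application object instead of each creating, and then quitting, their own.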

Alex Huszagh
  • Thanks! This worked! I ended up going with the 3rd option as it seemed to be the least amount of editing of my original code. – boymeetscode Oct 22 '18 at 12:28