I am new to Python and am trying to understand why I am getting the following error:

Traceback (most recent call last):
  File "WebScraper.py", line 10, in <module>
    class Render(QWebPage):
NameError: name 'QWebPage' is not defined

Here is the code:

import sys  
from PyQt5.QtGui import *  
from PyQt5.QtCore import *  
from PyQt5.QtWebEngineWidgets import *  
from lxml import html 

#Take this class for granted.Just use result of rendering.
class Render(QWebPage):  
  def __init__(self, url):  
    self.app = QApplication(sys.argv)  
    QWebPage.__init__(self)  
    self.loadFinished.connect(self._loadFinished)  
    self.mainFrame().load(QUrl(url))  
    self.app.exec_()  

  def _loadFinished(self, result):  
    self.frame = self.mainFrame()  
    self.app.quit()  

url = 'http://pycoders.com/archive/'  
r = Render(url)  
result = r.frame.toHtml()
#This step is important.Converting QString to Ascii for lxml to process
archive_links = html.fromstring(str(result.toAscii()))
print(archive_links)

I understand that __init__ acts as a constructor, but why is it not setting it to self? Do I need to change it to something like QWebPage.x = self?

user081608

2 Answers

You're not importing QWebPage. The wildcard import from PyQt5.QtWebEngineWidgets does not provide that name.

Try adding this import to the top of your script:

from PyQt5.QtWebKitWidgets import QWebPage
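
As for the `__init__` part of the question: the error is raised before `__init__` ever runs. Python looks up the base class name when it executes the `class` statement, so a missing import fails at that point. A minimal sketch, using a hypothetical undefined name `MissingBase` in place of `QWebPage` (no PyQt needed):

```python
def try_define_class():
    """Return the NameError message raised while defining the class, or None."""
    try:
        # MissingBase is a hypothetical, never-imported name standing in for QWebPage.
        class Render(MissingBase):
            def __init__(self):
                # Never reached: the class statement itself fails first.
                self.ready = True
    except NameError as exc:
        return str(exc)
    return None


message = try_define_class()
print(message)
```

The message names the base class, not anything inside `__init__`, so assigning to `self` differently would not help; the name has to be imported first.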
eyllanesc
Jack
  • I've tried that but get this `ModuleNotFoundError: No module named 'PyQt5.QtWebKitWidgets'` – user081608 Jan 20 '17 at 01:54
  • @user081608 Interesting. I see a lot of references to [this on github](https://github.com/search?utf8=%E2%9C%93&q=from+PyQt5.QtWebKitWidgets+import+QWebPage&type=Code&ref=searchresults) – Jack Jan 20 '17 at 01:56
  • @user081608 Ahhh, you probably don't have it installed. See [this answer](http://stackoverflow.com/a/35096234/24998) – Jack Jan 20 '17 at 01:59
  • @user081608 What version of PyQt5 are you using? Please note that starting with PyQt 5.9, QtWebKitWidgets (and also QtWebKit) is no longer available (deprecated), which results in the error you are getting. – John Doe Jul 30 '17 at 00:00

What version of PyQt5 are you using? Please note that starting with PyQt 5.9, QtWebKitWidgets (and also QtWebKit) is no longer available (deprecated), which results in the error you are getting.
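
One way to check which of the two web modules an installation actually provides is to probe for them before importing. The sketch below is only an illustration: `module_available` and `pyqt5_version` are hypothetical helper names, and it assumes nothing beyond the standard library plus whatever PyQt5 happens to be installed:

```python
import importlib.util


def module_available(name):
    """Return True if module `name` can be found (parents are imported, the module itself is not)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages first; a missing parent
        # (e.g. PyQt5 itself not installed) raises ModuleNotFoundError.
        return False


def pyqt5_version():
    """Return the installed PyQt5 version string, or None if PyQt5 is unavailable."""
    try:
        from PyQt5.QtCore import PYQT_VERSION_STR
    except ImportError:
        return None
    return PYQT_VERSION_STR


print("PyQt5 version:", pyqt5_version())
for name in ("PyQt5.QtWebKitWidgets", "PyQt5.QtWebEngineWidgets"):
    print(name, "available:", module_available(name))
```

If only `PyQt5.QtWebEngineWidgets` is reported as available, the `QWebPage`-based code cannot work as written and needs the QtWebEngine classes instead.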

Let me juxtapose two rendering classes, one using the older PyQt4 and one using the latest PyQt5:

Using PyQt4 (note the use of QWebPage and, of course, the imports):

import sys

from PyQt4.QtCore import *
from PyQt4.QtGui import *
from PyQt4.QtWebKit import *

class Render(QWebPage):
    """Render HTML with PyQt4 WebKit."""
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        # Block here until _loadFinished() quits the event loop.
        self.app.exec_()

    def _loadFinished(self, result):
        # Keep the loaded frame so callers can read r.frame.toHtml().
        self.frame = self.mainFrame()
        self.app.quit()

Using PyQt5 (note the use of QWebEngineView instead of QWebPage and, of course, the imports):

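import sys

from PyQt5.QtCore import QEventLoop
from PyQt5.QtWebEngineWidgets import QWebEngineView
from PyQt5.QtWidgets import QApplication


class Render(QWebEngineView):
    """Render HTML with PyQt5 WebEngine."""

    def __init__(self, html):
        self.html = None
        self.app = QApplication(sys.argv)
        QWebEngineView.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.setHtml(html)
        # Pump the event loop until the rendered HTML has been delivered.
        while self.html is None:
            self.app.processEvents(
                QEventLoop.ExcludeUserInputEvents |
                QEventLoop.ExcludeSocketNotifiers |
                QEventLoop.WaitForMoreEvents)
        self.app.quit()

    # The two callbacks below are not in the snippet as posted; they are a
    # minimal completion (an assumption) so that loadFinished has a handler
    # and self.html eventually gets set.
    def _callable(self, data):
        self.html = data

    def _loadFinished(self, result):
        # toHtml() is asynchronous in QtWebEngine and reports via a callback.
        self.page().toHtml(self._callable)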
John Doe
  • The PyQt5 Render function takes html instead of a url as input though. How do we write the function to have a URL as input instead of html? – stogers Oct 08 '17 at 23:01