I have managed to use suggested code in order to render HTML from a webpage and then parse, find and use the text as wanted. I'm using PyQt4. However, the webpage I am interested in is updated frequently and I want to rerender the page and check the updated HTML for new info.

I therefore have a loop in my Python script that starts the whole process over again. However, this makes the program crash. I have searched the net and found out that this is to be expected, but I have not found any suggestion on how to do it correctly. It must be simple, I guess?

import sys

from PyQt4.QtGui import *
from PyQt4.QtCore import *
from PyQt4.QtWebKit import *


class Render(QWebPage):

    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        self.app.exec_()

    def _loadFinished(self, result):
        self.frame = self.mainFrame()
        self.app.quit()

r = Render(url)
html = r.frame.toHtml()

So, when I call r = Render(url) a second time, it crashes. So I am looking for something like r = Rerender(url).

As you might guess, I am not much of a programmer, and I usually get by by stealing code I barely understand. But this is the first time I can't find an answer, so I thought I should ask a question myself.

I hope my question is clear enough and that someone has the answer.

Dan
Pal
  • Possible duplicate of [PyQt Class not working for the second usage](http://stackoverflow.com/questions/21909907/pyqt-class-not-working-for-the-second-usage) – ekhumoro Nov 13 '15 at 18:21
  • Thanks for the suggestion. This is one of the posts I have read, but I did not think it would help me. I will take a closer look, and if it does not solve my problem I will explain why – Pal Nov 13 '15 at 20:13
  • The suggested solution deals with rendering several different webpages that are specified in a list. This would probably work if I made a long list specifying the same url many times, but that does not seem to be an efficient solution. I am not a skilled enough programmer to see how I could modify the code to my needs. My hope was that someone else had already done this, as I thought it was not an unusual functionality. I will keep looking around and will post here if I find an answer – Pal Nov 13 '15 at 20:33
  • You obviously didn't read the answer properly, because it very clearly does exactly what you want, with only minor modifications required. Anyway, I have posted another simplified demo which hopefully makes things even clearer. – ekhumoro Nov 13 '15 at 21:59

1 Answer

Simple demo (adapt to taste):

import sys, signal
from PyQt4 import QtCore, QtGui, QtWebKit

class WebPage(QtWebKit.QWebPage):
    def __init__(self, url):
        super(WebPage, self).__init__()
        self.url = url
        self.mainFrame().loadFinished.connect(self.handleLoadFinished)
        self.refresh()

    def refresh(self):
        self.mainFrame().load(QtCore.QUrl(self.url))

    def handleLoadFinished(self):
        print('Loaded:', self.mainFrame().url().toString())
        # do stuff with html ...
        print('Reloading in 3 seconds...\n')
        # singleShot takes milliseconds, so 3000 matches the message above
        QtCore.QTimer.singleShot(3000, self.refresh)

if __name__ == '__main__':

    signal.signal(signal.SIGINT, signal.SIG_DFL)
    app = QtGui.QApplication(sys.argv)
    webpage = WebPage('http://en.wikipedia.org/')
    print('Press Ctrl+C to quit\n')
    sys.exit(app.exec_())
ekhumoro
  • Thanks again. Your example gives me a webpage object that is refreshed as desired. My problem now is that I don't know how to "do stuff with html", as my knowledge of object handling is non-existent. How do I actually access the html in the object? I know this is a very basic thing and that I should be able to find out by learning the basics. This is also why I am a little uncomfortable asking questions here and putting people to work without being fully capable of using the answers. Thanks again; I will look more closely at both examples as well as the basic documentation. – Pal Nov 14 '15 at 10:26
  • Ah, I found the answer in the first example, so I just added html = self.mainFrame().toHtml() below your "do stuff with html" and it seems to work. – Pal Nov 14 '15 at 10:39
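
Once the html is available inside handleLoadFinished, checking each refresh for new info could look roughly like the sketch below. It is plain Python and not part of the answer above: the ChangeDetector class and its has_changed method are hypothetical names, and it assumes the html has already been converted to an ordinary Python (unicode) string before being passed in.

```python
import hashlib


class ChangeDetector(object):
    """Hypothetical helper: remembers a digest of the last HTML seen."""

    def __init__(self):
        self._last_digest = None

    def has_changed(self, html):
        # Hash the text so the previous page itself need not be stored.
        digest = hashlib.sha256(html.encode('utf-8')).hexdigest()
        changed = digest != self._last_digest
        self._last_digest = digest
        return changed


detector = ChangeDetector()
for page in (u'<p>first</p>', u'<p>first</p>', u'<p>second</p>'):
    # The first page always counts as changed, since nothing was seen before.
    print(detector.has_changed(page))
```

In the demo above, one such detector could be created in __init__ and called from handleLoadFinished with the result of self.mainFrame().toHtml(), so the parsing step only runs when the page content actually differs from the previous load.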