
I have this demo program that is not working as I would wish:

import sys
from PyQt5.QtCore import *
from PyQt5.QtGui import *
from PyQt5.QtWidgets import *
from PyQt5.QtPrintSupport import *
from PyQt5.QtWebEngineWidgets import *

class Runnable(QRunnable):
    def __init__(self, window, mode):
        super(Runnable, self).__init__()
        self.window = window
        self.mode = mode
        self.html = '<!DOCTYPE HTML><html><body><p>test</p></body></html>'
        self.report_filename = 'report.pdf'

    def run(self):
        if self.mode == 'sync':
            # this works ok
            printer = QPrinter()
            printer.setOutputFormat(QPrinter.PdfFormat)
            printer.setPaperSize(QPrinter.A4)
            printer.setOutputFileName(self.report_filename)
            doc = QTextDocument()
            doc.setHtml(self.html)
            #doc.setPageSize(printer.pageRect().size())
            doc.print(printer)
            print('pdf file created')
        elif self.mode == 'async':
            # this doesn't work
            self.page = QWebEnginePage()
            self.page.loadFinished.connect(self.on_load_finished)
            loadFinished_works_for_setHtml = False
            if loadFinished_works_for_setHtml:
                # async func, but no loadFinished signal in docs, too bad
                self.page.setHtml(self.html)
            else:
                # silly, all because no loadFinished signal for setHtml()
                with open('report.html', 'w', encoding='utf-8') as f:
                    f.write(self.html)
                url = QUrl('file:report.html')
                print('url.isLocalFile():', url.isLocalFile())
                self.page.load(url)

    def on_load_finished(self, ok):
        # 1. problem: This method never executes, why?
        #             I tried with QWebEngineView() too, but without luck.
        # 2. problem: If this method somehow executes, it will run in main thread,
        #             but self.page.printToPdf() can be slow, so I want this to also
        #             run in runnable thread or some other thread, but not in main thread
        print('load finished, ok: ', ok)
        #self.page.pdfPrintingFinished.connect(self.on_pdf_printing_finished)
        #page_layout = QPageLayout(QPageSize(QPageSize.A4), QPageLayout.Portrait, QMarginsF(20, 20, 20, 20), QPageLayout.Millimeter, minMargins = QMarginsF(0, 0, 0, 0))
        #self.page.printToPdf(self.report_filename, page_layout)

    #def on_pdf_printing_finished(self, file_path, ok):
    #    print('printToPdf finished', file_path, ok)
    #    # send signal to main thread or open pdf file with subprocess.Popen() or sth.
    #    # 1a. problem: I want this (for example opening pdf file) to also run in 
    #    #              runnable thread or some other thread, but not in main thread

class Window(QWidget):
    def __init__(self):
        super(Window, self).__init__()
        self.print_sync_button = QPushButton('Make PDF (sync)', self)
        self.print_async_button = QPushButton('Make PDF (async)', self)
        self.print_sync_button.clicked.connect(lambda : self.handle_print('sync'))
        self.print_async_button.clicked.connect(lambda : self.handle_print('async'))
        layout = QHBoxLayout(self)
        layout.addWidget(self.print_sync_button)
        layout.addWidget(self.print_async_button)

    def handle_print(self, mode='sync'):
        worker = Runnable(self, mode)
        QThreadPool.globalInstance().start(worker)

if __name__ == '__main__':
    app = QApplication(sys.argv)
    window = Window()
    window.show()
    sys.exit(app.exec())

I described my problems in comments (1., 1a., 2.) in code.

My first problem is that self.page.load(url) never emits the loadFinished signal, and I don't know why.

The second problem is more general: how can I run async code inside the QRunnable run() method (if that is possible at all)? For example, I want to generate a PDF report with QWebEnginePage (or QWebEngineView.page()) using QWebEnginePage.printToPdf(). In that case I have to use the loadFinished signal for QWebEnginePage.load() and the pdfPrintingFinished signal for QWebEnginePage.printToPdf(). The slots connected to those signals no longer run in the QRunnable thread; they run in the main thread and slow down the GUI (both slots can be slow, and I also want to open the generated PDF in Adobe Reader from a thread).

How can I make all of that code run in a thread (QRunnable or some other) without going back to the main thread?

A similar question is here, but it seems to be pending without further discussion.

nenad
  • AFAIK even if QWebEnginePage is not a "widget", it behaves as such and so it *must* be created and accessed from the main thread. Then, according to the docs of [`printToPdf`](https://doc.qt.io/qt-5/qwebenginepage.html#printToPdf), "This method issues an asynchronous request for printing the web page into a PDF and returns immediately.", so there's no need for QRunnable at all. That said, the QUrl is invalid: just like you cannot write a file name in the address field of a browser, you need the *absolute* path for QWebEngine. `QUrl.fromLocalFile(QDir.current().absoluteFilePath('report.html'))` – musicamante Jul 17 '21 at 00:45
  • I have a question: why is it necessary to use threads? – eyllanesc Jul 17 '21 at 01:10
  • @eyllanesc OP claims "self.page.printToPdf() can be slow, so I want this to also run in runnable thread or some other thread, but not in main thread". Since `printToPdf` is asynchronous I don't see where is the problem too. – musicamante Jul 17 '21 at 01:26
  • @musicamante That is my doubt, I want you to point out some proof of it since of the many times I have heard that it is because the OP has not implemented it correctly. I think the OP has an XY problem where the possible solution generates more problems than benefits. – eyllanesc Jul 17 '21 at 01:30
  • @eyllanesc well, I've just made a basic test, `printToPdf` is indeed asynchronous. It just takes some time for initialization. The requirement is that only one print can be done at once on the same printer (and from the same object requesting it), so maybe the problem is that the OP is trying to print more than once simultaneously from the same web page object. – musicamante Jul 17 '21 at 02:02
  • @eyllanesc thanks, it is indeed XY problem, I don't need threads for async parts – nenad Jul 17 '21 at 10:09
  • @musicamante _so maybe the problem is that the OP is trying to print more than once simultaneously from the same web page object._ no, just one web page object, print once – nenad Jul 17 '21 at 10:41

1 Answer


There are various problems in your question and its code, based on wrong assumptions.

  1. While QWebEnginePage is not technically a widget, it behaves like one, so, just like any UI element, it can be neither created nor accessed from other threads (including through QRunnable);
  2. The given QUrl is invalid for this purpose: a web "browser" always uses absolute paths, including when loading local files. While file:report.html is an acceptable QUrl for local Qt file access, it is not a valid address from the point of view of the browser: if you try to write that path in your web browser, it won't work, even if the file is in the path of the browser executable.
    You can use QDir or QFileInfo to build the absolute path and pass it to QUrl.fromLocalFile():
    • QUrl.fromLocalFile(QDir.current().absoluteFilePath('report.html'))
    • QUrl.fromLocalFile(QFileInfo('report.html').absoluteFilePath())
  3. The loadFinished signal doesn't work for setHtml because that function doesn't "load" anything: it just sets the page content. Loading, in terms of web browser, means both setting the content and finishing its [down]loading;
  4. Print jobs on the same printer must be queued, as it's not possible to access a printer concurrently; since the PDF printer is an abstract printer, it belongs to the web page object, so each page can run only one print job at a time.
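
To illustrate point 2, here is a minimal sketch that uses only the Python standard library (an assumption, chosen so it runs without Qt installed). It compares the path component of the `file:report.html` URL from the question with the one obtained after resolving the file name against the current directory. The variable names are made up for the example.

```python
from pathlib import Path
from urllib.parse import urlsplit

# The URL used in the question: its path component is relative.
relative_url = 'file:report.html'
relative_path = urlsplit(relative_url).path

# Resolving the file name against the current working directory first
# gives an absolute path, which converts to a full file:/// URL.
absolute_url = Path('report.html').resolve().as_uri()
absolute_path = urlsplit(absolute_url).path

print('relative:', relative_url, '-> path', relative_path)
print('absolute:', absolute_url, '-> path', absolute_path)
```

The resulting absolute URL string can be handed to `QUrl(...)`, or the same thing can be done on the Qt side with `QUrl.fromLocalFile()` and an absolute path.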

So, the solution is to not use QRunnable, but properly manage printing.
Then, you either create a QWebEnginePage for each document you need to print, or you use a unique QWebEnginePage and queue each loading and printing.

In either case, the provided QUrl must have a proper absolute path.
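
As a rough sketch of the "one page per document" option, the load and print steps can be chained with signals, all in the main thread. The helper name `print_url_to_pdf` and the `on_done` callback are hypothetical; `page` is assumed to be a QWebEnginePage created in the main thread and `url` an absolute QUrl. Only `load()`, `loadFinished`, `printToPdf()` and `pdfPrintingFinished` are used from the page.

```python
def print_url_to_pdf(page, url, pdf_path, on_done):
    """Load `url` into `page`, then print it to `pdf_path`.

    Hypothetical helper: `on_done(path, ok)` is called once, either when
    loading fails or when the PDF has been written (or failed to write).
    The caller must keep a reference to `page` until then.
    """
    def handle_load_finished(ok):
        if not ok:
            on_done(pdf_path, False)
            return
        # Returns immediately; the result arrives via pdfPrintingFinished.
        page.printToPdf(pdf_path)

    def handle_pdf_finished(path, ok):
        on_done(path, ok)

    # Connect before loading so that no signal can be missed.
    page.loadFinished.connect(handle_load_finished)
    page.pdfPrintingFinished.connect(handle_pdf_finished)
    page.load(url)

# Possible use inside a running QApplication (not executed here):
#   page = QWebEnginePage()  # keep it alive, e.g. as a window attribute
#   url = QUrl.fromLocalFile(QDir.current().absoluteFilePath('report.html'))
#   print_url_to_pdf(page, url, 'report.pdf',
#                    lambda path, ok: print('pdf done:', path, ok))
```

Since both slots only start asynchronous work or report a result, they return quickly and should not noticeably block the GUI. The helper connects its slots on every call, so it is meant for a fresh page per document, not for reusing one page.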

Obviously, there are differences.
The second option requires very limited resources, but complete queueing obviously means that the process is sequential, so it might take a lot of time.
The first option requires much more system resources (imagine it as having an opened tab for each document, and each web rendering is very CPU/RAM demanding, even if it's not displayed), but has the benefit of faster printing: since loading and printing are asynchronous you can actually process dozens of pages in a very short time interval.

A possible and safer solution is to create a system that limits the number of concurrent pages and printing processes, and starts the queued ones as each print job completes.
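
One way such a limiter could look is sketched below in plain Python. The class name `LimitedPrintQueue` and the `start_job(job, done)` callback are assumptions made for the example, not part of Qt: `start_job` is expected to start one asynchronous print job and to call `done()` when that job has finished.

```python
from collections import deque


class LimitedPrintQueue:
    """Run at most `max_active` jobs at once and queue the rest."""

    def __init__(self, max_active, start_job):
        # start_job(job, done) must start the job asynchronously and
        # arrange for done() to be called once it has completed.
        self.max_active = max_active
        self.start_job = start_job
        self.pending = deque()
        self.active = 0

    def submit(self, job):
        """Queue a job and start it right away if a slot is free."""
        self.pending.append(job)
        self._start_next()

    def _start_next(self):
        # Fill every free slot with the oldest pending jobs.
        while self.active < self.max_active and self.pending:
            job = self.pending.popleft()
            self.active += 1
            self.start_job(job, self._job_done)

    def _job_done(self):
        # A slot was freed: start the next pending job, if any.
        self.active -= 1
        self._start_next()
```

In a Qt program, `start_job` could create a QWebEnginePage for the job, load the document, and call `done()` from the slot connected to `pdfPrintingFinished`. Everything would still run in the main thread.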

Finally, both QTextDocument and QPrinter are reentrant (each instance can be used from a worker thread as long as it is not shared between threads), so there's no problem in using them in a QRunnable, and they won't block the GUI if you properly create a QRunnable that is executed for each print job (which is one of the main purposes of QRunnable: being able to run it more than once). The only difference is that QTextDocument has limited HTML support.

Zoe
musicamante
  • 1. True, that's why I get various warnings in runtime regarding threads 2. You are right, just to add that simple self.page.load(QUrl('report.html')) also works (the 'file:' part was problem) 3. Don't agree, see the docs for setHtml(), it says: "The html is loaded immediately; external objects are loaded asynchronously." and it uses timeout 10 secs, loadFinished signal works on setHtml() too, I tried it, but it is undocumented, I don't know if it's wise to use, but how else should we use setHtml() I don't know, because if loadFinished doesn't work for setHtml() it is useless... – nenad Jul 17 '21 at 10:07
  • ...because it is async, how will we know it finished loading all external objects, it is weird nothing about it is mentioned in docs. The rest, I agree, I don't need threads (or QRunnable) for async parts. Printing with QTextDocument (not to mention with with preview) is so slow for longer reports, and because of that I decided to use QWebEnginePage.printToPdf() as I thought it would be faster. I will combine QRunnable for sync parts, and main thread for async parts and connect it with signals and slots. Thanks for your comprehensive answer. – nenad Jul 17 '21 at 10:07
  • In [this answer](https://stackoverflow.com/questions/59274653/how-to-print-from-qwebengineview) @eyllanesc uses loadFinished signal with setHtml(). It is for QWebEngineView, not for QWebEnginePage, but in documentation for setHtml() method of both classes, loadFinished signal is not mentioned as a standard way to use setHtml() method. Is it an omission in docs? – nenad Jul 17 '21 at 11:26
  • @nenad 2. It might, but I wouldn't rely too much on relative paths; 3. Since I doubt we can completely rely on that signal, then I'd say that there's no doubt in using `load()` anyway. I don't really know if it's undocumented for omission or anything else, and I don't know why it's not always fired. Besides that, consider that QTextDocument and QWebEngine can render the same html source very differently. – musicamante Jul 17 '21 at 15:45