I tried following this answer: How to use PyQT5 to convert multiple HTML docs to PDF in one loop

I modified it to convert all HTML files found in a local folder. For example, `htmls` is the list of HTML files to convert: [Q:\Ray\test1.html, Q:\Ray\prac2.html]

This is the code. When I run it, Python just freezes and I have to stop the run manually.

import os
import glob
from PyQt5 import QtWidgets, QtWebEngineWidgets

class PdfPage(QtWebEngineWidgets.QWebEnginePage):
    def __init__(self):
        super().__init__()
        self._htmls = []
        self._current_path = ""

        self.setZoomFactor(1)
        self.loadFinished.connect(self._handleLoadFinished)
        self.pdfPrintingFinished.connect(self._handlePrintingFinished)

    def convert(self, htmls):
        self._htmls = iter(zip(htmls))
        self._fetchNext()

    def _fetchNext(self):
        try:
            self._current_path = next(self._htmls)
        except StopIteration:
            return False

    def _handleLoadFinished(self, ok):
        if ok:
            self.printToPdf(self._current_path)

    def _handlePrintingFinished(self, filePath, success):
        print("finished:", filePath, success)
        if not self._fetchNext():
            QtWidgets.QApplication.quit()


if __name__ == "__main__":

    current_dir = os.path.dirname(os.path.realpath(__file__))
    folder= current_dir+ '\\*.HTML'
    htmls= glob.glob(folder)

    app = QtWidgets.QApplication([])
    page = PdfPage()
    page.convert(htmls)
    app.exec_()

    print("finished")
Ray234
  • Where is the part where you should `load()` the urls? – musicamante Aug 12 '20 at 20:28
  • @musicamante what do you mean by load urls? I am trying to convert local files stored in the list htmls with their paths – Ray234 Aug 12 '20 at 20:37
  • Even if you're using local files, those file paths have to be *loaded* (using [QUrl](https://doc.qt.io/qt-5/qurl.html)). In your code you're only assigning the value of `self._current_path`, then you do *nothing else*. How should the webpage be loaded, then? Please, once you get some code in an answer, try your best to understand *what it does* and study the documentation related to it! Also, using `zip()` like that makes no sense at all. – musicamante Aug 12 '20 at 21:01
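
To illustrate the `zip()` remark: calling `zip()` on a single list wraps every item in a 1-tuple, so `next()` would hand back a tuple rather than a path string. A minimal sketch in plain Python, using the two example paths from the question (no Qt involved):

```python
# Example paths taken from the question; raw strings keep the backslashes literal.
htmls = [r"Q:\Ray\test1.html", r"Q:\Ray\prac2.html"]

# zip() over one iterable yields 1-tuples, not the items themselves.
wrapped = list(zip(htmls))

# iter() alone yields the path strings unchanged.
plain = list(iter(htmls))

print(type(wrapped[0]).__name__)  # tuple
print(type(plain[0]).__name__)    # str
```

Anything that expects a path string, such as `QUrl.fromLocalFile`, would therefore receive a tuple if `zip()` is left in place.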

1 Answer

It seems the OP has not understood the logic of my previous solution, which is:

  1. Get the resource, in this case files,
  2. Load it on the page,
  3. When the load is finished then print the content of the page,
  4. When the printing is finished then execute step 1 with the next resource.

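The code in the question never performs step 2: `_fetchNext` stores the path but never calls `load()`. Separately, it is recommended that the PDF path differ from the HTML path, so that the output does not overwrite the source file.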

import os
import glob
from PyQt5.QtCore import QUrl
from PyQt5 import QtWidgets, QtWebEngineWidgets


class PdfPage(QtWebEngineWidgets.QWebEnginePage):
    def __init__(self):
        super().__init__()
        self._htmls = []
        self._current_path = ""

        self.setZoomFactor(1)
        self.loadFinished.connect(self._handleLoadFinished)
        self.pdfPrintingFinished.connect(self._handlePrintingFinished)

    def convert(self, htmls):
        self._htmls = iter(htmls)
        self._fetchNext()

    def _fetchNext(self):
        try:
            self._current_path = next(self._htmls)
        except StopIteration:
            return False
        else:
            self.load(QUrl.fromLocalFile(self._current_path))
        return True

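    def _handleLoadFinished(self, ok):
        if ok:
            self.printToPdf(self._current_path + ".pdf")
        elif not self._fetchNext():
            # A failed load would otherwise stall the loop: skip to the
            # next file, or quit if none are left.
            QtWidgets.QApplication.quit()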

    def _handlePrintingFinished(self, filePath, success):
        print("finished:", filePath, success)
        if not self._fetchNext():
            QtWidgets.QApplication.quit()


if __name__ == "__main__":

    current_dir = os.path.dirname(os.path.realpath(__file__))
    # os.path.join keeps the pattern portable across Windows and POSIX paths
    folder = os.path.join(current_dir, "*.HTML")
    htmls = glob.glob(folder)
    print(htmls)
    if htmls:
        app = QtWidgets.QApplication([])
        page = PdfPage()
        page.convert(htmls)
        app.exec_()
    print("finished")
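
On output naming: the code above appends `.pdf` to the full HTML path, so the results are named like `index2.HTML.pdf`. If a name like `index2.pdf` is preferred, one possible sketch is below. `pdf_path_for` is a hypothetical helper, not part of the code above or of PyQt5; it only manipulates the path string.

```python
import os


def pdf_path_for(html_path):
    """Hypothetical helper: swap the file extension for .pdf."""
    # splitext splits at the last dot of the final path component.
    root, _ext = os.path.splitext(html_path)
    return root + ".pdf"


print(pdf_path_for("/home/Qt/index2.HTML"))  # /home/Qt/index2.pdf
```

Under that assumption, `self.printToPdf(pdf_path_for(self._current_path))` would take the place of `self.printToPdf(self._current_path + ".pdf")`.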
eyllanesc
  • I tried implementing your code, but the program still only converts the first html file in the list instead of all of them. – Ray234 Aug 12 '20 at 20:54
  • @Ray234 Oops, try again. – eyllanesc Aug 12 '20 at 20:58
  • This works for a list of two html files but not more? – Ray234 Aug 12 '20 at 21:02
  • @Ray234 are you sure? I have tested it for 3 files and it works, could you point out the ones you get in the console? – eyllanesc Aug 12 '20 at 21:04
  • So the list of htmls generated is: [Q:\Ray\test1, Q:\Ray\test2, Q:\Ray\index1, Q:\Ray\index2] but only the first two from the list get converted – Ray234 Aug 12 '20 at 21:06
  • @Ray234 I do not understand you, I have placed "print" in my code to be able to analyze the operation, and for this reason I require that feedback, can you provide what I have asked? By example I get `['/home/Qt/index2.HTML', '/home/Qt/index3.HTML', '/home/Qt/index.HTML'] finished: /home/Qt/index2.HTML.pdf True finished: /home/Qt/index3.HTML.pdf True finished: /home/Qt/index.HTML.pdf True finished` – eyllanesc Aug 12 '20 at 21:08
  • The code works fine once I transferred my files to a different location! Sorry about the confusion – Ray234 Aug 12 '20 at 21:13