import PyPDF2 
import glob
import os
from fpdf import FPDF
import shutil

class MyPDF(FPDF): # adding a footer, containing the page number
    def footer (self):
        self.set_y(-15)
        self.set_font("Arial", Style="I", size=8)
        pageNum = "page %s/{nb}" % self.page_no()
        self.cell(0,10, pageNum, align="C")


if __name__ == "__main__":
    os.chdir("pathtolocation/docs/") # docs location
    os.system("libreoffice --headless --invisible --convert-to pdf *") # this converts everything to pdf
    for file in glob.glob("*"):
        if file not in glob.glob("*.pdf"):
            shutil.move(file,"/newlocation") # moving files we don't need to another folder

    # adding the cover and footer
    path = open(file, 'wb')
    path2 = open ('/pathtocover/cover.pdf')
    merger = PyPDF2.PdfFileMerger()
    pdf = MyPDF()

    for file in glob.glob("*.pdf"):
        pdf.footer()
        merger.merge(position=0, fileobj=path2)
        merger.merge(position=0, fileobj=path)
        merger.write(open(file, 'wb'))

This script converts documents to PDF, then adds a cover page and a footer containing the page number to each PDF. I fixed a few things and am now running it to check that it works, but it is taking far too long and raises no error. Did I do something wrong, or does merging and adding footers really take that long? I'm only working with 3 files, and the conversion itself was fast.

Console output

convert /home/projects/convert-pdf/docs/sample (1).doc ->
/home/projects/convert-pdf/docs/sample (1).pdf using writer_pdf_Export

So it is converting and moving the files. I think the problem is somewhere here:

    for file in glob.glob("*.pdf"):
        pdf.footer()
        merger.merge(position=0, fileobj=path2)
        merger.merge(position=0, fileobj=path)
        merger.write(open(file, 'wb'))

I suspect it is because I merge at position=0 and then at position=0 again, but I'm not sure.

Martin Thoma
Lynob
  • Try throwing some intermediate `print` statements in there to see exactly where it's hanging – wnnmaw Jan 23 '15 at 17:09
  • @wnnmaw my initial thought was at the very end but it's not even converting the files, i mean previously there were errors but it passed the stage of converting and moving but not now – Lynob Jan 23 '15 at 17:11
  • @picus i'll do that now – Lynob Jan 23 '15 at 17:11
  • I just moved to the answer section to show some samples, don't copy and paste, formatting might get messed up. – picus Jan 23 '15 at 17:20
  • @wnnmaw please read my edit – Lynob Jan 23 '15 at 17:39
  • Save the PDFs to disk and use pdftk using the following command. `pdftk file1.pdf file2.pdf file3.pdf cat output newfile.pdf`. Make sure pdftk is installed on your system. Edit: You could also use `pdftk *.pdf cat output newfile.pdf` – norway-firemen Jan 30 '15 at 03:54
  • @12hys can I add a footer using that tool too? – Lynob Jan 30 '15 at 18:27
  • @Fischer, I don't believe it does. Are you able to successfully add the footers to the separate PDFs? If so, save the PDFs and use pdftk on the saved files. I would first check to see what is actually taking a really long time. Is it adding the footers or merging? Comment out the merger code to see if that's what is causing the slowdown. – norway-firemen Jan 31 '15 at 16:18
  • @12hys the error is definitely from the adding footer section – Lynob Jan 31 '15 at 21:37
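
The suggestions above (intermediate prints, checking whether the footers or the merge is the slow part) can be sketched as a small timing helper. This is only an illustration using the standard library and Python 3; `timed` and `run_stages` are hypothetical names, not part of PyPDF2 or fpdf, and the stage functions you pass in would be your own.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results=None):
    """Print (and optionally record) how long the wrapped block takes."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if results is not None:
            results[label] = elapsed
        # The message is printed even if the block raises, so the last
        # label you see tells you which stage was running.
        print("{}: {:.3f}s".format(label, elapsed))


def run_stages(stages):
    """Run (label, callable) pairs in order and return the time each took."""
    timings = {}
    for label, func in stages:
        with timed(label, timings):
            func()
    return timings
```

For the script in the question, the stages would be something like `run_stages([("convert", convert), ("move", move), ("merge", merge)])`, where each of those functions (hypothetical here) wraps one part of the `__main__` block. A stage that never prints its timing is the one that hangs.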

1 Answer


This would really be better as a comment, but I want to show code. You need to add some try blocks in there to catch any errors. Here is something super basic you could do.

import PyPDF2 
import glob
import os
from fpdf import FPDF
import shutil

class MyPDF(FPDF): # adding a footer, containing the page number
    def footer (self):
      try:
        self.set_y(-15)
        self.set_font("Arial", Style="I", size=8)
        pageNum = "page %s/{nb}" % self.page_no()
        self.cell(0,10, pageNum, align="C")
      except Exception as err:
        print("Error applying footer: {}".format(err))


if __name__ == "__main__":

  try:
    os.chdir("pathtolocation/docs/") # docs location
    os.system("libreoffice --headless --invisible --convert-to pdf *") # this converts everything to pdf
    for file in glob.glob("*"):
        if file not in glob.glob("*.pdf"):
            shutil.move(file,"/newlocation") # moving files we don't need to another folder

    # adding the cover and footer
    path = open(file, 'wb')
    path2 = open ('/pathtocover/cover.pdf')
    merger = PyPDF2.PdfFileMerger()
    pdf = MyPDF()
  except Exception as err:
    print("error setting up the pdf: {}".format(err))

  # the loop sits outside the except block so it also runs when setup succeeds
  for file in glob.glob("*.pdf"):
    try:
      pdf.footer()
      merger.merge(position=0, fileobj=path2)
      merger.merge(position=0, fileobj=path)
      merger.write(open(file, 'wb'))
    except Exception as err:
      print("Error processing glob: {}".format(err))
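
One thing the try blocks will not surface, because it raises nothing: `path = open(file, 'wb')` opens the file for writing, and `'w'` mode empties an existing file (or creates an empty one) the moment it is opened. The merger is then handed a file with no content. Whether that is what causes the long run here is a guess, but the truncation itself is easy to confirm with the standard library alone. In this sketch the file content is placeholder bytes, not a real PDF, and `size_after_open_wb` and `demo` are hypothetical helper names.

```python
import os
import tempfile


def size_after_open_wb(path):
    """Open an existing file in 'wb' mode, write nothing, return its size."""
    handle = open(path, "wb")  # 'w' truncates the file as soon as it is opened
    handle.close()
    return os.path.getsize(path)


def demo():
    """Return (size before, size after) for a throwaway stand-in file."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"%PDF-placeholder bytes")  # placeholder, not a valid PDF
        before = os.path.getsize(path)
        after = size_after_open_wb(path)
        return before, after
    finally:
        os.remove(path)
```

If that is the issue, the usual way around it is to open inputs with `'rb'` and write the merged result under a different name than the input, renaming afterwards if needed.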
picus
  • thank you for writing the exceptions, you didn't have to. I still don't get errors, but at least I know that it got past the conversion point and moved the files – Lynob Jan 23 '15 at 17:37
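
Since nothing is raised, it may help to reason about what the loop produces. The sketch below is a plain-list model, not PyPDF2 itself: it assumes that `merge(position=0, ...)` inserts the new pages in front of whatever the merger already holds, and that the single merger created before the loop keeps its pages between iterations. It is also simplified in that each file contributes its own pages, whereas the script merges the same `path` every time. `merge_at` and `simulate` are hypothetical names.

```python
def merge_at(pages, position, new_pages):
    """Insert new_pages into pages at the given index, in place."""
    pages[position:position] = new_pages
    return pages


def simulate(documents, cover):
    """Mimic the question's loop: one shared page list, both inserts at 0."""
    merged = []  # stands in for the one merger created outside the loop
    outputs = {}
    for name, doc_pages in documents:
        merge_at(merged, 0, cover)      # the cover goes to the front...
        merge_at(merged, 0, doc_pages)  # ...then the document lands before it
        outputs[name] = list(merged)    # snapshot of what would be written
    return outputs
```

Under those assumptions the cover ends up after the document rather than before it, and every later output also contains the pages of all earlier files, so the amount written grows with each iteration. Creating a fresh merger per file and appending the cover first, then the document, would avoid both effects.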