I have several PDFs that I want to merge into one. I used the PyPDF2 documentation (https://pythonhosted.org/PyPDF2/PdfFileMerger.html#PyPDF2.PdfFileMerger.write) and the "Merge PDF files" post as references. My code reads from the directory where my PDFs lie and tries to write the merged PDF into another directory:

    import PyPDF2
    from glob import glob

    def concatenate_pdfs(path_to_pdf_dir, output_dir):
        """merge all the pdfs that lie in output_dir into one pdf and store it in path_to_pdf_dir"""
        merger = PyPDF2.PdfFileMerger()
        # note: this pattern only matches if output_dir ends with a path separator
        for filepath in glob(f'{output_dir}*.pdf'):
            merger.append(filepath)
        merger.write(f'{path_to_pdf_dir}/output.pdf')
        merger.close()

The new PDF is created, but it contains only the first PDF that gets parsed.

The same problem occurs when I do it like this:

    def concatenate_pdfs(path_to_pdf_dir, output_dir):
        """merge all the pdfs that lie in output_dir into one pdf and store it in path_to_pdf_dir"""
        merger = PyPDF2.PdfFileMerger()
        f1 = 'path_to_first_pdf'
        f2 = 'path_to_second_pdf'
        f3 = 'path_to_nth_pdf'
        merger.append(f1)
        merger.append(f2)
        merger.append(f3)
        merger.write(f'{path_to_pdf_dir}/output.pdf')
        merger.close()

In this case only f1 gets written to my output.pdf.

David Haase
  • Does the bug persist if you do `merger.append(open(fx, 'rb'))`? – Tobi208 Jan 26 '22 at 09:40
  • @Tobi208 yes it does. Same result – David Haase Jan 26 '22 at 09:45
  • Wait. No, this solved it. Now they get shown as expected. I had a typo while trying your fix. Thank you – David Haase Jan 26 '22 at 09:49
  • Awesome! The file reader/writer in PyPDF can be a bit iffy. – Tobi208 Jan 26 '22 at 09:53
  • Yes, seems like it! Something somewhat unrelated: do you, by any chance, know how to add a title before each concatenated PDF? I was able to add the bookmarks but not the title – David Haase Jan 26 '22 at 09:57
  • Do you mean title in the metadata or a title page before each merge or a title at the top of the page? – Tobi208 Jan 26 '22 at 10:01
  • A title at the top of the page before each merged PDF. For example: "Merged pdf 1 title", then the content of merged pdf 1; "Merged pdf 2 title", then the content of merged pdf 2 – David Haase Jan 26 '22 at 10:05
  • Obviously with a new line after the title and - if possible - with a bigger font size. – David Haase Jan 26 '22 at 10:06
  • You might want to look into [this](https://stackoverflow.com/questions/1180115/add-text-to-existing-pdf-using-python). Don't think you can do it with a merger. – Tobi208 Jan 26 '22 at 10:18

1 Answer

For posterity:

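Add files to the merger with `merger.append(open(filepath, 'rb'))`, that is, pass an open binary file object instead of a path string. PyPDF2 has some odd internal issues with its file readers/writers, and handing it the file object worked around them here.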
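A minimal sketch of the whole function with that change applied, in case it helps. It assumes a PyPDF2 version that still ships `PdfFileMerger`, the class used in the question. The names `list_pdfs`, `source_dir` and `target_dir` are my own and not from the question. Sorting the paths and using `ExitStack` to close the input files are additions of mine, not something the fix requires.

```python
from contextlib import ExitStack
from pathlib import Path


def list_pdfs(source_dir):
    """Return the PDF paths in source_dir, sorted for a stable merge order."""
    # Path.glob avoids the missing-separator pitfall of f'{dir}*.pdf'.
    return sorted(Path(source_dir).glob("*.pdf"))


def concatenate_pdfs(source_dir, target_dir):
    """Merge every PDF in source_dir into target_dir/output.pdf."""
    # Imported here so list_pdfs can be used without PyPDF2 installed.
    # Assumption: a PyPDF2 release that still provides PdfFileMerger.
    from PyPDF2 import PdfFileMerger

    merger = PdfFileMerger()
    with ExitStack() as stack:
        for pdf_path in list_pdfs(source_dir):
            # Pass an open binary file object rather than a path string.
            handle = stack.enter_context(open(pdf_path, "rb"))
            merger.append(handle)
        # Write the result while the input files are still open.
        with open(Path(target_dir) / "output.pdf", "wb") as out:
            merger.write(out)
    merger.close()
```

Called as `concatenate_pdfs('path/to/pdfs', 'path/to/output')`, this should leave one `output.pdf` containing all inputs in file-name order.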

Tobi208