I am using the code below to combine separate PDF files into a single PDF.

It works fine, but it leaves all the PDFs open. How can I close the PDF files involved when the script ends (i.e. all 4 files: aaa, bbb, ccc and abc)?

Something like f.close(), but I have no idea where to insert it here.

from pyPdf import PdfFileWriter, PdfFileReader

def append_pdf(input,output):
    [output.addPage(input.getPage(page_num)) for page_num in range(input.numPages)]

output = PdfFileWriter()

append_pdf(PdfFileReader(file("C:\\aaa.pdf","rb")),output)
append_pdf(PdfFileReader(file("c:\\bbb.pdf","rb")),output)
append_pdf(PdfFileReader(file("c:\\ccc.pdf","rb")),output)

output.write(file("c:\\abc.pdf ","wb"))

The problem is that when I tried to delete the files, Windows pops up:

the action can't be completed because the files is open in pythonw

(I am using Python 2.7.6, so I changed the line in Robᵩ's 1st attempt to inputFile.close().)

halfer
Mark K
  • I suggest reading the [Python tutorial on reading and writing files](https://docs.python.org/3.4/tutorial/inputoutput.html#reading-and-writing-files), especially the last code block in the section. – Blender Jul 21 '14 at 09:38
  • @Mark, what makes you believe that it leaves the files open? – Robᵩ Jul 21 '14 at 15:31
  • @Robᵩ, because when I tried to delete the files, the system popped up "the action can't be completed because the files is open in pythonw" – Mark K Jul 22 '14 at 02:11

2 Answers


All of the files are automatically closed by the time the script finishes execution. If you'd like to close them before the script ends, call each file object's close() method. Here is one way:

# UNTESTED
for fname in 'c:/aaa.pdf', 'c:/bbb.pdf', 'c:/ccc.pdf':
    inputFile = open(fname, 'rb')
    append_pdf(PdfFileReader(inputFile), output)
    inputFile.close()  # close() is a method of the file object, not a built-in function

As you can see, each input file is closed immediately after being used. This does cause one problem, however: if PdfFileReader() or append_pdf() were to throw an exception, then close() would never be called. To solve that problem, we use a context manager:

# UNTESTED
for fname in 'c:/aaa.pdf', 'c:/bbb.pdf', 'c:/ccc.pdf':
    with open(fname, 'rb') as inputFile:
        append_pdf(PdfFileReader(inputFile), output)

Each file will be closed when the with block exits.

Similarly for the output file:

# UNTESTED
with open('c:/abc.pdf', 'wb') as outputFile:
    output.write(outputFile)
Robᵩ
  • Thanks Robᵩ, but the problem is still there. I tried the 1st attempt; it works fine, but (in Windows) I am still not able to delete the files in the folder. The 2nd and 3rd attempts give an error, "I/O operation on closed file", and the problem lies in "output.write". Could you please help? Thanks. – Mark K Jul 22 '14 at 02:34
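
The "I/O operation on closed file" error reported in the comment above is consistent with pyPdf reading page data from the input streams only when output.write() runs, so the inputs probably need to stay open until the write has finished. The following is a minimal sketch of that ordering, not a tested pyPdf recipe. The names merge_keeping_inputs_open, write_merged and concat_bytes are hypothetical, and concat_bytes is only a stand-in for the PDF step so the example runs without pyPdf installed.

```python
def merge_keeping_inputs_open(input_paths, output_path, write_merged):
    """Open every input, let write_merged produce the output, then close everything.

    write_merged is a hypothetical callback taking (open_inputs, open_output).
    """
    handles = []
    try:
        for path in input_paths:
            handles.append(open(path, 'rb'))
        with open(output_path, 'wb') as out:
            write_merged(handles, out)
    finally:
        # Runs even if opening, merging, or writing raised an exception
        for handle in handles:
            handle.close()
    return handles


def concat_bytes(inputs, out):
    """Stand-in for the PDF step: copy each input stream to the output."""
    for stream in inputs:
        out.write(stream.read())
```

With pyPdf, the callback would presumably create the PdfFileWriter, call append_pdf(PdfFileReader(stream), writer) for each open stream, and finish with writer.write(out). The inputs are closed only after that call returns.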

I learned this from another question: pypdf Merging multiple pdf files into one pdf.

I found that PyPDF2 can achieve the same goal, and the problem of the files not being deletable is resolved.

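from PyPDF2 import PdfFileMerger

merger = PdfFileMerger()

# Full paths of the PDFs to merge, in the desired order
filenames = ['c:\\11.pdf', 'c:\\22.pdf', 'c:\\33.pdf']

for filename in filenames:
    # Pass the path itself; the merger opens the file on its own
    merger.append(filename)

merger.write('c:\\123.pdf')
merger.close()  # release the input files the merger opened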
Mark K
  • I get an error on `merger.append(PdfFileReader(file(filename, 'rb')))`. What does rb mean? How can I fix this? I have a similar setup to your code. – excelguy Jul 18 '18 at 13:43
  • It's the line that appends each file's pages to the merger. The lines above it are quite straightforward. Maybe you can try Rob's answer below as well? – Mark K Jul 19 '18 at 22:01
  • 'file' is not defined? – Andrew Clark May 16 '21 at 01:11
  • @Andrew Clark, I've updated the answer. By the way, looping over the files in a folder might help to avoid the PdfReadWarnings. – Mark K May 17 '21 at 00:53
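
As a rough illustration of the "loop over the files in a folder" idea from the last comment, the list of input paths could be built with the standard glob module instead of being typed by hand. This is only a sketch: list_pdfs is a hypothetical helper name, sorting by file name is an assumption about the desired merge order, and nothing here shows whether it affects the PdfReadWarnings.

```python
import glob
import os


def list_pdfs(folder):
    """Return the paths of the .pdf files directly inside folder, sorted by name."""
    pattern = os.path.join(folder, '*.pdf')
    return sorted(glob.glob(pattern))
```

The returned list could then be passed to the merge loop in place of a hard-coded filenames list.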