I am trying to split a pdf into its pages and save each page as a new pdf. I have tried this method from a previous question with no success and the pypdf2 split example from here with no success. EDIT: I can see in my files that it does successfully write the first page, the second page pdf is then created but is empty.
Here is the code I am trying to run:
from PyPDF2 import PdfFileWriter, PdfFileReader
inputpdf = PdfFileReader(open("my_pdf.pdf", "rb"))
for i in range(inputpdf.numPages):
output = PdfFileWriter()
output.addPage(inputpdf.getPage(i))
with open("document-page%s.pdf" % i, "wb") as outputStream:
output.write(outputStream)
Here is the full error message:
Traceback (most recent call last):
File "pdf_functions.py", line 9, in <module>
output.write(outputStream)
File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 482, in write
self._sweepIndirectReferences(externalReferenceMap, self._root)
File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 572, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 548, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 572, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 548, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 557, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, data[i])
File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 572, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 548, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "/usr/local/lib/python3.4/dist-packages/PyPDF2/pdf.py", line 575, in _sweepIndirectReferences
if data.pdf.stream.closed:
AttributeError: 'PdfFileWriter' object has no attribute 'stream'
I also tried this and confirmed that I can indeed extract a single page.
from PyPDF2 import PdfFileWriter, PdfFileReader
inputpdf = PdfFileReader(open("/home/ubuntu/inputs/cityshape/form5.pdf", "rb"))
#for i in range(inputpdf.numPages):
output = PdfFileWriter()
output.addPage(inputpdf.getPage(2))
with open("document-page2.pdf", "wb") as outputStream:
output.write(outputStream)