PyPDF2.PdfFileReader hangs indefinitely

Question

I'm trying to read this PDF file (https://www.accessdata.fda.gov/cdrh_docs/pdf14/K141693.pdf), following the suggestions in this SO question:

Opening pdf urls with pyPdf

I have downloaded the file locally and am running the following code:

import PyPDF2
pdf_file = open("K141693.pdf")
pdf_read = PyPDF2.PdfFileReader(pdf_file)

but my code hangs indefinitely. I'm running Python 2.7, and here is the stack trace I get when I interrupt it:

Traceback (most recent call last):
  File "", line 1, in <module>
    runfile('C:/PoC/pdf_reader.py', wdir='C:/PoC')
  File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
    execfile(filename, namespace)
  File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)
  File "C:/PoC/pdf_reader.py", line 13, in <module>
    pdf_read = PyPDF2.PdfFileReader(pdf_file)
  File "C:\ProgramData\Anaconda2\lib\site-packages\PyPDF2\pdf.py", line 1084, in __init__
    self.read(stream)
  File "C:\ProgramData\Anaconda2\lib\site-packages\PyPDF2\pdf.py", line 1697, in read
    line = self.readNextEndLine(stream)
  File "C:\ProgramData\Anaconda2\lib\site-packages\PyPDF2\pdf.py", line 1938, in readNextEndLine
    x = stream.read(1)
KeyboardInterrupt

I also came across another post, "PyPDF2 hangs on processing", but it has no answer either.

Same here. It's stuck in an infinite loop somehow. Did you ever resolve this? – Celi Manu, Sep 24 '18 at 21:27
If you haven't yet opened an issue on PyPDF2 that includes your code, the post, and the PyPDF2 version, I suggest you do so. – Martin Thoma, Apr 10 '22 at 19:43

score 0 · Answer 1 · answered Jul 13 '18 at 14:21

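You need to open the file in binary ('rb') mode. A PDF is a binary format, and on Windows a file opened in text mode has its line endings translated, so the bytes PyPDF2 reads no longer match the file on disk. This works in Python 3: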

import PyPDF2
pdf_file = open("K141693.pdf", "rb")
read_pdf = PyPDF2.PdfFileReader(pdf_file)
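
As a possible first diagnostic step, here is a minimal sketch that reads the file in binary mode and checks that it at least looks like a PDF before any parser touches it. It uses only the standard library. The helper names `load_pdf_stream` and `looks_like_pdf` and the 1024-byte tail window are assumptions made for illustration, not part of PyPDF2.

```python
import io


def load_pdf_stream(path):
    """Read the file in binary mode and return a seekable in-memory stream."""
    with open(path, "rb") as fh:  # "rb": raw bytes, no newline translation
        return io.BytesIO(fh.read())


def looks_like_pdf(stream, tail_size=1024):
    """Hypothetical sanity check: '%PDF-' at the start, '%%EOF' near the end."""
    stream.seek(0)
    has_header = stream.read(5) == b"%PDF-"

    # Look for the end-of-file marker in the last tail_size bytes.
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    stream.seek(max(0, size - tail_size))
    has_eof_marker = b"%%EOF" in stream.read()

    stream.seek(0)  # rewind so the caller can hand the stream to a parser
    return has_header and has_eof_marker
```

If the check passes, the returned stream can be handed to PyPDF2.PdfFileReader, which accepts any object that supports read and seek. If it fails, the file itself is suspect (truncated download, HTML error page saved as .pdf), and that is worth ruling out before digging into the library. Binary mode alone may not cure every hang, so treat this as a way to narrow the problem down rather than a guaranteed fix.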

answered Jul 13 '18 at 14:21

PythonSherpa

Still hanging up. – bmc Dec 13 '18 at 16:10
