PyPDF4 Error [PdfReadWarning: Superfluous whitespace found in object header]

Question

import PyPDF4

path = 'C:/Users/Gabriel/Desktop/Curso/Teste/pdfs/teste/ABRAHAO.pdf'

# Open in binary mode; strict=False makes the reader warn about spec violations instead of raising
with open(path, 'rb') as pdf:
    reader = PyPDF4.PdfFileReader(pdf, strict=False)
    page = reader.getPage(0)  # first page
    text = page.extractText().strip()

I get this message when reading one PDF file. I tested the same code with another 295 files and they were all read without problems.

I was getting this error before; then I added strict=False, and now it does not return anything. — Espanholzx zx, Jan 24 '23 at 20:04
It's not an error; it is a warning message. It says that the PDF does not follow the specification, but that pypdf can deal with that. — Martin Thoma, Apr 22 '23 at 17:07
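
To illustrate the difference between a warning and an error described in the comment above, here is a minimal sketch that uses only the standard `warnings` module. The `PdfReadWarning` class and the `read_object_header` function are hypothetical stand-ins defined for this example; with the real library, the category is assumed to be importable from `PyPDF4.utils`.

```python
import warnings


class PdfReadWarning(UserWarning):
    """Stand-in for the library's warning category (hypothetical, for illustration)."""


def read_object_header():
    # Hypothetical reader step: it warns about a malformed header
    # but still returns a result instead of raising.
    warnings.warn("Superfluous whitespace found in object header", PdfReadWarning)
    return "header parsed"


# Record warnings instead of printing them, to show that execution continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = read_object_header()

# Silence only this warning category; other categories would still be recorded.
with warnings.catch_warnings(record=True) as caught_after_filter:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", category=PdfReadWarning)
    silenced_result = read_object_header()
```

In both cases the call returns normally; the filter only controls whether the message is reported.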

score 1 · Answer 1 · answered Jan 24 '23 at 20:58

Add the parameter strict=False (note the capital F; Python's boolean is False, not false).
If this doesn't work, you can try another library such as PyPDF2, tabula, or py-pdf-parser.
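
As one possible way to try a different library, the sketch below uses pypdf, the library named in the comment on the question and the successor to PyPDF2. It assumes pypdf is installed; the function name `extract_first_page_text` is made up for this example.

```python
def extract_first_page_text(path):
    """Return the stripped text of the first page of the PDF at `path`."""
    # Imported lazily so the function can be defined even where pypdf is missing.
    from pypdf import PdfReader

    # strict=False asks the reader to tolerate spec violations where it can.
    reader = PdfReader(path, strict=False)
    text = reader.pages[0].extract_text()
    return text.strip()
```

Calling it with the path from the question returns the first page's text. An empty string may mean the page has no extractable text layer (for example, a scanned image), which no reader setting can fix.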

g.newt