I am trying to replace text strings in a PDF file using the Python code below.
import PyPDF2

reader = PyPDF2.PdfFileReader('document.pdf', strict=True, warndest=None, overwriteWarnings=True)
writer = PyPDF2.PdfFileWriter()
replacements = {'old': 'new'}

P = reader.getNumPages()
for p in range(P):
    page = reader.getPage(p)
    contents = page.getContents()
    bdata = contents.getData()
    ddata = bdata.decode('utf-8')  # decoded data (string)
    for key in replacements.keys():
        ddata = ddata.replace(key, replacements[key])
    contents.setData(ddata.encode('utf-8'))  # Error occurs here
    # page.setContents(contents)
    writer.addPage(page)

with open("result.pdf", 'wb') as f:
    writer.write(f)
The problem is that contents.setData raises PdfReadError: Creating EncodedStreamObject is not currently supported.
Can anybody think of a workaround?
P.S. Applying the method described here did create a new PDF file, but without the replacements.
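
For anyone trying to reproduce this, the replacement step can be checked on its own, without PyPDF2. The following is only a minimal sketch: `replace_in_stream` is a hypothetical helper (not a PyPDF2 function), and the two byte strings are made-up content-stream fragments rather than data from document.pdf. It decodes with latin-1 instead of utf-8, on the assumption that a content stream may contain bytes that are not valid UTF-8, and it shows one case where a plain replace finds nothing because the text is split across a TJ array.

```python
def replace_in_stream(data, replacements):
    """Apply plain string replacements to raw content-stream bytes.

    Hypothetical helper for illustration; not part of PyPDF2.
    """
    # latin-1 maps each byte 0-255 to exactly one code point, so the
    # decode/encode round trip is lossless even for bytes that are not
    # valid UTF-8.
    text = data.decode('latin-1')
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text.encode('latin-1')


# Made-up content-stream fragments, not taken from a real PDF.
whole = b"BT /F1 12 Tf (old text) Tj ET \xff"       # text in one Tj operand
split = b"BT /F1 12 Tf [(o) -20 (ld text)] TJ ET"   # same text split in a TJ array

replaced_whole = replace_in_stream(whole, {'old': 'new'})
replaced_split = replace_in_stream(split, {'old': 'new'})

print(replaced_whole)          # the Tj operand now reads "new text"
print(replaced_split == split) # True: 'old' never appears contiguously
```

If the pages in question store their text like the second fragment, the replace step would silently leave the stream unchanged, which would be consistent with getting a new PDF without replacements, though this sketch alone cannot confirm that. Note also that the replacement strings must themselves be encodable as latin-1 for the final `encode` call to succeed.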