The following code tries to edit part of the text in a PDF file:
from PyPDF2 import PdfReader, PdfWriter

replacements = [("Failed", "Passed")]
pdf = PdfReader(open("2.pdf", "rb"))
writer = PdfWriter()
for page in pdf.pages:
    contents = page.get_contents().get_data()
    #print(contents) old contents
    for (a, b) in replacements:
        contents = contents.replace(str.encode(a), str.encode(b))
    #print(contents) new contents which has 'Passed' as new value
    page.get_contents().set_data(str(contents)) #Issue occurs here
    writer.add_page(page)
with open("2_modified.pdf", "wb") as f:
    writer.write(f)
I keep getting the following error:
Traceback (most recent call last):
  File "/pdf_editor.py", line 14, in <module>
    page.get_contents().set_data(str(contents)) #Issue occurs here
  File "/venv/lib/python3.9/site-packages/PyPDF2/generic/_data_structures.py", line 839, in set_data
    raise PdfReadError("Creating EncodedStreamObject is not currently supported")
PyPDF2.errors.PdfReadError: Creating EncodedStreamObject is not currently supported
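To separate the replacement step from the PyPDF2 error, here is a minimal sketch of what the inner loop does to the raw bytes, with no PDF library involved. The helper name `apply_replacements` and the sample bytes are hypothetical (the sample is a made-up content-stream fragment, not taken from 2.pdf). It also shows that calling `str()` on a bytes object yields its `b'...'` repr rather than the text, so the `str(contents)` argument above would probably need to stay as bytes even if `set_data` succeeded.

```python
# Hypothetical helper: performs the same byte-level replacement as the loop
# in the question, without touching any PDF library.
def apply_replacements(data: bytes, replacements) -> bytes:
    for old, new in replacements:
        data = data.replace(old.encode(), new.encode())
    return data


# Made-up content-stream fragment, used purely for illustration.
sample = b"BT /F1 12 Tf (Result: Failed) Tj ET"
updated = apply_replacements(sample, [("Failed", "Passed")])

print(updated)       # the replaced data, still a bytes object
print(str(updated))  # the repr of the bytes, including the b'...' wrapper
```

This only works when the word appears as one contiguous literal in the decoded stream; a PDF may split or encode text differently, in which case a plain byte replace finds nothing.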
I tried the solutions mentioned here, which did not work. I also found this GitHub link, which has the label "bug" but no further updates.
UPDATE:
I had tried the library suggested in the comments earlier, but did not pursue it for two reasons:
- It does not seem to be widely used
- I kept running into one issue or another, the last one being an 'apply_redact_annotations' error
So I would like to know of any other workaround, or any other good library, to achieve this.