Has anyone tried to replace text in a PDF file using fitz from the PyMuPDF library?

I have tried the code below, and I am not sure whether I am close to the result or whether this is impossible with this library:

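import fitz  # PyMuPDF

file_name = 'D:/DOSSIERS/pdf_file.pdf'

with fitz.Document(file_name) as doc:
    for page in doc:
        # get_contents() returns the xrefs of the page's content streams
        for xref in page.get_contents():
            # Byte-level replace on the decompressed stream
            stream = doc.xref_stream(xref).replace(b'mis', b'kjhkj')
            doc.update_stream(xref, stream)
    # Without a save, the modified streams are never written to disk
    doc.save('D:/DOSSIERS/pdf_file_replaced.pdf')  # example output path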
  • Manipulating content streams like that only works for PDFs with a special, simple internal structure. – mkl Mar 17 '21 at 06:34
  • Thanks for your answer! I only have to replace text in a PDF with a simple internal structure. – REDA DRISSI Mar 18 '21 at 17:23
  • Ok, so if you apply your code to your test file, does 'mis' get replaced by 'kjhkj'? If not, the internal structure of the PDF may not be simple enough. – mkl Mar 21 '21 at 15:39
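
To illustrate what "simple enough" could mean here, below is a minimal sketch that runs the same byte-level replace on three made-up content-stream fragments. The fragments and the helper `replace_literal` are hypothetical and only for illustration; they are not taken from any real file or from PyMuPDF. The same word is written once as a plain literal string, once split into pieces by kerning adjustments in a `TJ` array, and once as a hex string. Only the first form contains the bytes `mis`, so only there does the replace find anything.

```python
# Hypothetical content-stream fragments, all drawing the word "promise".
simple_stream = b"BT /F1 12 Tf 72 720 Td (promise) Tj ET"
kerned_stream = b"BT /F1 12 Tf 72 720 Td [(pro) -15 (m) 10 (ise)] TJ ET"
hex_stream = b"BT /F1 12 Tf 72 720 Td <70726f6d697365> Tj ET"


def replace_literal(stream, old, new):
    """Byte-level replace, as in the question; also report the match count."""
    hits = stream.count(old)
    return stream.replace(old, new), hits


if __name__ == "__main__":
    samples = [
        ("literal string", simple_stream),
        ("kerned TJ array", kerned_stream),
        ("hex string", hex_stream),
    ]
    for name, stream in samples:
        _, hits = replace_literal(stream, b"mis", b"kjhkj")
        print(f"{name}: {hits} match(es)")
```

If the replace reports zero matches on a real stream, the text is probably stored in one of the latter forms (or with a custom font encoding), and a plain byte replace will not be enough. Even when the bytes do match, the embedded font may not contain glyphs for the new characters.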
