
I am trying to replace text in PDFs. We have a huge number of PDFs (900k+) that need some text replaced, only on the first page:

Name, address, email and number

I have tried many solutions with Python, and Perl as well, but I ran into various errors: "index not found", "codec can't decode", etc.

The only solution that has worked so far is the asposepdfcloud API, which is cloud-based, so I don't know what is happening behind the scenes. We also cannot process our files through an external API due to privacy concerns.

I was wondering if there is something similar that does not rely on an external service.

I also have a bot built with UiPath that opens the PDF file in Acrobat Pro, does a search and replace, and then saves the document, but this whole process takes at least around 1 minute per file. So it is not a suitable solution for such a huge number of files.

Any help would be appreciated. P.S. It is currently not possible to replace the text at the source and regenerate all the PDFs.

Ahmed Sunny

1 Answer


One possible way to solve this is to find the section you want to replace using a regex, and then use one of the Python PDF libraries (such as "pdfplumber") to replace that section.
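To make the regex part concrete, here is a minimal sketch that only uses the standard library. It assumes the first page's text has already been extracted into a string (for example with pdfplumber's `page.extract_text()`); as far as I know pdfplumber only reads PDFs, so writing the changed text back would need a separate library. The patterns and the function name `replace_contact_details` are my own illustrative choices, and names and addresses usually have no fixed pattern, so only email and phone number are covered here.

```python
import re

# Hypothetical patterns: real documents will likely need tuning.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def replace_contact_details(page_text, new_email, new_phone):
    """Replace every email address and phone number found in page_text."""
    # A function is used as the replacement so that backslashes in the
    # new values are never interpreted as regex escapes.
    text = EMAIL_RE.sub(lambda m: new_email, page_text)
    text = PHONE_RE.sub(lambda m: new_phone, text)
    return text


if __name__ == "__main__":
    sample = "John Doe\n12 Main St\njohn.doe@example.com\n+1 555-123-4567\n"
    print(replace_contact_details(sample, "new@example.org", "+1 555-000-0000"))
```

It is worth running the patterns over the extracted text of a few hundred real files first and checking what they match before replacing anything.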

The errors you are getting can be handled. If it is not a privacy problem, I can take a look at one of these PDFs and provide a more detailed solution.
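Until then, a rough sketch of how such errors could be handled in a batch run: wrap the per-file work in a try/except so that one bad PDF does not stop the remaining files, and write the failures to a list you can inspect later. `process_pdf` is a hypothetical callable standing in for whatever replacement code you end up with, and the `failed.txt` log name is just an assumption.

```python
from pathlib import Path


def process_all(pdf_dir, process_pdf, failed_log="failed.txt"):
    """Run process_pdf on every PDF under pdf_dir; record failures and keep going."""
    ok = 0
    failed = []
    for path in sorted(Path(pdf_dir).rglob("*.pdf")):
        try:
            process_pdf(path)  # hypothetical per-file replacement function
            ok += 1
        except Exception as exc:  # broad on purpose: the batch must not stop
            failed.append((path, repr(exc)))
    # One line per failed file: path, tab, error description.
    Path(failed_log).write_text(
        "".join(f"{p}\t{err}\n" for p, err in failed), encoding="utf-8"
    )
    return ok, failed
```

The failed files can then be looked at separately (or sent through the slower Acrobat bot) without blocking the rest.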