
I am trying to replace text in a PDF file. I have the following code:

PdfReader reader = new PdfReader("test.pdf");

// Fetch the content stream of page 1.
PdfDictionary dict = reader.getPageN(1);
PdfObject object = dict.getDirectObject(PdfName.CONTENTS);

if (object instanceof PRStream)
{
    PRStream stream = (PRStream) object;
    byte[] data = PdfReader.getStreamBytes(stream);
    System.out.println(new String(data));
    // Naive string replacement on the raw content stream.
    stream.setData(new String(data).replace("application", "HELLO WORLD").getBytes());
}
// Write the modified document to a new file.
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("test-output.pdf"));
stamper.close();
reader.close();

When I print out the data (System.out.println(new String(data))), "application" shows up as "ap)-4(plica)-3(tion", which is why the replacement fails. Is there a way around this, or another method that achieves what I am trying to do?

codingDummy
  • PDF describes the graphical content of the page. It is meant to be in "final form". Depending on the generating program, text is not always stored in an easily accessible way. For example, what you see here is kerning information. There is not much you can do about it. – Henry Oct 15 '18 at 03:31
  • There already are many questions on the topic of text replacement here on Stack Overflow. If you search for them, you'll see in the responses that your problem illustrates but one of the issues one might have. – mkl Oct 15 '18 at 04:18
  • Have you tried using Apache PDFBox? (I have used it once, but only to create PDFs.) – zealvault Oct 15 '18 at 04:57
  • Possible duplicate of [How to disable replacement of text characters with their graphic representation while printing PDF file?](https://stackoverflow.com/questions/9033782/how-to-disable-replacement-of-text-characters-with-their-graphic-representation) – JonathanDavidArndt Oct 15 '18 at 11:42
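
To make the kerning point above concrete: in a content stream, text shown with the TJ operator is an array that alternates string pieces with numeric position adjustments, so the word never appears as one contiguous string. The sketch below is plain Java with no PDF library. It assumes the full operand looked like `[(ap)-4(plica)-3(tion)] TJ` (the question only shows a fragment), and the class and method names are made up for illustration. It only handles simple literal strings without escapes, hex strings, or custom font encodings, so it is not a general text extractor.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TjTextDemo {
    // Matches one literal string "(...)" inside a TJ array.
    // Assumption: no escaped or nested parentheses inside the string.
    private static final Pattern LITERAL = Pattern.compile("\\(([^()\\\\]*)\\)");

    /** Concatenates the string pieces of a TJ array, dropping the kerning numbers between them. */
    static String joinTjStrings(String tjArray) {
        StringBuilder text = new StringBuilder();
        Matcher m = LITERAL.matcher(tjArray);
        while (m.find()) {
            text.append(m.group(1));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Hypothetical operand, reconstructed from the fragment in the question.
        String operand = "[(ap)-4(plica)-3(tion)] TJ";
        System.out.println(joinTjStrings(operand)); // prints "application"
    }
}
```

This shows why `String.replace("application", ...)` finds nothing in the raw stream. Even in this simple case, writing replacement text back would still leave the layout, font subset, and encoding issues mentioned in the comments unsolved.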

1 Answer


You will not be able to do this with iText.

Believe me, this is one of the most frustrating discoveries about PDFs: you can build them with iText, but you cannot go back later and replace text with something else, as you have in your example.

There really is not much you can do about it. Once text is there, you can't modify it.


All that notwithstanding, you can usually ADD new content (text, images, etc.) to an existing PDF. So... if you can alter the universe slightly and create a PDF with empty space in the correct size, you can go back later and use the PdfStamper class to "stamp" on another layer of graphical content.

More on this can be found in the iText documentation, and in this fine question:

How to add Content to a PDF using iText PdfStamper

JonathanDavidArndt
  • *"You will not be able to do this with iText."* - not only with iText. Some trivial cases aside, you won't find a satisfying solution in any PDF library. – mkl Oct 15 '18 at 04:20
  • Occasionally, you might also come across business people (some at the director level) who will ask, **"What do you mean you can't modify PDFs? We've been modifying PDFs for a long time!"** And then you discover they've actually been using Microsoft Word documents exported and opened with the free Adobe Reader. *Sigh* – JonathanDavidArndt Oct 30 '18 at 13:06