I'm using PDFBox to create a PDF in which placeholder spots are filled in with the correct information. I have the replacement string in my code, but I cannot get that string into the PDF file that `doc.save()` writes.

Is there a way to do this, or should I approach the problem differently?

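    import java.io.File;
    import java.io.IOException;

    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.text.PDFTextStripper;

    public static void main(String[] args) {
        String fileName = "testPDF.pdf";

        try {
            PDDocument doc = PDDocument.load(new File("sample.pdf"));
            // Extract the document text into a String
            String text = new PDFTextStripper().getText(doc);

            // Replace the placeholder in the extracted text
            String name = text.replace("nameReplace", "Example Ethan");

            // Save the document; the saved file still shows the placeholder
            doc.save(fileName);
            doc.close();
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }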
Fl00der
  • The text stripper merely gives you an independent string representing the text PDFBox could recognize. You cannot change the PDF by manipulating that string (and strictly speaking you don't even manipulate that string but create a new, again independent string...). – mkl Jun 18 '21 at 18:04
  • Then how should I approach this problem, if the text stripper is not the thing I'm looking for? – Fl00der Jun 18 '21 at 20:04
  • First of all, PDF is not designed for replacing text pieces in a search-and-replace manner, see [this answer](https://stackoverflow.com/a/60655298/1729265). Thus, you should reconsider the architecture of your task. Alternatives might be using PDFs with form fields to fill in, or doing search and replace in another, more suitable format and only thereafter exporting to PDF. – mkl Jun 19 '21 at 06:15
  • If you really cannot avoid having to search and replace text in PDFs, then there are two cases: 1. Your PDF is internally built in a very simple way, avoiding all the pitfalls mentioned in the answer referred to above. In this case you can do text replacement in content streams. 2. Otherwise you'll essentially have to apply redaction to the PDF to remove the text to replace and add new text there. – mkl Jun 19 '21 at 06:27
  • In your opinion, should the text stripper be used for a PDF with fillable forms, or is there another way to do it? I checked the API documentation a few times, but I couldn't find anything related to my issue. – Fl00der Jun 20 '21 at 11:50
  • The text stripper is for text extraction. If you extend it a bit, it can also extract text with position or some style information. It is not for changing text. For form field fill-ins, look for `PDAcroForm` use in the API and the examples. – mkl Jun 20 '21 at 13:48
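
Two points from the comments above can be illustrated without PDFBox at all: `String.replace` returns a new string and leaves its source untouched, and search and replace is easy when done on a plain-text template before any PDF is generated. The following is only a minimal sketch under assumptions. The class `TemplateFill`, the method `fill`, and the template text are hypothetical names made up for illustration, and the later step of exporting the filled text to PDF is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of PDFBox.
class TemplateFill {

    // Returns a new string with each placeholder replaced by its value.
    // Java strings are immutable, so the template passed in is not modified.
    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed plain-text template that uses the placeholder from the question.
        String template = "Name: nameReplace";

        Map<String, String> values = new LinkedHashMap<>();
        values.put("nameReplace", "Example Ethan");

        String filled = fill(template, values);

        System.out.println(filled);                           // prints "Name: Example Ethan"
        System.out.println(template.contains("nameReplace")); // prints "true"
    }
}
```

The second output line shows why the code in the question has no effect on the saved file: the replacement only exists in the new string, while the text it was derived from (and, in the question, the loaded `PDDocument`) stays as it was.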
