29

Is there a way I can edit a PDF from Java?
I have a PDF document containing text placeholders that I need to replace using Java, but all the libraries I have seen create PDFs from scratch and offer only limited editing functionality.
Is there any way I can edit a PDF, or is this impossible?

Igor G.
Ammar

5 Answers

14

You can do it with iText. I tested it with the following code, which adds a chunk of text and a red circle on top of each page of an existing PDF.

/* requires itextpdf-5.1.2.jar or similar */
import java.io.*;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.*;

public class AddContentToPDF {

    public static void main(String[] args) throws IOException, DocumentException {

        /* example inspired from "iText in action" (2006), chapter 2 */

        PdfReader reader = new PdfReader("C:/temp/Bubi.pdf"); // input PDF
        PdfStamper stamper = new PdfStamper(reader,
          new FileOutputStream("C:/temp/Bubi_modified.pdf")); // output PDF
        BaseFont bf = BaseFont.createFont(
                BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED); // set font

        //loop on pages (1-based)
        for (int i=1; i<=reader.getNumberOfPages(); i++){

            // get object for writing over the existing content;
            // you can also use getUnderContent for writing in the bottom layer
            PdfContentByte over = stamper.getOverContent(i);

            // write text
            over.beginText();
            over.setFontAndSize(bf, 10);    // set font and size
            over.setTextMatrix(107, 740);   // set x,y position (0,0 is at the bottom left)
            over.showText("I can write at page " + i);  // set text
            over.endText();

            // draw a red circle
            over.setRGBColorStroke(0xFF, 0x00, 0x00);
            over.setLineWidth(5f);
            over.ellipse(250, 450, 350, 550);
            over.stroke();
        }

        stamper.close();
        reader.close();

    }
}
bluish
  • 11
    This example overlays the page with your new text and the ellipse. Is there a way to modify the text in place? Is there a way to search for, say, a token and *replace* it with my text rather than overlay it? – Vihung Jun 17 '15 at 17:19
  • error....... Fatal signal 7 (SIGBUS), code 2, fault addr 0xa290903f in tid 25590 (om.pdfgenerator) – Sunil Chaudhary Oct 06 '15 at 10:55
  • I have failed editing text in the existing pdf. Extracted text is a bunch of values similar to: "(>) Tj". Tutorial link: https://developers.itextpdf.com/examples/stamping-content-existing-pdfs-itext5/replacing-pdf-objects – Igor G. Jan 26 '18 at 14:48
  • 1
    @IgorG. concerning that "Tutorial link": You surely have seen the link at the top of the JavaDoc there. It points to [this SO answer](https://stackoverflow.com/a/21622539/1729265) in which Bruno in particular states that *if your PDFs are relatively simple* you can use that code but that *in real life, PDFs are never that simple*... That you have *failed editing text in the existing pdf*, therefore, is only to be expected! If you only want to edit very specific documents, create an own question from that and supply examples. Don't expect a generic solution, though! – mkl Jan 26 '18 at 15:58
  • Hi, is there any way to do this in PHP? – Aramis Rodríguez Blanco Nov 15 '19 at 09:55
  • @AramisRodríguezBlanco There are already some questions about it: https://stackoverflow.com/search?q=Editing+PDF+text+%5Bphp%5D ;) – bluish Nov 15 '19 at 13:36
3

I modified the code I found a bit, and the following worked for me:

/* requires iText 5 (itextpdf) on the classpath */
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.*;

public class Principal {
    public static final String SRC = "C:/tmp/244558.pdf";
    public static final String DEST = "C:/tmp/244558-2.pdf";

    public static void main(String[] args) throws IOException, DocumentException {
        File file = new File(DEST);
        file.getParentFile().mkdirs(); // make sure the output directory exists
        new Principal().manipulatePdf(SRC, DEST);
    }

    public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
        PdfReader reader = new PdfReader(src);
        PdfDictionary dict = reader.getPageN(1); // only the first page is processed

        // /Contents is either an array of streams or a single stream;
        // getAsArray resolves indirect references and returns null if it is not an array
        PdfArray refs = dict.getAsArray(PdfName.CONTENTS);
        if (refs == null) {
            refs = new PdfArray();
            PdfObject contents = dict.get(PdfName.CONTENTS);
            if (contents != null) {
                refs.add(contents);
            }
        }

        for (int i = 0; i < refs.size(); i++) {
            PRStream stream = (PRStream) refs.getDirectObject(i);
            byte[] data = PdfReader.getStreamBytes(stream);
            // ISO-8859-1 maps each byte to one char, so untouched bytes survive the round trip
            String content = new String(data, StandardCharsets.ISO_8859_1);
            stream.setData(content.replace("NULA", "Nulo").getBytes(StandardCharsets.ISO_8859_1));
        }

        PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
        stamper.close();
        reader.close();
    }
}
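
A side note on the decode/replace/encode step: content stream bytes are not guaranteed to be valid text in the platform's default charset. The sketch below uses only the JDK (no iText) and a made-up byte sequence to show why a single-byte charset such as ISO-8859-1 is the safer choice for that round trip. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical demo class; the byte sequence below is invented for illustration. */
class CharsetRoundTripDemo {

    /** Returns true if decoding and re-encoding with the charset gives back the same bytes. */
    static boolean survivesRoundTrip(byte[] data, Charset charset) {
        byte[] roundTripped = new String(data, charset).getBytes(charset);
        return Arrays.equals(data, roundTripped);
    }

    public static void main(String[] args) {
        // A text-showing operand containing two non-ASCII bytes, as a PDF font encoding might use
        byte[] data = {'(', (byte) 0xC9, (byte) 0xFF, ')', ' ', 'T', 'j'};

        // ISO-8859-1 maps every byte value to exactly one char and back
        System.out.println(survivesRoundTrip(data, StandardCharsets.ISO_8859_1)); // prints true

        // These bytes are not valid UTF-8, so decoding substitutes replacement characters
        System.out.println(survivesRoundTrip(data, StandardCharsets.UTF_8)); // prints false
    }
}
```

On a system whose default charset is UTF-8, a charset-less `new String(bytes)` followed by `getBytes()` behaves like the second case and can silently corrupt bytes you never meant to touch.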

  • That kind of text replacement works only under very specific circumstances (fonts in the pdf must use an ASCII-like encoding and must not be embedded as incomplete subsets only; furthermore the generator must not have applied kerning or a similar technique that splits text lines into separate chunks). And you also only change contents of page content streams. – mkl Oct 12 '18 at 04:58
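
To make the kerning caveat concrete, here is a minimal JDK-only sketch. The two content stream snippets are invented for illustration (they are not taken from any real file), and the class and method names are hypothetical. Both snippets would render the same visible text, but only the first contains the token as one contiguous string.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical demo class; the content stream snippets are made up for illustration. */
class NaiveReplaceDemo {

    /** Mimics the naive approach: decode the stream, replace the token, re-encode. */
    static byte[] replaceInStream(byte[] stream, String token, String value) {
        String text = new String(stream, StandardCharsets.ISO_8859_1);
        return text.replace(token, value).getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Returns true if the naive replacement altered the stream at all. */
    static boolean changed(String content, String token, String value) {
        byte[] before = content.getBytes(StandardCharsets.ISO_8859_1);
        return !Arrays.equals(before, replaceInStream(before, token, value));
    }

    public static void main(String[] args) {
        // The whole token sits in one string operand of a Tj operator
        String simple = "BT /F1 12 Tf 72 700 Td (Status: NULA) Tj ET";

        // Same visible text, but the generator kerned it: the token is split across a TJ array
        String kerned = "BT /F1 12 Tf 72 700 Td [(Status: NU) -15 (LA)] TJ ET";

        System.out.println(changed(simple, "NULA", "Nulo")); // prints true
        System.out.println(changed(kerned, "NULA", "Nulo")); // prints false
    }
}
```

In the kerned case the replacement silently does nothing, which is one reason this technique only suits documents whose structure you control. Fonts with custom encodings or incomplete embedded subsets fail in a similar way, because the bytes in the stream are then not the characters you are searching for.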
2

Take a look at iText and this sample code

1

Take a look at aspose and this sample code

Ravi Kant
  • 4
    both links are broken – Edoardo Jun 17 '17 at 06:39
  • Archive of the first link: http://web.archive.org/web/20130625054957/http://www.aspose.com/docs/display/pdfkitjava/how+to+fill+form+fields+with+api And the second one: http://web.archive.org/web/20130330113945/http://www.aspose.com/docs/display/pdfkitjava/Manipulate+text+and+images+in+an+existing+PDF+File I also found this: https://products.aspose.com/pdf/java – Fabian Röling Oct 14 '19 at 19:33
0

I've done this using LibreOffice Draw.

You start by manually opening a pdf in Draw, checking that it renders OK, and saving it as a Draw .odg file.

That's a zip archive of XML files, so you can modify it in code to find and replace the placeholders.

Next (from code) you use a command line call to Draw to generate the pdf.

Success!
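
The code steps above could be sketched roughly as follows. This is not the author's code. It is a minimal sketch that assumes the placeholder text lives in the package's `content.xml` as one contiguous string (Draw may split text across several XML elements, in which case a plain replace will miss it) and that the `soffice` executable is on the PATH. The class name, method names, and the XML escaping helper are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper; assumes placeholders appear verbatim in content.xml. */
class OdgPlaceholderReplacer {

    /** Copies an .odg package entry by entry, replacing the placeholder inside content.xml. */
    static void replaceInOdg(InputStream in, OutputStream out, String placeholder, String value)
            throws IOException {
        try (ZipInputStream zin = new ZipInputStream(in);
             ZipOutputStream zout = new ZipOutputStream(out)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                byte[] data = zin.readAllBytes(); // reads only the current entry (Java 9+)
                if (entry.getName().equals("content.xml")) {
                    String xml = new String(data, StandardCharsets.UTF_8);
                    data = xml.replace(placeholder, escapeXml(value)).getBytes(StandardCharsets.UTF_8);
                }
                ZipEntry copy = new ZipEntry(entry.getName());
                if (entry.getName().equals("mimetype")) {
                    // OpenDocument packages expect the mimetype entry to be stored uncompressed
                    CRC32 crc = new CRC32();
                    crc.update(data);
                    copy.setMethod(ZipEntry.STORED);
                    copy.setSize(data.length);
                    copy.setCrc(crc.getValue());
                }
                zout.putNextEntry(copy);
                zout.write(data);
                zout.closeEntry();
            }
        }
    }

    /** Escapes the characters that would otherwise break the XML text content. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Runs LibreOffice headless to convert the .odg to PDF; returns the process exit code. */
    static int convertToPdf(File odg, File outDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                "soffice", "--headless", "--convert-to", "pdf",
                "--outdir", outDir.getPath(), odg.getPath())
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

A typical call would open the template with a `FileInputStream`, write the result to a new `.odg` through a `FileOutputStream`, and then pass that file to `convertToPdf`. Entries are copied in their original order, so `mimetype` stays first in the package.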

The main issue is that Draw doesn't handle fonts embedded in a PDF. If the font isn't also installed on your system, the text will render oddly, because Draw substitutes a standard font that inevitably has different sizing.

If this approach is of interest, I'll put together some shareable code.

Daniel Winterstein