How to retrieve the image of a PdfStampAnnotation

Question

I created a pdf using the following example: https://developers.itextpdf.com/examples/actions-and-annotations/clone-creating-and-adding-annotations#2260-addstamp.java

@Category(SampleTest.class)
public class AddStamp extends GenericTest {
    public static final String DEST = "./target/test/resources/sandbox/annotations/add_stamp.pdf";
    public static final String IMG = "./src/test/resources/img/itext.png";
    public static final String SRC = "./src/test/resources/pdfs/hello.pdf";

    public static void main(String[] args) throws Exception {
        File file = new File(DEST);
        file.getParentFile().mkdirs();
        new AddStamp().manipulatePdf(DEST);
    }

    @Override
    protected void manipulatePdf(String dest) throws Exception {
        PdfDocument pdfDoc = new PdfDocument(new PdfReader(SRC), new PdfWriter(dest));

        ImageData img = ImageDataFactory.create(IMG);
        float w = img.getWidth();
        float h = img.getHeight();
        Rectangle location = new Rectangle(36, 770 - h, w, h);
        PdfStampAnnotation stamp = new PdfStampAnnotation(location)
            .setStampName(new PdfName("ITEXT"));
        PdfFormXObject xObj = new PdfFormXObject(new Rectangle(w, h));
        PdfCanvas canvas = new PdfCanvas(xObj, pdfDoc);
        canvas.addImage(img, 0, 0, false);
        stamp.setNormalAppearance(xObj.getPdfObject());
        stamp.setFlags(PdfAnnotation.PRINT);

        pdfDoc.getFirstPage().addAnnotation(stamp);
        pdfDoc.close();
    }
}

The PDF is created properly and contains the stamp annotation.

I can get the annotation using:

...
PdfStampAnnotation s = (PdfStampAnnotation) pdfDoc.getFirstPage().getAnnotations().get(0);
s.?????

How can I get back the image (itext.png) of the stamp, e.g. as a byte[]? I'm really new to iText, and after hours of research I'm stuck at this point.

score 0 · Accepted Answer · answered Jul 01 '18 at 14:54

First of all, you won't get the original image back. PDF supports only a few bitmap image formats as they are: JPEG, JPEG2000, and certain fax formats, but definitely not PNG. PNGs are converted into the PDF-internal bitmap format and, upon extraction, can at best be converted back to a PNG.

Furthermore, there is no simple getImage method in the PdfStampAnnotation class because the appearance of a stamp may be constructed like the contents of a regular page: it may contain text, vector graphics, bitmap images, or an arbitrary mixture of those elements. Thus, all you can retrieve from an annotation is its appearance.

If you are sure an annotation contains only an image (or you at least are not interested in anything but the image), you can extract that image using the iText parser framework, e.g. like this:

Map<byte[], String> extractAnnotationImages(PdfStream xObject) {
    final Map<byte[], String> result = new HashMap<>();
    IEventListener renderListener = new IEventListener() {
        @Override
        public Set<EventType> getSupportedEvents() {
            return Collections.singleton(EventType.RENDER_IMAGE);
        }

        @Override
        public void eventOccurred(IEventData data, EventType type) {
            if (data instanceof ImageRenderInfo) {
                ImageRenderInfo imageRenderInfo = (ImageRenderInfo) data;
                byte[] bytes = imageRenderInfo.getImage().getImageBytes();
                String extension = imageRenderInfo.getImage().identifyImageFileExtension();
                result.put(bytes, extension);
            }
        }
    };

    PdfCanvasProcessor processor = new PdfCanvasProcessor(renderListener, Collections.emptyMap());
    processor.processContent(xObject.getBytes(), new PdfResources(xObject.getAsDictionary(PdfName.Resources)));

    return result;
}

(ExtractAnnotationImage method)

This method returns a map from each image's byte array to the file extension to use for it.

I used it in this helper method:

void saveAnnotationImages(PdfDocument pdfDocument, String prefix) throws IOException {
    for (int pageNumber = 1; pageNumber <= pdfDocument.getNumberOfPages(); pageNumber++) {
        PdfPage page = pdfDocument.getPage(pageNumber);
        int index = 0;
        for (PdfAnnotation annotation : page.getAnnotations()) {
            PdfDictionary normal = annotation.getAppearanceObject(PdfName.N);
            if (normal instanceof PdfStream) {
                Map<byte[], String> images = extractAnnotationImages((PdfStream)normal);
                for (Map.Entry<byte[], String> entry : images.entrySet()) {
                    Files.write(new File(String.format("%s-%s-%s.%s", prefix, pageNumber, index++, entry.getValue())).toPath(), entry.getKey());
                }
            }
        }
    }
}

(ExtractAnnotationImage helper method)

With it, I extracted all images from the annotations in the output of the iText AddStamp example you referenced, and got one image.

By the way, you'll notice that the transparency is missing from the extracted image. Transparency in PDF is modeled via a second image, a mask image, which effectively acts like an alpha channel. You can retrieve this mask from the ImageRenderInfo.getImage() object.
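
If you want the transparency back, one option is to recombine the extracted image with its mask after extraction. The following is only a minimal sketch using plain java.awt classes, not iText API: the class name MaskCombiner and the method applyMask are hypothetical, and it assumes you have already decoded both the image and the mask into BufferedImage objects (for instance with ImageIO.read on the extracted bytes), that both have the same dimensions, and that the mask holds 8-bit samples where 0 means fully transparent and 255 fully opaque.

```java
import java.awt.image.BufferedImage;

public class MaskCombiner {

    /**
     * Merges an opaque image and a mask of the same size into one ARGB image.
     * Assumption: band 0 of the mask holds 8-bit samples,
     * 0 = fully transparent, 255 = fully opaque.
     */
    public static BufferedImage applyMask(BufferedImage base, BufferedImage mask) {
        int w = base.getWidth();
        int h = base.getHeight();
        if (mask.getWidth() != w || mask.getHeight() != h) {
            throw new IllegalArgumentException("Image and mask must have the same dimensions");
        }
        BufferedImage result = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Keep only the color bits of the base pixel.
                int rgb = base.getRGB(x, y) & 0x00FFFFFF;
                // Read the raw mask sample and use it as the alpha value.
                int alpha = mask.getRaster().getSample(x, y, 0) & 0xFF;
                result.setRGB(x, y, (alpha << 24) | rgb);
            }
        }
        return result;
    }
}
```

The result can then be written out with ImageIO.write(result, "png", file). If the mask in your PDF has a different size or bit depth than the image, it would need to be scaled or converted first, which this sketch does not do.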
