Is it possible to extract the visible signature (image) of a signed PDF with the OSS library PDFBox?

Workflow:

  1. list all signatures of a file
  2. show which signatures include a visible signature
  3. show which are valid
  4. extract images of signatures (need to extract correct image for each signature)

Something in OOP style like the following would be awesome:

PDFSignatures [] sigs = document.getPDFSignatures()
sigs[0].getCN()
...
(Buffered)Image visibleSig = sigs[0].getVisibleSignature()

I found the class PDSignature and how to sign a PDF, but no way to extract a visible signature as an image.

Jonathan Barbero
ctvoigt
  • Yes, it is possible, but no, it's not as easy as merely calling one method, but it's no magic either. Just study what happens in `PDPage.convertToImage` and `PageDrawer.drawPage`. In the latter method you see how the appearances of the page annotations are drawn after the page content. Essentially you'll have to find the annotations of signature fields and draw *only* them on a canvas of *their* respective size. – mkl Jun 06 '13 at 10:41
  • Two problems with that: 1. transparency is not respected (so I'll get the visible capture plus the document at this position); 2. two overlapping signatures are not recoverable. Matching by the position of the added comment is not really clever. – ctvoigt Jun 06 '13 at 11:01
  • That's why you should look at those methods and (as I unfortunately did not spell out clearly) **take them as inspiration for your own code that renders only the signature annotations, each of them separately.** – mkl Jun 06 '13 at 12:52
  • Thanks for your reply, but it is complicated to implement. While evaluating an implementation I did not find a way to get the original size (without DPI) of the embedded image. I don't know if it is really needed, but my goal is to extract the image in its original size, without any scaling. – ctvoigt Jun 17 '13 at 11:57
  • *to extract the image in original size* --- what do you mean by that? PDF by nature is not a rasterized format (even though it may contain rasterized images) but you want to render it to a rasterized format. Thus, **you** decide which resolution to use which implies a choice of size. – mkl Jun 17 '13 at 12:42

1 Answer

As no one else came up with an answer, I tried out the proposal from my comments to your question myself. A first result:

import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdfviewer.PageDrawer;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.PDGraphicsState;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAppearanceDictionary;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAppearanceStream;

public class AnnotationDrawer extends PageDrawer
{
    public AnnotationDrawer(int imageType, int resolution) throws IOException
    {
        super();
        this.imageType = imageType;
        this.resolution = resolution;
    }

    public Map<String, BufferedImage> convertToImages(PDPage p) throws IOException
    {
        page = p;
        final Map<String, BufferedImage> result = new HashMap<String, BufferedImage>();

        List<PDAnnotation> annotations = page.getAnnotations();
        for (PDAnnotation annotation: annotations)
        {
            String appearanceName = annotation.getAppearanceStream();
            PDAppearanceDictionary appearDictionary = annotation.getAppearance();
            if( appearDictionary != null )
            {
                if( appearanceName == null )
                {
                    appearanceName = "default";
                }
                Map<String, PDAppearanceStream> appearanceMap = appearDictionary.getNormalAppearance();
                if (appearanceMap != null) 
                {
                    PDAppearanceStream appearance = 
                        (PDAppearanceStream)appearanceMap.get( appearanceName ); 
                    if( appearance != null ) 
                    {
                        BufferedImage image = initializeGraphics(annotation);
                        setTextMatrix(null);
                        setTextLineMatrix(null);
                        getGraphicsStack().clear();
                        processSubStream( page, appearance.getResources(), appearance.getStream() );

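                        // Key the image by the annotation name, then the field name (T), then a hash code as a last resort.
                        String name = annotation.getAnnotationName();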
                        if (name == null || name.length() == 0)
                        {
                            name = annotation.getDictionary().getString(COSName.T);
                            if (name == null || name.length() == 0)
                            {
                                name = Long.toHexString(annotation.hashCode());
                            }
                        }

                        result.put(name, image);
                    }
                }
            }
        }

        return result;
    }

    BufferedImage initializeGraphics(PDAnnotation annotation)
    {
        PDRectangle rect = annotation.getRectangle();
        float widthPt = rect.getWidth();
        float heightPt = rect.getHeight();
        float scaling = resolution / (float)DEFAULT_USER_SPACE_UNIT_DPI;
        int widthPx = Math.round(widthPt * scaling);
        int heightPx = Math.round(heightPt * scaling);
        //TODO The following reduces accuracy. It should really be a Dimension2D.Float.
        Dimension pageDimension = new Dimension( (int)widthPt, (int)heightPt );
        BufferedImage retval = new BufferedImage( widthPx, heightPx, imageType );
        Graphics2D graphics = (Graphics2D)retval.getGraphics();
        graphics.setBackground( TRANSPARENT_WHITE );
        graphics.clearRect( 0, 0, retval.getWidth(), retval.getHeight() );
        graphics.scale( scaling, scaling );
        setGraphics(graphics);
        pageSize = pageDimension;
        graphics.setRenderingHint( RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON );
        graphics.setRenderingHint( RenderingHints.KEY_FRACTIONALMETRICS, RenderingHints.VALUE_FRACTIONALMETRICS_ON );
        setGraphicsState(new PDGraphicsState(new PDRectangle(widthPt, heightPt)));

        return retval;
    }

    void setGraphics(Graphics2D graphics)
    {
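        try {
            // PageDrawer keeps its Graphics2D in a private field, so it is set via reflection.
            Field field = PageDrawer.class.getDeclaredField("graphics");
            field.setAccessible(true);
            field.set(this, graphics);
        } catch (Exception e) {
            // Without the graphics field nothing can be drawn, so fail loudly instead of continuing.
            throw new IllegalStateException("Cannot set PageDrawer.graphics", e);
        }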
    }

    private static final int DEFAULT_USER_SPACE_UNIT_DPI = 72;
    private static final Color TRANSPARENT_WHITE = new Color( 255, 255, 255, 0 );

    private int imageType;
    private int resolution;
}

To render the annotations of a given PDPage page, you merely call:

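// 8 is BufferedImage.TYPE_USHORT_565_RGB, which has no alpha channel; 288 is the resolution in DPI.
// An image type with alpha, such as BufferedImage.TYPE_INT_ARGB, is needed to keep the transparent background.
AnnotationDrawer drawer = new AnnotationDrawer(BufferedImage.TYPE_USHORT_565_RGB, 288);
Map<String, BufferedImage> images = drawer.convertToImages(page);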

The constructor arguments correspond to those of PDPage.convertToImage(int imageType, int resolution).
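
As a rough illustration of how the resolution argument determines the size of each resulting image, here is the same scaling that initializeGraphics applies, isolated into a tiny helper. The class and method names are made up for this example and are not part of PDFBox:

```java
public class PixelSizeSketch {
    // PDF user space defaults to 72 units per inch.
    static final int DEFAULT_USER_SPACE_UNIT_DPI = 72;

    /** Converts a length in PDF points to pixels at the given resolution (DPI). */
    static int toPixels(float points, int resolution) {
        float scaling = resolution / (float) DEFAULT_USER_SPACE_UNIT_DPI;
        return Math.round(points * scaling);
    }

    public static void main(String[] args) {
        // A signature rectangle of 200 x 50 pt rendered at 288 DPI
        int widthPx = toPixels(200f, 288);
        int heightPx = toPixels(50f, 288);
        System.out.println(widthPx + " x " + heightPx);
    }
}
```

So there is no single "original size" of the rendered annotation: at 72 DPI one point maps to one pixel, and any other resolution scales the rectangle accordingly.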

Beware, this has

a. been hacked together based on PDFBox 1.8.2; it may contain version-specific code;
b. merely been checked for some visible signature annotations I have here; it may be incomplete, and it may especially fail for other annotation types.
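
To inspect the results, the returned map can be written to disk. A minimal sketch using only javax.imageio; the class name AnnotationImageWriter and the file naming scheme are invented for illustration and are not part of PDFBox:

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Map;
import javax.imageio.ImageIO;

public class AnnotationImageWriter {
    /**
     * Writes every image of the map as "<name>.png" into targetDir
     * and returns the number of files written.
     */
    public static int writeAll(Map<String, BufferedImage> images, File targetDir) throws IOException {
        if (!targetDir.isDirectory() && !targetDir.mkdirs()) {
            throw new IOException("Cannot create directory " + targetDir);
        }
        int written = 0;
        for (Map.Entry<String, BufferedImage> entry : images.entrySet()) {
            // Annotation or field names may contain characters that are not valid in file names.
            String safeName = entry.getKey().replaceAll("[^A-Za-z0-9._-]", "_");
            File target = new File(targetDir, safeName + ".png");
            // ImageIO.write returns false if no writer can handle the image.
            if (!ImageIO.write(entry.getValue(), "png", target)) {
                throw new IOException("No PNG writer available for " + entry.getKey());
            }
            written++;
        }
        return written;
    }
}
```

Note that two names which differ only in replaced characters would end up in the same file, so a real implementation might want to add a counter.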

mkl