I'm trying to extract the signature image from a PDF signed with Adobe Sign. Not sure how Adobe adds this image.

I have tested in Java with iText and PDFBox. Normally, when you traverse the PDF structure with a tool such as iText RUPS, you can find images like the visible signature and even download them directly in the tool. I didn't find one in this case. I did, however, find the XObject that I think holds the signature image.

The stream looks like this:

q
    q
        0.07843 0.45098 0.90196 RG
        0.07843 0.45098 0.90196 rg
        1 w
        q
            BT
                1 0 0 1 0 1.78 Tm
                /F1 6 Tf
                0.07843 0.45098 0.90196 rg
                ({0009002000270027002a00010018002a002d0027001f00010497000200300022000104400443047600010441043f044104420001044004410477043f044300010008000e001506d404410498}) Tj
                0 g
            ET
        Q
        0 8.28 m
            103.55 8.28 l
        S
    Q
    0 0 0 RG
    0 0 0 rg
    q
        1 0 0 1 0 0 cm
        /Xf2 Do
    Q
Q

It references Xf2, which looks like this:

1 J
1 j
1.000 w
30.315 10.580 m
    30.315 10.580 30.315 10.580 30.315 10.587 c
S
1.000 w
30.315 10.587 m
    30.315 10.594 30.315 10.608 30.315 10.627 c
S
1.000 w
30.315 10.627 m
    30.315 10.646 30.315 10.670 30.315 10.683 c
S
...

Can this be extracted as an image?

krillov
  • When you say "image", you appear to think only of *bitmap images*. But there also are other kinds of images, in particular *vector graphics images* which are very well suited to show signatures. What you found in that file are vector graphics instructions laying out a path and stroking it. – mkl Aug 29 '23 at 22:54
  • @mkl Yes, I came to the same conclusion. I understood they are vector graphics. Is it possible to extract and convert them to bitmap format? – krillov Aug 30 '23 at 06:26
  • @KJ I would like to do all this in Java with PDFBox or iText. – krillov Aug 30 '23 at 06:29
  • Does the `AnnotationDrawer` for pdfbox from [this old answer](https://stackoverflow.com/a/17145649/1729265) help? It was implemented against pdfbox 1.8.2 but the concept should still be usable with the current pdfbox 3. – mkl Aug 30 '23 at 06:59
  • @mkl I have tried that but that code is really outdated. I can maybe test it again. I missed that they have released 3.0. – krillov Aug 30 '23 at 07:02
  • @mkl I have tested with PDFBox version 1.82 just to see if it works at all but unfortunately no. `PDAppearanceDictionary appearDictionary = annotation.getAppearance();` is null. – krillov Aug 30 '23 at 10:02
  • *"I have tested with PDFBox version 1.82 just to see if it works at all but unfortunately no."* - Well, apparently the signature appearances you are after are not annotation appearances but generic XObjects. Thus, you'll have to change the code to work on the XObject resources of the page (maybe even recursively) instead of on annotation appearances. As you don't share an example file, this is a bit of guesswork for me... – mkl Aug 30 '23 at 13:56
  • @mkl I have uploaded a test file here: https://file.io/5UQcJejlOWzQ – krillov Aug 30 '23 at 15:32
  • Ok, nested XObjects, so recursing through page XObject Resources is necessary. I'll take a look later this week. – mkl Aug 30 '23 at 17:02
  • Thanks! They are nested but they always have the same id Xi3 and Xf2 so a really ugly solution is like: `COSDictionary t = (COSDictionary) document.getDocument().getObjectsByType(COSName.ANNOT).get(0).getDictionaryObject(COSName.P); COSDictionary resources = (COSDictionary) t.getItem(COSName.RESOURCES); COSDictionary xObject = (COSDictionary) resources.getItem(COSName.XOBJECT); COSObject xObjectXi3 = (COSObject) xObject.getItem("Xi3"); COSDictionary resourcesXi3 = (COSDictionary) xObjectXi3.getItem(COSName.RESOURCES);...` But yes this would stop working if they change their implementation. – krillov Aug 30 '23 at 17:18
  • *"But yes this would stop working if they change their implementation"* - These scribbles are not marked in a distinct, standardized manner. Thus, no matter how you extract them, it will always be possible that Adobe changes the implementation in a way that breaks your extraction. (Unless you train some AI to recognize where the signature scribbles are and render exactly those regions. But then the AI may sometimes err...) – mkl Sep 01 '23 at 05:51
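
To illustrate the "convert them to bitmap" part of the discussion above, here is a minimal sketch that uses only the Java standard library. It assumes the content stream of the scribble XObject (such as Xf2) has already been obtained as decoded text, and it only understands the `m`, `l`, `c`, `w` and `S` operators seen in that stream; everything else is ignored. The class name `PathStreamRasterizer`, the `render` method and its `width`, `height` and `scale` parameters are made up for this example (in practice the size would come from the XObject's bounding box), and round caps and joins are hard-coded because the sample stream sets `1 J` and `1 j`.

```java
import java.awt.BasicStroke;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.AffineTransform;
import java.awt.geom.Path2D;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox or iText.
final class PathStreamRasterizer {

    // PDF numbers: optional sign, digits with optional fraction, no exponent.
    private static final String NUMBER = "[+-]?(\\d+\\.?\\d*|\\.\\d+)";

    /** Strokes the m/l/c paths of a decoded content stream onto a white bitmap. */
    static BufferedImage render(String stream, double width, double height, double scale) {
        int w = (int) Math.ceil(width * scale);
        int h = (int) Math.ceil(height * scale);
        BufferedImage image = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, w, h);
        // PDF user space has its origin at the bottom left, Java2D at the top left,
        // so scale and flip the y axis.
        g.transform(new AffineTransform(scale, 0, 0, -scale, 0, h));
        g.setColor(Color.BLACK);

        List<Double> ops = new ArrayList<>();      // operands seen since the last operator
        Path2D.Double path = new Path2D.Double();  // path currently under construction
        float lineWidth = 1f;
        for (String token : stream.trim().split("\\s+")) {
            if (token.matches(NUMBER)) {
                ops.add(Double.parseDouble(token));
                continue;
            }
            boolean started = path.getCurrentPoint() != null;
            if (token.equals("m") && ops.size() == 2) {
                path.moveTo(ops.get(0), ops.get(1));
            } else if (token.equals("l") && ops.size() == 2 && started) {
                path.lineTo(ops.get(0), ops.get(1));
            } else if (token.equals("c") && ops.size() == 6 && started) {
                path.curveTo(ops.get(0), ops.get(1), ops.get(2), ops.get(3), ops.get(4), ops.get(5));
            } else if (token.equals("w") && ops.size() == 1) {
                lineWidth = ops.get(0).floatValue();
            } else if (token.equals("S")) {
                // Assumption: round caps and joins, matching "1 J" and "1 j" in the sample.
                g.setStroke(new BasicStroke(lineWidth, BasicStroke.CAP_ROUND, BasicStroke.JOIN_ROUND));
                g.draw(path);
                path.reset();
            }
            ops.clear(); // any other operator just discards its operands
        }
        g.dispose();
        return image;
    }
}
```

The returned `BufferedImage` can then be saved with `javax.imageio.ImageIO.write(image, "png", new File("signature.png"))`. Colours, transformation matrices (`cm`), text and nested `Do` calls are deliberately left out, so for anything beyond a plain stroked scribble a real PDF renderer is the safer route.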

1 Answer

Your question was: can a signature squiggle (a vector graphic) be extracted as an image?

There are more unknowns than answers in such a question, for example where the extraction needs to end up and which format it should be converted into.

Here on the left we can see the multiple components of a signature annotation. Once we know which nested object is the artwork (here number 14 0 obj), we can copy the stream data into another PDF, as seen on the right.

[Screenshot: components of the signature annotation (left) and the artwork stream copied into another PDF (right)]

Since it is a screen graphic, we can copy and paste it into any graphics application and save it in any image format you wish. However, to maintain the simplicity of the source, it is best copied into an SVG editor such as Inkscape.

Or, simpler yet, since it is now an isolated PDF object, simply convert it to SVG for forensic use in a web page, or on the boss's debit account, etc.: `mutool convert -o output.svg input.pdf`
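
Since the question asks for Java, the same idea can be roughed out without an external tool. The sketch below is not what mutool does internally; it is a naive translation of the `m`, `l`, `c`, `w` and `S` operators from a decoded content stream into SVG path data, using only the standard library. The class `PathStreamToSvg`, the method `toSvg` and its `width` and `height` parameters are invented for this example, and the black stroke with round caps and joins is an assumption based on the sample stream in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any PDF library.
final class PathStreamToSvg {

    // PDF numbers: optional sign, digits with optional fraction, no exponent.
    private static final String NUMBER = "[+-]?(\\d+\\.?\\d*|\\.\\d+)";

    /** Emits one SVG <path> element per stroked (S) path found in the stream text. */
    static String toSvg(String stream, double width, double height) {
        StringBuilder svg = new StringBuilder();
        svg.append("<svg xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 ")
           .append(width).append(' ').append(height).append("\">\n");
        // Flip the y axis: PDF user space grows upwards, SVG grows downwards.
        svg.append("<g transform=\"matrix(1 0 0 -1 0 ").append(height)
           .append(")\" fill=\"none\" stroke=\"black\"")
           .append(" stroke-linecap=\"round\" stroke-linejoin=\"round\">\n");

        List<String> operands = new ArrayList<>(); // numbers kept as text to avoid reformatting
        StringBuilder d = new StringBuilder();     // path data of the current path
        String lineWidth = "1";
        for (String token : stream.trim().split("\\s+")) {
            if (token.matches(NUMBER)) {
                operands.add(token);
                continue;
            }
            if (token.equals("m") && operands.size() == 2) {
                d.append("M ").append(String.join(" ", operands)).append(' ');
            } else if (token.equals("l") && operands.size() == 2) {
                d.append("L ").append(String.join(" ", operands)).append(' ');
            } else if (token.equals("c") && operands.size() == 6) {
                d.append("C ").append(String.join(" ", operands)).append(' ');
            } else if (token.equals("w") && operands.size() == 1) {
                lineWidth = operands.get(0);
            } else if (token.equals("S") && d.length() > 0) {
                svg.append("<path d=\"").append(d.toString().trim())
                   .append("\" stroke-width=\"").append(lineWidth).append("\"/>\n");
                d.setLength(0);
            }
            operands.clear(); // unsupported operators simply drop their operands
        }
        svg.append("</g>\n</svg>\n");
        return svg.toString();
    }
}
```

Calling `PathStreamToSvg.toSvg(streamText, 110, 20)` on the Xf2 stream text would give a small standalone SVG. Colours, `cm` matrices and nested XObjects are not handled, so for real files the mutool route remains the more reliable one.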

K J