21

I have an app that needs to retrieve some data (the signer's name) from a digital signature "attached" to PDF files.

I have found only examples in Java and C# using the iText class AcroFields and its method GetSignatureNames.

edit: I've tried pdftk with dump_data_fields and generate_fdf, and unfortunately the result was:

/Fields [
<<
/V /dftk.com.lowagie.text.pdf.PdfDictionary@3048918
/T (Signature1)
>>]

and

FieldType: Signature
FieldName: Signature1
FieldFlags: 0
FieldJustification: Left

Thanks in advance!

celsowm

4 Answers

19

Well, it's complicated (I would even say impossible, but who knows) to achieve this with PHP alone.

First, please read the article about digital signatures in Adobe PDF.

Second, after reading it you will know that the signature is stored between bytes b and c, as given by the /ByteRange [a b c d] indicator (a is normally 0, so the first signed range ends at offset b and the second starts at offset c).

Third, we can extract b and c from the document and then extract the signature itself (the guide says it is a hex-encoded PKCS#7 object).
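
To make the offsets concrete, here is a minimal sketch of the ByteRange arithmetic on a synthetic string. It assumes that a + b is the offset of the "<" that opens the hex string and that c is the offset just past the closing ">". The function name find_signature_hex and the sample data are made up for illustration, and the sample only mimics the byte layout, it is not a valid PDF.

```php
<?php
// Return the hex string of every signature described by a /ByteRange [a b c d] entry.
// Assumption: the bytes between offset a + b and offset c are "<hex...>".
function find_signature_hex(string $pdf): array
{
    // \s* tolerates both "/ByteRange[0 ..." and "/ByteRange [ 0 ..."
    $pattern = '/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]/';
    preg_match_all($pattern, $pdf, $matches, PREG_SET_ORDER);

    $blobs = [];
    foreach ($matches as $match) {
        $a = (int) $match[1];
        $b = (int) $match[2];
        $c = (int) $match[3];

        $gapStart = $a + $b;          // offset of the opening "<"
        $gapLength = $c - $gapStart;  // length of the gap, including "<" and ">"

        // Skip "<" and drop ">" to keep only the hex digits
        $blobs[] = substr($pdf, $gapStart + 1, $gapLength - 2);
    }

    return $blobs;
}

// Synthetic layout: "<" sits at offset 15, so b = 15; the byte after ">" is offset 25, so c = 25.
$pdf = '%PDF /Contents <3082abcd> /ByteRange [0 15 25 24]';
print_r(find_signature_hex($pdf));
```

Because the function loops over every match, it also covers files that carry more than one signature, where each signature has its own /ByteRange entry.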

<?php

$content = file_get_contents('test.pdf');

// Subexpressions 2 and 3 capture b and c; \s* tolerates an optional space before "["
$regexp = '#ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)#';

$result = [];
preg_match_all($regexp, $content, $result);

// $result[2][0] and $result[3][0] are b and c
if (isset($result[2][0], $result[3][0])) {
    $start = (int) $result[2][0];
    $end = (int) $result[3][0];

    // Skip the "<" at offset b and stop before the ">" that precedes offset c
    $signature = substr($content, $start + 1, $end - $start - 2);

    file_put_contents('signature.pkcs7', hex2bin($signature));
}

Fourth, after the third step we have the PKCS#7 object in the file signature.pkcs7. Unfortunately, I don't know of any way to extract information from the signature using PHP alone, so you must be able to run shell commands to use openssl:

openssl pkcs7 -in signature.pkcs7 -inform DER -print_certs > info.txt

After running this command, info.txt will contain a chain of certificates. The last one is the one you need. Inspect the structure of the file and parse out the data you need.
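
If info.txt has the usual layout, with one subject= line and one issuer= line per certificate, the common names can be pulled out with a few lines of PHP. This is only a sketch: subject_common_names is a made-up helper, the sample text is invented, and the regex assumes CN values contain no commas or slashes. It tries to accept both the slash-separated and the comma-separated form of the subject line, since the exact format depends on the OpenSSL version.

```php
<?php
// Collect the commonName from every "subject=" line of `openssl pkcs7 -print_certs` output.
// Assumption: lines look like "subject=/C=BR/CN=Alice" or "subject=C = BR, CN = Alice".
function subject_common_names(string $printCertsOutput): array
{
    $names = [];
    foreach (preg_split('/\R/', $printCertsOutput) as $line) {
        // Ignore issuer lines, blank lines and PEM blocks
        if (strpos($line, 'subject=') !== 0) {
            continue;
        }
        // CN may be the first attribute or follow a "/" or "," separator
        if (preg_match('~(?:^subject=|[/,])\s*CN\s*=\s*([^/,]+)~', $line, $m)) {
            $names[] = trim($m[1]);
        }
    }

    return $names;
}

// Invented sample with one certificate in each format
$sample = "subject=/C=BR/O=Example CA/CN=Example Root\n"
    . "issuer=/C=BR/O=Example CA/CN=Example Root\n"
    . "\n"
    . "subject=C = BR, O = Example, CN = ALICE\n"
    . "issuer=C = BR, O = Example CA, CN = Example Root\n";

$names = subject_common_names($sample);
echo end($names), "\n"; // the last certificate, per the advice above
```

In a real run the input would come from something like file_get_contents('info.txt'). Taking the last entry follows the advice above and is not guaranteed for every signature.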

Please also refer to this question, this question and this topic

EDIT at 2017-10-09: I deliberately pointed you to exactly this question. It contains code that you can adjust to your needs.

use ASN1\Type\Constructed\Sequence;
use ASN1\Element;
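use X509\Certificate\Certificate;

// $binaryData is the DER-encoded PKCS#7 blob, i.e. the hex2bin($signature) result from the third step
$seq = Sequence::fromDER($binaryData);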
$signed_data = $seq->getTagged(0)->asExplicit()->asSequence();
// ExtendedCertificatesAndCertificates: https://tools.ietf.org/html/rfc2315#section-6.6
$ecac = $signed_data->getTagged(0)->asImplicit(Element::TYPE_SET)->asSet();
// ExtendedCertificateOrCertificate: https://tools.ietf.org/html/rfc2315#section-6.5
$ecoc = $ecac->at($ecac->count() - 1);
$cert = Certificate::fromASN1($ecoc->asSequence());
$commonNameValue = $cert->tbsCertificate()->subject()->toString();
echo $commonNameValue;

I've adjusted it for you, but please do the rest yourself.

nowox
Denis Alimov
  • Unfortunately, preg_match_all fails on this PDF with my signature: https://www.docdroid.net/oVB1AW2/ – celsowm Oct 06 '17 at 14:58
  • @celsowm Why didn't you take a look at the contents of your test.pdf file? Your ByteRange indicator is a bit different; you just need to change the pattern slightly: `$regexp = '#ByteRange\s*\[(\d+) (\d+) (\d+)#';` – Denis Alimov Oct 06 '17 at 15:27
  • The new regex worked, thanks. Now I am trying to use PHPASN1 to decode the DER because apparently PHP's openssl does not work with DER – celsowm Oct 06 '17 at 16:01
  • @DenisAlimov you just led this guy down a _very_ deep rabbit-hole, and back out of it, with 3 specific instructions. Well done! – Tony Chiboucas Oct 06 '17 at 19:12
  • @TonyChiboucas Yep, thank you. But I would rather implement a Java service using third-party libraries for this issue and use it via a shell command – Denis Alimov Oct 07 '17 at 17:26
  • If you're going to call it via shell, why not just write a shell script instead? – Tony Chiboucas Oct 09 '17 at 13:41
  • $commonNameValue = $cert->tbsCertificate()->subject()->firstValueOf("commonName"); var_dump($commonNameValue->stringValue()); – celsowm Oct 09 '17 at 20:04
  • @celsowm please see how to extract the common name from the signer certificate – Denis Alimov Oct 10 '17 at 07:59
  • @celsowm Thanks for the updated regex. It saved my day. Thank you very much. – Tejas Patel Mar 26 '21 at 05:26
  • @DenisAlimov It's actually possible to achieve this with PHP only. Also, your solution doesn't work for PDF files signed by 2 or more people. Please, take a look at: https://stackoverflow.com/a/75891895/1657502 – Antônio Medeiros Mar 30 '23 at 19:18
3

This is my working code in PHP7:

<?php

require_once('vendor/autoload.php');

use Sop\ASN1\Type\Constructed\Sequence;
use Sop\ASN1\Element;
use Sop\X509\Certificate\Certificate;

$currentFile = "./upload/test2.pdf";

$content = file_get_contents($currentFile);

// Subexpressions 2 and 3 capture b and c; \s* tolerates optional whitespace before "["
$regexp = '/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)/';

$result = [];
preg_match_all($regexp, $content, $result);

// $result[2][0] and $result[3][0] are b and c
if (isset($result[2][0], $result[3][0])) {
    $start = (int) $result[2][0];
    $end = (int) $result[3][0];

    // Exclude the surrounding < and > from the hex string
    $signature = substr($content, $start + 1, $end - $start - 2);

    $binaryData = hex2bin($signature);

    $seq = Sequence::fromDER($binaryData);
    $signed_data = $seq->getTagged(0)->asExplicit()->asSequence();
    // ExtendedCertificatesAndCertificates: https://tools.ietf.org/html/rfc2315#section-6.6
    $ecac = $signed_data->getTagged(0)->asImplicit(Element::TYPE_SET)->asSet();
    // ExtendedCertificateOrCertificate: https://tools.ietf.org/html/rfc2315#section-6.5
    $ecoc = $ecac->at($ecac->count() - 1);
    $cert = Certificate::fromASN1($ecoc->asSequence());
    $commonNameValue = $cert->tbsCertificate()->subject()->toString();
    echo $commonNameValue;
}
4g0st1n0
0

Similar to the solution proposed by @Denis Alimov, but using just PHP functions (instead of the openssl command) and no Composer dependencies:

<?php
function der2pem($der_data) {

    // https://www.php.net/manual/en/ref.openssl.php

    $pem = chunk_split(base64_encode($der_data), 64, "\n");
    $pem = "-----BEGIN CERTIFICATE-----\n".$pem."-----END CERTIFICATE-----\n";
    return $pem;
}

function extract_pkcs7_signatures($path_to_pdf) {

    // https://stackoverflow.com/q/46430367

    $pdf_contents = file_get_contents($path_to_pdf);

    // \s* also matches files that have no space, or several, between ByteRange and "["
    $regexp = '/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)/';

    $result = [];
    preg_match_all($regexp, $pdf_contents, $result);

    $signatures = [];

    if (isset($result[0])) {
        $signature_count = count($result[0]);
        for ($s = 0; $s < $signature_count; $s++) {
            $start = $result[2][$s];
            $end = $result[3][$s];
            $signature = null;
            if ($stream = fopen($path_to_pdf, 'rb')) {
                $signature = stream_get_contents($stream, $end - $start - 2, $start + 1);
                fclose($stream);
                $signature = hex2bin($signature);
                $signatures[] = $signature;
            }
        }
    }

    return $signatures;
}

function who_signed($path_to_pdf) {

    // https://www.php.net/manual/en/openssl.certparams.php
    // https://www.php.net/manual/en/function.openssl-pkcs7-read.php
    // https://www.php.net/manual/en/function.openssl-x509-parse.php

    $signers = [];

    $pkcs7_der_signatures = extract_pkcs7_signatures($path_to_pdf);
    if (!empty($pkcs7_der_signatures)) {
        $parsed_certificates = [];
        foreach ($pkcs7_der_signatures as $pkcs7_der_signature) {
            $pkcs7_pem_signature = der2pem($pkcs7_der_signature);
            $pem_certificates = [];
            $result = openssl_pkcs7_read($pkcs7_pem_signature, $pem_certificates);
            if ($result) {
                foreach ($pem_certificates as $pem_certificate) {
                    $parsed_certificate = openssl_x509_parse($pem_certificate);
                    $parsed_certificates[] = $parsed_certificate;
                }
            }
        }

        // Remove certificate authorities certificates

        $people_certificates = [];
        foreach ($parsed_certificates as $certificate_a) {
            $is_authority = false;
            foreach ($parsed_certificates as $certificate_b) {
                if ($certificate_a['subject'] == $certificate_b['issuer']) {
                    // If certificate A is of the issuer of certificate B, then
                    // certificate A belongs to a certificate authority and,
                    // therefore, should be ignored
                    $is_authority = true;
                    break;
                }
            }
            if (!$is_authority) {
                $people_certificates[] = $certificate_a;
            }
        }

        // Remove duplicate certificates

        $distinct_certificates = [];
        foreach ($people_certificates as $certificate_a) {
            $is_duplicated = false;
            if (count($distinct_certificates) > 0) {
                foreach ($distinct_certificates as $certificate_b) {
                    if (
                        ($certificate_a['subject'] == $certificate_b['subject']) &&
                        ($certificate_a['serialNumber'] == $certificate_b['serialNumber']) &&
                        ($certificate_a['issuer'] == $certificate_b['issuer'])
                    ) {
                        // If certificate B has the same subject, serial number
                        // and issuer as certificate A, then certificate B is a
                        // duplicate and, therefore, should be ignored
                        $is_duplicated = true;
                        break;
                    }
                }
            }
            if (!$is_duplicated) {
                $distinct_certificates[] = $certificate_a;
            }
        }

        foreach ($distinct_certificates as $certificate) {
            $signers[] = $certificate['subject']['CN'];
        }
    }

    return $signers;
}

$path_to_pdf = 'test.pdf';

// In case you want to test the extract_pkcs7_signatures() function:

/*
$signatures = extract_pkcs7_signatures($path_to_pdf);
for ($s = 0; $s < count($signatures); $s++) {
    $path_to_pkcs7 = pathinfo($path_to_pdf, PATHINFO_FILENAME) . $s . '.pkcs7';
    file_put_contents($path_to_pkcs7, $signatures[$s]);
    echo shell_exec("openssl pkcs7 -inform DER -in $path_to_pkcs7 -print_certs -text");
}
exit;
*/

var_dump(who_signed($path_to_pdf));
?>

For some test1.pdf, signed by just one person (let's call her ALICE), this script should return:

array(1) {
  [0]=>
  string(5) "ALICE"
}

For some test2.pdf, signed by two people (let's call them BOB and CAROL), this script should return:

array(2) {
  [0]=>
  string(3) "BOB"
  [1]=>
  string(5) "CAROL"
}
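
The filtering step in who_signed() is easier to follow on plain arrays. The sketch below restates the same rule (a certificate whose subject matches the issuer of any certificate in the list is treated as an authority) on made-up data shaped like openssl_x509_parse() output. The helper leaf_certificates and every name in the sample are hypothetical. One consequence worth knowing: a self-signed signer certificate has an identical subject and issuer, so this rule drops it.

```php
<?php
// Keep only the certificates that did not issue any certificate in the list.
// Each entry needs just the 'subject' and 'issuer' keys used by the rule.
function leaf_certificates(array $certificates): array
{
    // Every issuer that appears anywhere in the list
    $issuers = array_map(function (array $certificate) {
        return $certificate['issuer'];
    }, $certificates);

    // in_array() compares the arrays loosely, like the == in who_signed()
    $leaves = array_filter($certificates, function (array $certificate) use ($issuers) {
        return !in_array($certificate['subject'], $issuers);
    });

    return array_values($leaves);
}

// Made-up chain: a root authority that issued ALICE's certificate
$root = ['subject' => ['CN' => 'Example Root CA'], 'issuer' => ['CN' => 'Example Root CA']];
$alice = ['subject' => ['CN' => 'ALICE'], 'issuer' => ['CN' => 'Example Root CA']];

// Made-up self-signed signer certificate
$dave = ['subject' => ['CN' => 'DAVE'], 'issuer' => ['CN' => 'DAVE']];

var_dump(leaf_certificates([$root, $alice])); // only ALICE's certificate remains
var_dump(leaf_certificates([$dave]));         // empty: the self-signed signer is filtered out
```

If self-signed signatures matter for your files, the rule would need an extra case for certificates that only match their own issuer.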

For more information, take a look at this question of mine:

Antônio Medeiros
-2

I've used iText and found it to be very reliable; I highly recommend it. You can always call the Java code as a "microservice" from PHP.
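
As a rough sketch of what calling Java from PHP could look like, the snippet below shells out to a Java program. Everything specific here is an assumption: signers.jar is a hypothetical tool you would write yourself (for example on top of iText) that prints one signer name per line, and the helper names are made up. Only java -jar, escapeshellarg() and shell_exec() are real.

```php
<?php
// Build the command line for a hypothetical Java tool that prints signer names.
function build_signer_command(string $jarPath, string $pdfPath): string
{
    // escapeshellarg() quotes each argument so unusual file names cannot break the command
    return 'java -jar ' . escapeshellarg($jarPath) . ' ' . escapeshellarg($pdfPath);
}

// Run the tool and return its non-empty output lines.
// An empty array means it printed nothing or the command could not be run.
function signer_names_via_java(string $jarPath, string $pdfPath): array
{
    $output = shell_exec(build_signer_command($jarPath, $pdfPath));
    if (!is_string($output)) {
        return [];
    }

    $lines = array_map('trim', explode("\n", $output));

    return array_values(array_filter($lines, 'strlen'));
}

// Hypothetical usage, assuming signers.jar exists next to this script:
// var_dump(signer_names_via_java('signers.jar', 'test.pdf'));
```

The usage line is left commented out because the jar is not part of this answer.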

Felipe Valdes
  • Please share more details, like some code that could help to understand your answer – Nico Haase Mar 30 '23 at 14:48
  • At the time I wrote the code, it belonged to an employer, so I can't share it. Java can be called from PHP. As for code examples, I think iText has good docs online, and I would invite those interested to review iText. Although, who knows, 5 years later maybe there is now a PHP way of doing it which actually parses the binary the right way. – Felipe Valdes Mar 31 '23 at 01:58