Based on this resource (PHP - how to get the signer(s) of a digitally signed PDF?), I am trying to build a PHP system that retrieves the digital signature of a file, so that I can later verify whether it is valid. The signature is retrieved correctly from the file; here is a truncated example (the input):

/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010702a0820ba630...>

This is passed to preg_match_all(), where the content comes from the PDF signature at the end of the file, and the pattern is

/\/ByteRange\[\s*(\d+) (\d+) (\d+) (\d+)] \/Contents\<\s*(\w+)>/is

The problem is that when I do the preg_match_all()...

preg_match_all($regexp, $signature_content, $result);

... what var_dump() shows is an array with empty(?) values

array(6) { [0]=> array(0) { } [1]=> array(0) { } [2]=> array(0) { } [3]=> array(0) { } [4]=> array(0) { } [5]=> array(0) { } } array(0) { }

But with the same code, if I replace $signature_content with the copied and pasted string, it works and I get the array

array(6) { [0]=> array(1) { [0]=> string(11784) "/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010..." } } array(1) { [0]=> string(23) "username_test" }

Does anyone have any idea how to solve this? Thanks in advance!

edit: here is the commented code - https://pastebin.com/jw1uw7Gb

OpenStudio
  • As an aside, the **ByteRange** entry need not directly precede the **Contents** entry; there may be other entries or comments in between, or the **ByteRange** may come _after_ the **Contents**... Furthermore, there are different ways to write the names if you use #-escapes. – mkl Jul 03 '23 at 16:21
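The point about entry order can be accommodated by matching each entry on its own instead of with one combined pattern. The following is only a minimal sketch: the function name `extractSignatureFields` and the `$sample` string are made up for illustration, it assumes **Contents** is written as a hex string, and it does not handle #-escaped names or comments inside the entries.

```php
<?php
// Hypothetical helper (not from the original script): extract /ByteRange
// and /Contents independently, so their order and anything that sits
// between them does not matter.
function extractSignatureFields(string $dict): ?array
{
    // Four integers inside the brackets, separated by any whitespace.
    if (!preg_match('~/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]~', $dict, $range)) {
        return null;
    }
    // Hex digits between < and >; whitespace inside the hex string is tolerated.
    if (!preg_match('~/Contents\s*<([0-9A-Fa-f\s]+)>~', $dict, $contents)) {
        return null;
    }
    return [
        'byteRange' => array_map('intval', array_slice($range, 1)),
        'contents'  => preg_replace('~\s+~', '', $contents[1]),
    ];
}

// Made-up sample in which /Contents comes first and another entry sits in between.
$sample = "/Contents<30820bb5 06092a86>\n/Type /Sig /ByteRange [0 49443 61187 6424]";
$fields = extractSignatureFields($sample);
var_dump($fields);
```

Running two small patterns keeps each one readable and avoids having to encode every possible ordering in a single expression.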

1 Answer

You use (\w+)> at the end of your pattern, but the example string ends with ba630...> and \w does not match a dot.

Note that if you change the pattern delimiter to, for example, ~ then you don't have to escape the /

You also don't have to escape the < (the \< in your pattern)

To match both variants, you can match optional dots at the end of the pattern, outside of the last capture group:

/ByteRange\[\s*(\d+) (\d+) (\d+) (\d+)] /Contents<\s*(\w+)\.*>

See a regex demo.

$signature_content = <<<DATA
/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010702a0820ba630...>
DATA;
$regexp = "~/ByteRange\[\s*(\d+) (\d+) (\d+) (\d+)] /Contents<\s*(\w+)\.*>~i";
preg_match_all($regexp, $signature_content, $result);

var_dump($result);

Output

array(6) {
  [0]=>
  array(1) {
    [0]=>
    string(85) "/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010702a0820ba630...>"
  }
  [1]=>
  array(1) {
    [0]=>
    string(1) "0"
  }
  [2]=>
  array(1) {
    [0]=>
    string(5) "49443"
  }
  [3]=>
  array(1) {
    [0]=>
    string(5) "61187"
  }
  [4]=>
  array(1) {
    [0]=>
    string(4) "6424"
  }
  [5]=>
  array(1) {
    [0]=>
    string(40) "30820bb506092a864886f70d010702a0820ba630"
  }
}
The fourth bird
  • The "630..." was just intended to cut the signature, which is way longer than this - the signature is made just of alphanumeric characters. So I am afraid the problem is not there. The problem is that if I "copy-paste" the content, the script works fine, but if the content is read from a variable (which btw works), then the script returns the "empty" array – OpenStudio Jul 04 '23 at 08:33
  • @OpenStudio Can you update the question with those details? I can only work with the data that I can see in this case. – The fourth bird Jul 04 '23 at 08:34
  • @OpenStudio What does the var_dump of that "variable" give you? – The fourth bird Jul 04 '23 at 08:59
  • The information is already in the question, right after the 'horizontal line'. The problem is that passing $signature_content from the workflow returns this `array(6) { [0]=> array(0) { } [1]=> array(0) { } [2]=> array(0) { } [3]=> array(0) { } [4]=> array(0) { } [5]=> array(0) { } } array(0) { }` - while if I copy and paste the value of $signature_content and manually use it instead of the variable, I get this `array(6) { [0]=> array(1) { [0]=> string(11784) "/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010..." } } array(1) { [0]=> string(23) "username_test" }`
  • Maybe this can help, I will also include it in the question - https://pastebin.com/jw1uw7Gb – OpenStudio Jul 04 '23 at 12:45
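
When a pattern matches a pasted string but not the variable, one way to narrow things down is to look at what the variable really contains and whether the regex engine reported an error. The sketch below is only a diagnostic aid and assumes the cause is either a PCRE failure or bytes that look different on screen than in memory; the function name `diagnoseMatch` and the sample subject are hypothetical.

```php
<?php
// Hypothetical diagnostic helper (not part of the original script).
function diagnoseMatch(string $pattern, string $subject): array
{
    $count = preg_match_all($pattern, $subject, $matches);
    return [
        'count'   => $count,                           // false means the regex engine failed
        'error'   => preg_last_error(),                // PREG_NO_ERROR when nothing went wrong
        'length'  => strlen($subject),                 // compare with the length of the pasted string
        'hexHead' => bin2hex(substr($subject, 0, 64)), // exposes invisible bytes such as 0a, 0d or 00
    ];
}

// The pattern from the question, with ~ as the delimiter.
$pattern = '~/ByteRange\[\s*(\d+) (\d+) (\d+) (\d+)] /Contents<\s*(\w+)>~i';

// Made-up subject: a line feed instead of the single space the pattern expects.
$report = diagnoseMatch($pattern, "/ByteRange[0 49443 61187 6424]\n/Contents<3082>");
var_dump($report);
```

If `count` is 0 and `error` is `PREG_NO_ERROR`, the subject simply does not match, and the hex dump can show where it deviates from the pasted text. If `count` is `false`, the error code points at a problem inside the regex engine instead.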