I need to analyze the comments and bookmarks of several PDF files in my PHP application. Is there any way to extract this information?

All I need is the bookmark names and hierarchy, plus the content and coordinates of each comment.

I would prefer a PHP library but I could also install additional software on the server and call it with exec().

  • you have several libraries that could do the trick here: http://stackoverflow.com/questions/1004478/read-pdf-files-with-php – Kaddath Mar 08 '17 at 11:17
  • Thanks, I tried PdfParser, but couldn't find out how to read bookmarks. –  Mar 09 '17 at 09:11

1 Answer

OK, https://github.com/smalot/pdfparser seems to be able to extract bookmarks as well as annotations. At the very least, it returns a large array of parsed PDF objects that contains the data I need.

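// Load Composer's autoloader (assumes the library was installed via Composer).
require 'vendor/autoload.php';

$parser = new \Smalot\PdfParser\Parser();
$pdf    = $parser->parseFile('document.pdf');

// Dump every parsed PDF object; bookmarks and annotations are among them.
print_r($pdf->getObjects());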

All I have to do now is figure out how to process this array...
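As a starting point for the annotation half, here is a minimal sketch rather than a tested solution. It assumes that annotation objects carry `/Type /Annot` (the PDF format makes that entry optional, so some files may need a different filter), and that `getDetails()` returns `Contents` as a string and `Rect` as four numbers (lower-left x/y, upper-right x/y). The function names `summarizeAnnotation` and `extractAnnotations` are made up for illustration and are not part of the library.

```php
<?php
/**
 * Reduce one annotation's details array to the fields of interest.
 * Assumption: $details has the shape returned by getDetails() on an
 * annotation object, i.e. optional 'Subtype', 'Contents' and 'Rect' keys.
 */
function summarizeAnnotation(array $details): array
{
    $rect = $details['Rect'] ?? null;

    // A PDF rectangle is four numbers: [llx, lly, urx, ury].
    $hasRect = is_array($rect) && count($rect) === 4;

    return [
        'subtype'  => $details['Subtype'] ?? null,
        'contents' => $details['Contents'] ?? null,
        'rect'     => $hasRect ? array_map('floatval', array_values($rect)) : null,
    ];
}

/**
 * Collect a summary for every object typed as an annotation.
 * $pdf is expected to be the document returned by Parser::parseFile().
 */
function extractAnnotations($pdf): array
{
    $result = [];
    foreach ($pdf->getObjectsByType('Annot') as $id => $annotation) {
        $result[$id] = summarizeAnnotation($annotation->getDetails());
    }

    return $result;
}
```

With the `$pdf` object from above, `print_r(extractAnnotations($pdf));` should list each comment with its text and rectangle. Bookmarks would need a separate walk: in the PDF format they are outline item dictionaries linked through `Title`, `First` and `Next` entries, which this sketch does not cover.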