Does anyone know a simple way to "read"/extract the keywords from a .pdf file? The file is not password protected, and it was generated on the same server using the FPDF class.
I know there are some "powerful" (not free) tools for manipulating .pdf files that provide a simple way to get all the metadata out.
I also know that a .pdf stores its metadata in a dictionary delimited by << and >>, where each metadata name is prefixed with the special character /. What I need is the string that follows "/Keywords", stored in a variable.
Any idea how to parse the file and get only that string?
Currently I'm writing a JSON string into the keywords, so in the file it looks like this: ([{"FirstName":"7bis","LastName":"lastName","email":"email@email.com"}])
Opened in a text editor, the pdf file looks like this:
/F1 6 0 R
>>
/XObject <<
>>
>>
endobj
7 0 obj
<<
/Keywords ([{"FirstName":"7bis","LastName":"lastName","email":"email@email.com"}])
/Producer (FPDF 1.81)
/CreationDate (D:20160531084015)
>>
endobj
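To make the goal concrete, here is a minimal Python sketch of the extraction being asked about. It is only an illustration under several assumptions: the file is unencrypted, the dictionary holding /Keywords is stored uncompressed (as in the dump above), the value is a plain ASCII literal string in parentheses, and only the common backslash escapes matter. The function name `extract_keywords` and the sample bytes are made up for the example; the same scan could be ported to PHP.

```python
import json
import re

# Simple backslash escapes inside a PDF literal string (octal escapes are NOT handled here).
ESCAPES = {b"n": b"\n", b"r": b"\r", b"t": b"\t"}


def extract_keywords(pdf_bytes):
    """Return the text of the first /Keywords literal string, or None if absent."""
    match = re.search(rb"/Keywords\s*\(", pdf_bytes)
    if match is None:
        return None
    out = bytearray()
    depth = 1  # we are already inside the opening parenthesis
    i = match.end()
    while i < len(pdf_bytes):
        c = pdf_bytes[i:i + 1]
        if c == b"\\" and i + 1 < len(pdf_bytes):
            # Escaped byte: \( \) \\ are kept literally, \n \r \t are translated.
            nxt = pdf_bytes[i + 1:i + 2]
            out += ESCAPES.get(nxt, nxt)
            i += 2
            continue
        if c == b"(":
            depth += 1  # unescaped nested parenthesis inside the string
        elif c == b")":
            depth -= 1
            if depth == 0:
                # Assumption: the value is ASCII/Latin-1, not UTF-16.
                return out.decode("latin-1")
        out += c
        i += 1
    return None  # the string was never closed


# In real use the bytes would come from the file, e.g. open(path, "rb").read().
sample = (
    b"7 0 obj\n<<\n"
    b'/Keywords ([{"FirstName":"7bis","LastName":"lastName","email":"email@email.com"}])\n'
    b"/Producer (FPDF 1.81)\n>>\nendobj\n"
)
keywords = extract_keywords(sample)
data = json.loads(keywords)
print(data[0]["email"])  # prints email@email.com
```

A plain regular expression such as `/Keywords \((.*)\)` would also work on this exact dump, but tracking the parenthesis depth and the backslash escapes avoids cutting the value short when the keywords themselves contain parentheses.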
Thanks for any suggestions ;)