I have a PDF file that I am trying to parse text out of. I opened the file in Microsoft Word, and the text I need is in the header. On the first page, the header is left-justified, with a center tab holding the text I am trying to grab (the plain-English document title, rather than the complicated reference name). There is also a right tab holding a page number control that I don't care about.
When I try to run the following:
Debug.Print ThisDocument.Sections(1).Headers(wdHeaderFooterPrimary).Exists
it gives me True, so I know the header exists. However, when I try to run
Debug.Print ThisDocument.Sections(1).Headers(wdHeaderFooterPrimary).Range.Text
it gives me what looks like an empty string. Wrapping the expression in Len(…) gives me 1, so the range appears to hold a single character (presumably just the paragraph mark) rather than the title. How can I get the text out of the header?
Of note, I tried using some Adobe SDK functions, which would have been easier, but I do not have the professional Acrobat suite, so I do not have access to those tools. Hence the MS Word workaround.
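To narrow down where the header text actually lives, a diagnostic along the following lines might help. It is only a sketch under a few assumptions: that the converted PDF is the active document (ThisDocument refers to the document or template that contains the macro, which may not be the converted file), that the title may sit in the first-page or even-page header rather than the primary one, and that the PDF conversion may have placed it in a text box instead of the header's main text. The procedure name DumpHeaderText is made up for illustration.

```vba
Sub DumpHeaderText()
    ' Diagnostic sketch (hypothetical helper, not a confirmed fix):
    ' print the text of every header variant in every section.
    Dim sec As Section
    Dim hdr As HeaderFooter
    Dim shp As Shape
    Dim kind As Variant

    For Each sec In ActiveDocument.Sections
        For Each kind In Array(wdHeaderFooterPrimary, _
                               wdHeaderFooterFirstPage, _
                               wdHeaderFooterEvenPages)
            Set hdr = sec.Headers(kind)
            If hdr.Exists Then
                ' Brackets make a lone paragraph mark easy to spot.
                Debug.Print "Section " & sec.Index & ", header type " & kind & _
                            ", Len=" & Len(hdr.Range.Text) & _
                            ": [" & hdr.Range.Text & "]"
            End If
        Next kind
    Next sec

    ' Assumption: the title may be inside a text box anchored in a header.
    ' Only text boxes are inspected here; frames and other shape types are not.
    For Each shp In ActiveDocument.Sections(1).Headers(wdHeaderFooterPrimary).Shapes
        If shp.Type = msoTextBox Then
            Debug.Print "Text box '" & shp.Name & "': [" & _
                        shp.TextFrame.TextRange.Text & "]"
        End If
    Next shp
End Sub
```

Run it from the VBA editor with the converted document active and read the results in the Immediate window. If one of the other header types or a text box prints the title, that is the range to read from instead of the primary header's Range.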