
Is there a way to fetch a PDF, parse it, and display its contents in a Word UserForm? For example, I have a form where I can enter a link to an online PDF, say www.website.com/file.pdf, and the UserForm would then parse that PDF and show it as plain text, perhaps in a ListBox. I don't need the code for it; I only want to know whether this is even remotely possible and, if so, a few tips on how I could go about it.

It's a long shot, I know, and it might not even be possible. But if you can help me out on this one, that would be great! Thanks in advance!

SandPiper
decrementor
  • May be of interest: http://stackoverflow.com/questions/83152/reading-pdf-documents-in-net. For example, http://pdfbox.apache.org/, mentioned in that thread, includes command-line utilities. I have not tried any of the suggested solutions with VBA. – Fionnuala Mar 11 '11 at 11:19

1 Answer


PDFs are difficult to parse. I have a few programs that use the Foolabs xpdf (http://www.foolabs.com/xpdf/home.html) command-line utilities. I set up a batch file that converts a specifically named PDF into a text file. From my VBA program, I move the desired PDF to the batch file's location, trigger the batch file using Shell and Wait, and then parse the resulting text file.

The batch file looks like this (with no output name given, pdftotext writes YourPage.txt next to the PDF):

pdftotext.exe -layout YourPage.pdf

Shell and Wait can be found here: http://www.cpearson.com/excel/ShellAndWait.aspx

Tying it all together: http://vbaexpress.com/kb/getarticle.php?kb_id=977
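
For reference, here is a rough sketch of the whole round trip in VBA. It is not the code from the links above: instead of the ShellAndWait module it uses WScript.Shell's Run method with bWaitOnReturn set to True, which also blocks until the tool exits. The install path and the control names (txtPdfPath, lstOutput, cmdLoad) are hypothetical, and it assumes pdftotext's output can be read as plain ANSI text, so treat it as a starting point rather than a finished solution.

```vba
' ---- Standard module ----
Option Explicit

' Hypothetical install location; point this at your own copy of pdftotext.exe.
Private Const PDFTOTEXT_EXE As String = "C:\Tools\xpdf\pdftotext.exe"

' Runs pdftotext on pdfPath, waits for it to finish, and returns the
' converted text as a zero-based array of lines.
Public Function PdfToLines(ByVal pdfPath As String) As Variant
    Dim txtPath As String
    Dim cmd As String
    Dim exitCode As Long
    Dim fileNum As Integer
    Dim content As String

    txtPath = pdfPath & ".txt"
    ' Quote every path so that spaces in folder names do not split arguments.
    cmd = """" & PDFTOTEXT_EXE & """ -layout """ & pdfPath & """ """ & txtPath & """"

    ' Window style 0 hides the console; True makes Run wait and return the exit code.
    exitCode = CreateObject("WScript.Shell").Run(cmd, 0, True)
    If exitCode <> 0 Then
        Err.Raise vbObjectError + 1, "PdfToLines", _
                  "pdftotext failed with exit code " & exitCode
    End If

    ' Read the whole text file into one string (bytes are treated as ANSI).
    fileNum = FreeFile
    Open txtPath For Binary Access Read As #fileNum
    content = Space$(LOF(fileNum))
    Get #fileNum, , content
    Close #fileNum

    ' Normalize line endings, then split into lines.
    content = Replace(content, vbCrLf, vbLf)
    PdfToLines = Split(content, vbLf)
End Function

' ---- UserForm code module ----
' Assumes a TextBox named txtPdfPath holding a local PDF path, a ListBox
' named lstOutput, and a CommandButton named cmdLoad (all hypothetical).
Private Sub cmdLoad_Click()
    Dim textLines As Variant
    Dim i As Long

    textLines = PdfToLines(txtPdfPath.Text)

    lstOutput.Clear
    For i = LBound(textLines) To UBound(textLines)
        lstOutput.AddItem textLines(i)
    Next i
End Sub
```

This only handles a PDF that is already on disk; for an online PDF you would have to download the file to a local path first and then pass that path to PdfToLines. pdftotext separates pages with form feed characters, so you may want to strip or split on Chr(12) before filling the ListBox.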

I'm not sure whether this helps in your situation, but it's the only approach I can think of, short of trying to read the PDF file directly.

Community
Fink