
I am learning to use a software package whose manual is published only as a web page: http://www.orcina.com/SoftwareProducts/OrcaFlex/Documentation/OrcFxAPIHelp/Default_Left.htm#StartTopic=html/Matlab_Introduction.htm
If I could convert the web manual to a single PDF file, I would learn the package more quickly, since I could mark up the PDF with notes and underlines in Acrobat Pro. I tried printing each section to an individual PDF and then concatenating them into one PDF, but there are more than 100 sections, so this is slow.
Is there a better way to convert the whole web manual document to a single PDF file, with the manual contents in the right order?

KAE

2 Answers


In Acrobat Pro, you should be able to open the web page directly, and it will convert the pages to PDF on the fly for you.

Ctrl+Shift+O (the letter O, not zero; this works in both v9 and vX)

I believe you can also tell it to spider outward to a certain depth. I tried that, though, and it isn't working on this manual: I get a blank page. It looks like most of the content is filled in via script/AJAX.

Not a programming solution, but a solution nonetheless.

wkhtmltopdf will handle scripts, but I don't know whether it will do any spidering for you.
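If the spidering has to be done by hand, one option is to give wkhtmltopdf the list of page URLs yourself and let it write them all into one PDF in that order. Here is a rough Python sketch of that idea. It assumes wkhtmltopdf is installed and on your PATH, and that your version accepts several input pages and the `--javascript-delay` option. The function names and the example URLs are made up for illustration, and I have not run this against the manual in question.

```python
import shutil
import subprocess


def build_wkhtmltopdf_command(page_urls, output_pdf, js_delay_ms=1000):
    """Return the argument list for one wkhtmltopdf run over several pages.

    The pages end up in the PDF in the order they are listed.
    """
    if not page_urls:
        raise ValueError("need at least one page URL")
    # --javascript-delay gives scripts on each page time to fill in content
    # before the page is rendered (value is in milliseconds).
    return [
        "wkhtmltopdf",
        "--javascript-delay", str(js_delay_ms),
        *page_urls,
        output_pdf,
    ]


def convert(page_urls, output_pdf):
    """Run wkhtmltopdf; raises if the tool is missing or the run fails."""
    if shutil.which("wkhtmltopdf") is None:
        raise RuntimeError("wkhtmltopdf is not on PATH")
    subprocess.run(build_wkhtmltopdf_command(page_urls, output_pdf), check=True)


if __name__ == "__main__":
    # Hypothetical URLs: replace with the manual's topic pages, in order.
    pages = [
        "http://example.com/help/html/intro.htm",
        "http://example.com/help/html/api.htm",
    ]
    print(" ".join(build_wkhtmltopdf_command(pages, "manual.pdf")))
```

You would still need to collect the topic URLs in the right order yourself, for example from the manual's table of contents.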

Mark Storer
  • Thanks for trying it! Looks like I will need to open each page of the manual to make the PDF but at least it is fast from within Acrobat. – KAE Feb 10 '11 at 15:59

You should use an HTML/XML parser to screen-scrape each of the pages, store the entire document in a local data structure, then feed that content, in order, to a PDF library and save the result as a single PDF.
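To make those steps a bit more concrete, here is a minimal sketch in Python using only the standard library. It pulls the links out of a table-of-contents page in document order, extracts the visible text of each page, and joins everything into one ordered document. All names here (`toc_links`, `page_text`, `combine`, `fetch`) are made up for this example, and it assumes the pages are plain HTML; if the content is filled in by script, a plain fetch like this may return very little. The last step, writing the combined content out with a PDF library, is not shown and depends on which library you pick.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect absolute link targets in document order, without duplicates."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href or href.startswith("#"):
            return  # skip empty links and same-page anchors
        url, _fragment = urldefrag(urljoin(self.base_url, href))
        if url not in self.links:
            self.links.append(url)


class TextCollector(HTMLParser):
    """Collect visible text, ignoring the contents of script and style tags."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def toc_links(toc_html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(toc_html)
    return parser.links


def page_text(html):
    parser = TextCollector()
    parser.feed(html)
    return "\n".join(parser.chunks)


def fetch(url):
    """Download one page; assumes UTF-8 and replaces undecodable bytes."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def combine(pages):
    """pages: list of (url, html) pairs in TOC order -> one text document."""
    return "\n\n".join(f"== {url} ==\n{page_text(html)}" for url, html in pages)
```

Typical use would be something like `combine([(u, fetch(u)) for u in toc_links(fetch(toc_url), toc_url)])`, where `toc_url` is the address of the contents page. Keeping the pages as an ordered list of pairs is what preserves the manual's section order.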

acconrad
  • Thanks, glad to hear it can be done. But I've never done any of these steps, so details would be helpful, such as a favorite free HTML parser, whether this could be done in Perl (my skill is basic), or how to go from the data structure to PDF. I'll look around on the web to see if there is a similar bit of code somewhere. – KAE Feb 09 '11 at 17:13