Is it possible to open a pdf from within Python such that it goes to a specific page or section? What I am thinking is to have it open a help file (pdf) and jump to the section that the help is being requested for.

Das.Rot
  • What would you be using to open the file? – David Z Apr 04 '12 at 15:46
  • Take a look at pypdf. http://pybrary.net/pyPdf/ – Jack_of_All_Trades Apr 04 '12 at 15:46
  • I would like to use the default PDF reader to open the file. I am pretty sure this would be Adobe Reader 99.9% of the time. Perhaps there is a way to open Adobe Reader to a specific page? I now suspect I will have to do this through Adobe Reader, and not necessarily through Python. I will also probably have to put the Python code in a `try` block in case the user does not have Adobe Reader installed. Does anybody know of an argument that makes Adobe Reader open at a specific page? – Das.Rot Apr 04 '12 at 16:46
  • It seems that there is no argument to open Adobe Reader on a certain page: http://stackoverflow.com/a/619203/183791 – dusan Apr 04 '12 at 17:24

1 Answer

Here are two basic ideas:

Case 1: you want to open the file in Python

from pyPdf import PdfFileReader

# pyPdf expects an open binary file object rather than a path string
pdf_toread = PdfFileReader(open(path_to_your_pdf, 'rb'))

# Pages are zero-indexed: getPage(0) is the first page, getPage(1) the second
first_page = pdf_toread.getPage(0)

# This will dump the content (unicode string).
# According to the docs, the formatting depends on the
# structure of the document
print(first_page.extractText())

As for the section, you can have a look at this answer.
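If the page that holds a section is not known in advance, one rough way to find it is to scan the extracted text of each page for the section title and keep the first match. The sketch below works on a plain list of page strings, so it does not depend on any particular PDF library. `find_section_page` is a hypothetical helper name, and the case-insensitive substring match is an assumption: it will also hit a table of contents if that comes first.

```python
def find_section_page(page_texts, section_title):
    """Return the 1-based number of the first page whose text contains
    section_title (case-insensitive), or None if no page matches."""
    # Normalize whitespace so line breaks in the title do not matter
    needle = " ".join(section_title.lower().split())
    for index, text in enumerate(page_texts):
        # Extracted PDF text often has odd line breaks, so collapse them too;
        # treat a missing (None) page text as empty
        haystack = " ".join((text or "").lower().split())
        if needle in haystack:
            return index + 1  # viewers count pages from 1
    return None


# Usage with strings standing in for extracted page text
pages = ["Contents", "1 Installation\nRun the installer.", "2 Printing\nreports"]
print(find_section_page(pages, "printing reports"))
```

The returned number is 1-based, so it can be used directly as the page value when launching a viewer.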

Case 2: you want to call Acrobat to open your file at a specific page

According to this Acrobat help document, you can pass open parameters on the command line through a subprocess:

import subprocess
import os

# Raw strings keep the backslashes literal ('\t' would otherwise become a tab)
path_to_pdf = os.path.abspath(r'C:\test_file.pdf')
# I am testing this on my Windows machine
path_to_acrobat = os.path.abspath(r'C:\Program Files (x86)\Adobe\Reader 10.0\Reader\AcroRd32.exe')

# this will open your document on page 12
process = subprocess.Popen([path_to_acrobat, '/A', 'page=12', path_to_pdf], shell=False, stdout=subprocess.PIPE)
process.wait()

Just a suggestion: if you want to open the file at a specific section, you could use the parameter search=wordList, where wordList is a list of words separated by spaces. The document is opened, the search is performed, and the first match is highlighted. You can therefore try passing the section name as the word list.
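As a minimal sketch of how the page and search parameters could be combined, the helper below builds the argument list and a small wrapper reports failure when the reader executable cannot be started (for example, when Adobe Reader is not installed). The function names `build_reader_command` and `open_pdf` and the reader path are hypothetical placeholders. Joining parameters with `&` follows the Acrobat open-parameters convention, but how the search word list has to be quoted is an assumption and may differ between Reader versions.

```python
import subprocess


def build_reader_command(reader_path, pdf_path, page=None, search=None):
    """Build the argument list for launching Adobe Reader with open parameters."""
    params = []
    if page is not None:
        params.append("page=%d" % page)
    if search:
        # Assumption: the word list is passed as space-separated words
        params.append("search=%s" % " ".join(search))
    command = [reader_path]
    if params:
        # Several open parameters are joined with '&' after the /A switch
        command += ["/A", "&".join(params)]
    command.append(pdf_path)
    return command


def open_pdf(reader_path, pdf_path, **kwargs):
    """Launch the reader; return False if the executable cannot be started."""
    try:
        subprocess.Popen(build_reader_command(reader_path, pdf_path, **kwargs))
    except OSError:
        return False
    return True


# Placeholder paths, shown only to illustrate the resulting argument list
print(build_reader_command(r"C:\path\to\AcroRd32.exe", r"C:\test_file.pdf", page=12))
```

Keeping the command construction separate from the launch makes it easy to check the arguments without actually starting the reader.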

Marc-Olivier Titeux
  • It is not necessary to open and read the PDF in Python. I would like to open the PDF with the default PDF reader and go to a specific page. – Das.Rot Apr 04 '12 at 16:47
  • I am having a problem with this. Acrobat Reader opens and says "There was an error opening the document. The filename, directory name, or volume label syntax is incorrect". I am unable to find a solution or even identify the cause of the error. – Dev_Man May 21 '17 at 04:44
  • `print(pageObj.extractText())` returns None or '', why? – Manuel Carrero Dec 30 '21 at 18:00