pdfile = open("tutorial.pdf", "r")
xyz = pdfile.readlines()
pqr = pdfile.readline()
for a in xyz:
    print a

This code does not display the actual content of the PDF. Instead, it prints question marks and boxes.

3 Answers

PDF files contain formatted data, so you cannot read them directly as plain text.

Use the pypdf module instead: https://pypi.org/project/pypdf/. Install it and you can read the text without converting the file first.
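
As a rough illustration of the suggestion above, here is a minimal sketch in Python 3 (the question's code is Python 2). It assumes pypdf has been installed with `pip install pypdf`. The helper name `extract_pdf_text` is made up for this example. `PdfReader`, `reader.pages` and `page.extract_text()` are part of pypdf's API.

```python
def extract_pdf_text(path):
    """Return the text pypdf can extract from every page of the PDF at `path`."""
    # Imported here so the error is easy to spot if pypdf is not installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() returns a string per page; it may be empty for
    # scanned (image-only) pages, so fall back to "" defensively.
    pages = [page.extract_text() or "" for page in reader.pages]
    return "\n".join(pages)
```

Calling `print(extract_pdf_text("tutorial.pdf"))` would then print whatever text pypdf manages to recover. How good that text is depends on how the PDF was produced.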

Martin Thoma
no1
A PDF file is not plain text - you can't just print its bytes to the terminal. You'd need to use a PDF-reading library (see Python PDF library for some suggestions) to read it.
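
To see why printing the file gives garbage, it can help to look at the raw bytes. The sketch below (Python 3, standard library only) opens the file in binary mode and checks for the `%PDF-` marker that PDF files conventionally start with. The function name `describe_pdf_header` is just an illustrative choice.

```python
def describe_pdf_header(path):
    """Return (is_pdf, first_bytes) for the file at `path`."""
    # "rb" reads raw bytes, so Python does not try to decode them as text.
    with open(path, "rb") as f:
        head = f.read(8)
    # A PDF normally begins with b"%PDF-" followed by a version number.
    return head.startswith(b"%PDF-"), head
```

Past that header, most of the file is compressed streams and object tables. That is why a line-by-line text read shows boxes and question marks rather than the document's words.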

Community
RichieHindle
If you are working with textual PDF files, I would suggest using PDFMiner. (A complete example can be found here: https://github.com/syllabs/pdf2text)
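
For a quick idea of what this looks like, here is a small sketch in Python 3. It assumes the maintained fork pdfminer.six (`pip install pdfminer.six`) rather than the original PDFMiner package, and it is not taken from the linked repository. The wrapper name `pdf_to_text` is hypothetical. `extract_text` comes from `pdfminer.high_level`.

```python
def pdf_to_text(path):
    """Return the text pdfminer.six extracts from the PDF at `path`."""
    # Assumption: pdfminer.six is installed; it provides pdfminer.high_level.
    from pdfminer.high_level import extract_text

    # extract_text opens the file itself and returns one string for the document.
    return extract_text(path)
```

As the answer says, this is aimed at textual PDFs. A scanned document that only contains images will likely come back with little or no text.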

user1498724