I have a PDF file and its associated password.

I would like to convert the encrypted file to a clear (unencrypted) version using Python only.

I found some Python modules here (PyPDF2, PDFMiner) for handling PDF files, but none of them seems to work with encryption.

Has anyone already done this?

A. STEFANI

2 Answers

PyPDF2 now supports encryption. According to this answer, it can be done like this:

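import os
import shutil
import subprocess
from PyPDF2 import PdfFileReader  # PyPDF2 < 3.0 API

filename = "encrypted.pdf"  # placeholder path, replace with your own file
password = "mypassword"

fp = open(filename, "rb")  # PDF files must be opened in binary mode
pdfFile = PdfFileReader(fp)
if pdfFile.isEncrypted:
    try:
        pdfFile.decrypt(password)
        print('File Decrypted (PyPDF2)')
    except Exception:
        # PyPDF2 could not handle this encryption: fall back to the qpdf CLI,
        # which rewrites the file in place through a temporary copy.
        fp.close()
        shutil.copy(filename, "temp.pdf")
        subprocess.run(["qpdf", "--password=" + password, "--decrypt",
                        "temp.pdf", filename], check=True)
        os.remove("temp.pdf")
        print('File Decrypted (qpdf)')
        fp = open(filename, "rb")
        pdfFile = PdfFileReader(fp)
else:
    print('File Not Encrypted')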

Note that this code uses PyPDF2 by default and falls back to qpdf if decryption fails.
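Since the goal is a clear copy of the file, the qpdf fallback can also be used on its own to write the decrypted output to a separate path instead of overwriting the original. The following is a minimal sketch, assuming the `qpdf` command-line tool is installed and on the PATH; the helper names `build_qpdf_command` and `decrypt_with_qpdf` are hypothetical, chosen for illustration.

```python
import subprocess


def build_qpdf_command(src, dst, password):
    """Build the qpdf argument list (hypothetical helper name)."""
    # Passing a list avoids shell quoting problems with odd
    # file names or passwords.
    return ["qpdf", "--password=" + password, "--decrypt", src, dst]


def decrypt_with_qpdf(src, dst, password):
    """Write a decrypted copy of src to dst; assumes qpdf is installed."""
    # check=True raises CalledProcessError if qpdf exits with an error,
    # for example when the password is wrong.
    subprocess.run(build_qpdf_command(src, dst, password), check=True)
```

It could be called as `decrypt_with_qpdf("encrypted.pdf", "clear.pdf", "mypassword")`, where both file names are placeholders. The original file is left untouched, so no temporary copy is needed.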

A. STEFANI

You'd also need to know the encryption algorithm and key length to be able to advise which tool might work... and depending on the answers, a python library may not be available.

joelgeraci
  • Do you mean that the encryption algorithm and key length depend on where the PDF was originally created? – A. STEFANI Aug 10 '16 at 08:22
  • No. What I mean is that PDF files can be encrypted using the RC4 or AES algorithms, or even an unpublished algorithm, with key lengths ranging from 40 to 128 bits. Not all library tools support all variations. – joelgeraci Aug 10 '16 at 15:41
  • Has nobody tried to reuse the way an open source PDF reader detects the encryption algorithm and key length? Are you sure that there is no standard? – A. STEFANI Jan 03 '18 at 21:14
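On the last comment: for the standard security handler, the PDF specification records the algorithm and key length in the file's encryption dictionary (`/V`, `/Length`, and for `/V 4` or `/V 5` the crypt filter method `/CFM`). The following is a rough sketch of how those entries map to an algorithm, assuming the raw bytes of the dictionary have already been extracted. The function names are hypothetical, and the regex parsing is a simplification that ignores nesting and indirect objects.

```python
import re


def parse_encrypt_dict(raw):
    """Pull /V, /Length and /CFM out of raw encryption-dictionary bytes."""
    def int_entry(name, default):
        m = re.search(rb"/" + name + rb"\s+(\d+)", raw)
        return int(m.group(1)) if m else default

    cfm = re.search(rb"/CFM\s*/(\w+)", raw)
    return {
        "V": int_entry(b"V", 0),
        # /Length defaults to 40 bits when absent; this takes the first
        # /Length found, which may belong to a nested crypt filter.
        "Length": int_entry(b"Length", 40),
        "CFM": cfm.group(1).decode() if cfm else None,
    }


def describe_encryption(entries):
    """Map the parsed entries to an (algorithm, key length in bits) pair."""
    v, length, cfm = entries["V"], entries["Length"], entries["CFM"]
    if v == 1:
        return ("RC4", 40)
    if v == 2:
        return ("RC4", length)
    if v == 4 and cfm == "AESV2":
        return ("AES", 128)
    if v == 4 and cfm == "V2":
        return ("RC4", length)
    if v == 5 and cfm == "AESV3":
        return ("AES", 256)
    return ("unknown", None)


sample = b"<< /Filter /Standard /V 2 /R 3 /Length 128 >>"
print(describe_encryption(parse_encrypt_dict(sample)))
```

A tool that fails on a given file can then be checked against the reported algorithm, which is the kind of variation the comments above describe.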