0

I am using PDFBox to determine whether a PDF file is password protected. This is my code:

boolean isProtected = pdfDocument.isEncrypted();

My file's properties are shown in the screenshot. Here I am getting isProtected = true even though I can open the file without a password.

Note: this file has Document Open password: No and Permission password: Yes.

click here to view file

ArK
Nitin
  • @Tilman in his answer correctly described the situation you are in. But is that what you wanted? Your "question" misses an actual question. – mkl Sep 19 '16 at 13:08

2 Answers

5

Your PDF has an empty user password and a non-empty owner password. And yes, it is encrypted. This is done to prevent people from doing certain things, e.g. content copying.

It isn't real security; it is the responsibility of the viewer software to make sure that the "forbidden" operations aren't allowed.

You can find a longer (and a bit amusing) explanation here.

To see the document access permissions, use PDDocument.getCurrentAccessPermission().
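To make concrete what those permissions are: in the PDF format they are stored as a 32-bit flags value (the /P entry of the encryption dictionary), and the viewer decides what to allow based on individual bits. The following is only a minimal sketch in plain Java, independent of PDFBox; the class PdfPermissionFlags and its method names are hypothetical, and the bit positions are the ones I know from the PDF 1.7 specification for the standard security handler.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of PDFBox): decodes the /P permission flags of a PDF. */
final class PdfPermissionFlags {
    // The PDF specification numbers bits from 1 (low-order), so bit n is 1 << (n - 1).
    private static final int PRINT = 1 << 2;                      // bit 3
    private static final int MODIFY = 1 << 3;                     // bit 4
    private static final int EXTRACT = 1 << 4;                    // bit 5
    private static final int MODIFY_ANNOTATIONS = 1 << 5;         // bit 6
    private static final int FILL_IN_FORM = 1 << 8;               // bit 9
    private static final int EXTRACT_FOR_ACCESSIBILITY = 1 << 9;  // bit 10
    private static final int ASSEMBLE = 1 << 10;                  // bit 11
    private static final int PRINT_HIGH_QUALITY = 1 << 11;        // bit 12

    private PdfPermissionFlags() {
    }

    static boolean canPrint(int p) {
        return (p & PRINT) != 0;
    }

    static boolean canExtractContent(int p) {
        return (p & EXTRACT) != 0;
    }

    /** Returns the names of all operations whose permission bit is cleared. */
    static List<String> deniedOperations(int p) {
        String[] names = {
            "print", "modify", "extract", "modify annotations",
            "fill in form", "extract for accessibility", "assemble", "print high quality"
        };
        int[] masks = {
            PRINT, MODIFY, EXTRACT, MODIFY_ANNOTATIONS,
            FILL_IN_FORM, EXTRACT_FOR_ACCESSIBILITY, ASSEMBLE, PRINT_HIGH_QUALITY
        };
        List<String> denied = new ArrayList<>();
        for (int i = 0; i < masks.length; i++) {
            if ((p & masks[i]) == 0) {
                denied.add(names[i]);
            }
        }
        return denied;
    }

    public static void main(String[] args) {
        // Example flags value with every permission bit cleared (assumed for illustration).
        int flags = -3904;
        System.out.println("Denied operations: " + deniedOperations(flags));
    }
}
```

Note that nothing in these flags stops the file from being opened; they only tell a cooperating viewer which operations to refuse, which is why a file with only a permission password still reports as encrypted.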

In 2.0.*, a user will be able to view a file if this call succeeds:

PDDocument doc = PDDocument.load(file);

If an InvalidPasswordException is thrown, it means a non-empty password is required.

Tilman Hausherr
  • Thanks for the reply. I want true if the user can view the PDF without a password and false if the user cannot open the PDF file without a password. Can you please help me here? In my scenario above, the user can open/view the PDF file without a password, so it should return true. – Nitin Sep 19 '16 at 13:12
  • In 2.0.* if you can open the file with PDDocument.load(File), then it means a user can view it. – Tilman Hausherr Sep 19 '16 at 13:28
  • Hello, I am ready to migrate PDFBox to 2.0.2. What is the equivalent of pDFTextStripper.resetEngine() in PDFBox 2.0.2? – Nitin Sep 20 '16 at 05:33
  • @NitinVavdiya there isn't (it's private). You don't need it. – Tilman Hausherr Sep 21 '16 at 08:55
1

I am posting this answer because elsewhere on Stack Overflow and the web you might see that the suggested way to check for a password-protected PDF in PDFBox is to use PDDocument#isEncrypted(). The problem we found with this is that certain PDFs which did not prompt for a password were still being flagged as encrypted. See the accepted answer for one explanation of why this happens, but in any case we used the following pattern as a workaround:

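import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException;

// LOGGER is assumed to be a logger field of the enclosing class.
boolean isPDFReadable(byte[] fileContent) {
    PDDocument doc = null;
    try {
        // load() throws InvalidPasswordException if a non-empty password is required
        doc = PDDocument.load(fileContent);
        doc.getPages();  // perhaps not necessary
        return true;
    }
    catch (InvalidPasswordException invalidPasswordException) {
        LOGGER.error("Unable to read password protected PDF.", invalidPasswordException);
    }
    catch (IOException io) {
        LOGGER.error("An error occurred while reading a PDF attachment during account submission.", io);
    }
    finally {
        // Only clean up here; a return inside finally would override the
        // result decided above and swallow any pending exception.
        if (doc != null) {
            try {
                doc.close();
            }
            catch (IOException io) {
                LOGGER.error("An error occurred while closing a PDF attachment ", io);
            }
        }
    }

    return false;
}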

If the call to PDDocument#getPages() succeeds, then it also should mean that opening the PDF via double click or browser, without a password, should be possible.

Tim Biegeleisen
  • According to the accepted answer (and also to a quick&dirty code review) already the successful `PDDocument.load(fileContent)` operation suffices, no need to retrieve the pages. – mkl Feb 20 '19 at 09:11
  • @mkl I added a comment to your point. In any case, the accepted answer is basically lacking in implementation, which my answer provides. – Tim Biegeleisen Feb 20 '19 at 09:12