
I need to be able to remove the security/encryption from some PDF documents, preferably with the itextsharp library. This used to be possible (How to decrypt a pdf file by supplying password of the file as argument using c#?), but a more recent change to the library means that solution no longer works.

I know this can be done with the Aspose PDF library (example), but that appears to be an expensive option.

Edit

So all this time I thought I was in possession of the owner password for the document I was using to test this, but in fact the password I had was the user password. I thought it was the owner password because it worked as the owner password and other values did not. I believe the user password worked in place of the owner password because the PdfReader.unethicalreading field was set to true (it's a global flag that happened to be set elsewhere in code).

Daniel Pratt
  • I have written a comprehensive answer to your question and I now understand why somebody would want to close or downvote it (although I didn't cast a vote myself). – Bruno Lowagie Jan 10 '15 at 14:26
  • @Daniel *PdfReader.unethicalreading* - that flag should only be used in special situations and its use should be documented and known by all people collaborating on the project. It can have unwanted side effects after all. – mkl Jan 11 '15 at 14:54

2 Answers

12

In order to test code that decrypts a PDF file, we need a sample PDF that is encrypted. We'll create such a file using the EncryptPdf example.

public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    // user password "Hello", owner password "World"
    stamper.setEncryption("Hello".getBytes(), "World".getBytes(),
        PdfWriter.ALLOW_PRINTING, PdfWriter.ENCRYPTION_AES_128 | PdfWriter.DO_NOT_ENCRYPT_METADATA);
    stamper.close();
    reader.close();
}

With this code, I create an encrypted file hello_encrypted.pdf that I will use in the first example demonstrating how to decrypt a file.

Your original question sounds like "How can I decrypt a PDF document with the owner password?"

That is easy. The DecryptPdf example shows you how to do this:

public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src, "World".getBytes());
    System.out.println(new String(reader.computeUserPassword()));
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    stamper.close();
    reader.close();
}

We create a PdfReader instance passing the owner password as the second parameter. If we want to know the user password, we can use the computeUserPassword() method. Should we want to re-encrypt the file, then we can take the owner password we know and the user password we computed, and use the setEncryption() method to reintroduce security.

However, as we didn't do this, all security is removed, which is exactly what you wanted. This can be checked by looking at the hello.pdf document.
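One quick way to make that check without opening a viewer is to look for an /Encrypt entry in the file. The following is only a rough sketch using the Java standard library. The class EncryptionProbe and its method are hypothetical and not part of iText, and a plain byte scan is a heuristic rather than a real PDF parse (an unencrypted file that happens to contain the text "/Encrypt" in uncompressed content would be misreported).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of iText: a crude check for an /Encrypt entry.
class EncryptionProbe {

    // Returns true if the raw bytes contain the name "/Encrypt".
    // Heuristic only: an encrypted PDF points to its encryption dictionary
    // through an /Encrypt key, but this scan does not parse the file structure.
    static boolean looksEncrypted(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so no bytes are lost.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        return raw.contains("/Encrypt");
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the PDF to inspect.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        String verdict = looksEncrypted(bytes) ? "/Encrypt entry found" : "no /Encrypt entry";
        System.out.println(args[0] + ": " + verdict);
    }
}
```

Running it against the encrypted input and the decrypted output should give different verdicts if the decryption step really removed the security.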

One could argue that your question falls in the category of "It doesn't work" questions that can only be answered with an "it works for me" answer. One could vote to close your question because you didn't provide a code sample that can be used to reproduce the problem, whereas anyone can provide a code sample that proves you wrong.

Fortunately, I can read between the lines, so I have made another example.

Many PDFs are encrypted without a user password. They can be opened by anyone, but encryption is added to enforce certain restrictions (e.g. you can view the document, but you cannot print it). In this case, there is only an owner password, as is shown in the EncryptPdfWithoutUserPassword example:

public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    stamper.setEncryption(null, "World".getBytes(),
        PdfWriter.ALLOW_PRINTING, PdfWriter.ENCRYPTION_AES_128 | PdfWriter.DO_NOT_ENCRYPT_METADATA);
    stamper.close();
    reader.close();
}

Now we get a PDF that is encrypted, but that can be opened without a user password: hello_encrypted2.pdf
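The restrictions themselves are stored as a bit mask of permissions. As a minimal sketch of how such a mask can be read, the class below uses made-up constant names (they are not iText's); the bit positions are the ones I understand the PDF standard security handler to use, so treat them as an assumption to verify against the specification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a PDF permission bit mask; not iText code.
class PermissionBits {
    // Bit positions as defined for the standard security handler (bit 1 = lowest bit).
    static final int PRINT = 1 << 2;               // bit 3: print the document
    static final int MODIFY_CONTENTS = 1 << 3;     // bit 4: modify the contents
    static final int COPY = 1 << 4;                // bit 5: copy or extract text and graphics
    static final int MODIFY_ANNOTATIONS = 1 << 5;  // bit 6: add or modify annotations
    static final int PRINT_HIGH_QUALITY = 1 << 11; // bit 12: print at full quality

    // True if every bit of the requested flag is set in the mask.
    static boolean isAllowed(int permissions, int flag) {
        return (permissions & flag) == flag;
    }

    // Lists the names of the permissions that are granted by the mask.
    static List<String> describe(int permissions) {
        List<String> granted = new ArrayList<>();
        if (isAllowed(permissions, PRINT)) granted.add("print");
        if (isAllowed(permissions, MODIFY_CONTENTS)) granted.add("modify contents");
        if (isAllowed(permissions, COPY)) granted.add("copy");
        if (isAllowed(permissions, MODIFY_ANNOTATIONS)) granted.add("modify annotations");
        if (isAllowed(permissions, PRINT_HIGH_QUALITY)) granted.add("print high quality");
        return granted;
    }

    public static void main(String[] args) {
        // A mask that grants printing only, in the spirit of the example above.
        int printingOnly = PRINT | PRINT_HIGH_QUALITY;
        System.out.println(describe(printingOnly));
    }
}
```

A viewer that honors the mask refuses anything whose bit is not set, which is why a file can open without a password and still block copying or editing.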

We still need to know the owner password if we want to manipulate the PDF. If we don't pass the password, then iText will rightfully throw an exception:

Exception in thread "main" com.itextpdf.text.exceptions.BadPasswordException: Bad user password
    at com.itextpdf.text.pdf.PdfReader.readPdf(PdfReader.java:681)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:181)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:230)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:207)
    at sandbox.security.DecryptPdf.manipulatePdf(DecryptPdf.java:26)
    at sandbox.security.DecryptPdf.main(DecryptPdf.java:22)

But what if we don't remember that owner password? What if the PDF was produced by a third party and we do not want to respect the wishes of that third party?

In that case, you can deliberately be unethical and change the value of the static unethicalreading variable. This is done in the DecryptPdf2 example:

public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    PdfReader.unethicalreading = true;
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    stamper.close();
    reader.close();
}

This example will not work if the document was encrypted with both a user and an owner password. In that case, you will have to pass at least one password, either the "owner password" or the "user password" (the fact that you have access to the PDF using only the "user" password is a side effect of unethical reading). If only an owner password was introduced, iText does not need that owner password to manipulate the PDF once you set the unethicalreading flag to true.
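To keep the combinations straight, the rules from this paragraph can be written down as a small decision function. This is a simplified model of the behavior described in this answer, not iText code; the class AccessRules, the enum Supplied and the method canManipulate are hypothetical names.

```java
// Simplified model of the rules described above; hypothetical, not iText code.
class AccessRules {

    // Which password, if any, is passed when the document is opened.
    enum Supplied { NONE, USER, OWNER }

    // hasUserPassword: whether the document was encrypted with a user password.
    // unethicalReading: stands in for the static PdfReader.unethicalreading flag.
    static boolean canManipulate(boolean hasUserPassword, Supplied supplied, boolean unethicalReading) {
        switch (supplied) {
            case OWNER:
                // The owner password always gives full access.
                return true;
            case USER:
                // The user password only works as a side effect of unethical reading.
                return unethicalReading;
            default:
                // No password: only owner-only documents, and only with the flag set.
                return !hasUserPassword && unethicalReading;
        }
    }

    public static void main(String[] args) {
        System.out.println("owner-only, no password, flag set: "
            + canManipulate(false, Supplied.NONE, true));
        System.out.println("user+owner, no password, flag set: "
            + canManipulate(true, Supplied.NONE, true));
        System.out.println("user+owner, user password, flag set: "
            + canManipulate(true, Supplied.USER, true));
    }
}
```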

However, there used to be a bug in iText that also removed the owner password(s) in this case. That is not the desired behavior. In the first DecryptPdf example, we saw that we can retrieve the user password (if a user password was present), but there is no way to retrieve the owner password. It is truly secret. With the older versions of iText you refer to, the owner password was removed from the file after manipulating it, and that owner password was lost for eternity.

I have fixed this bug and the fix is in release 5.3.5. As a result, the owner password is now preserved. You can check this by looking at hello2.pdf, which is the file we decrypted in an "unethical" way. (If there was an owner and a user password, both are preserved.)

Based on this research, I am making the assumption that your question is incorrect. You meant to ask: "How can I decrypt a PDF document without the owner password?" or "How can I decrypt a PDF with the user password?"

It doesn't make sense to unfix a bug that I once fixed. We will not restore the (wrong) behavior of the old iText versions, but that doesn't mean that you can't achieve what you want. You'll only have to fool iText into thinking that the PDF wasn't encrypted.

This is shown in the DecryptPdf3 example:

class MyReader extends PdfReader {
    public MyReader(String filename) throws IOException {
        super(filename);
    }
    public void decryptOnPurpose() {
        encrypted = false;
    }
}
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    MyReader.unethicalreading = true;
    MyReader reader = new MyReader(src);
    reader.decryptOnPurpose();
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    stamper.close();
    reader.close();
}

Instead of PdfReader, we are now using a custom subclass of PdfReader. I have named it MyReader and I have added an extra method that allows me to set the encrypted variable to false.

I still need to use unethicalreading and right after creating the MyReader instance, I have to fool this reader into thinking that the original file wasn't encrypted by using the decryptOnPurpose() method.

This results in the file hello3.pdf which is a file that is no longer encrypted with an owner password. This example can even be used to remove all passwords from a file that is encrypted with a user and an owner password as long as you have the user password.

I'll conclude this answer with a comment in answer to your remark about Aspose not being free of charge. You know that iText is free software, but you should also know that "free" isn't a synonym for "for free". Please read my answer to the following question for more info: Is iText Java library free of charge or have any fees to be paid?

Bruno Lowagie
  • First of all, thank you for taking the time to write this lengthy response. Based on what you've said here, I think I may be confused about the function of the owner vs. the user password. – Daniel Pratt Jan 10 '15 at 18:53
  • On the other hand, it seems you also have made some grossly incorrect assumptions. For one, I did not say the referenced change was "recent", I said it was "more recent", as in after the answer was posted. For another thing, I know very well that iText is not free. We are already using it, thus we have a license. – Daniel Pratt Jan 10 '15 at 18:55
  • I am more or less the "Dr. House" at iText Software in the sense that the support team only passes me customer questions that nobody else can solve. I don't know if you posted the question to the issue tracker, but if you post the question publicly, you address the "Dr. House" inside me directly, which means you get an answer plus the attitude ;-) I now realize that neither my answer nor Kurt's answer matches the actual problem. I appreciate the edit you've made. I've never tried opening a PDF with iText with the user password. It's very interesting to know that this used to work (it isn't supposed to). – Bruno Lowagie Jan 11 '15 at 06:47
  • I've tested with the user password and learned something I didn't know. I'll update my answer. – Bruno Lowagie Jan 11 '15 at 06:59
  • I've finished updating. I'm considering adding the answer to the ["The Best iText Questions on StackOverflow"](https://leanpub.com/itext_so) book. – Bruno Lowagie Jan 11 '15 at 07:17
  • @MaartenBodewes-owlstead I didn't read the question as "please recommend a library", but I wasn't surprised by the vote to close the question. I was more surprised by the comment *"If your goal was to annoy me, Mr. close-voter, congratulations, you succeeded."* The least you can say is that the question was open to different interpretations. I don't understand why there's suddenly a second close vote, as the question is now perfectly clear (see the very useful **Edit** that was made). I also see a downvote. I don't agree with that vote: the question reports a genuine issue that can be solved. – Bruno Lowagie Jan 11 '15 at 12:41
  • @MaartenBodewes-owlstead Fair enough. – Bruno Lowagie Jan 11 '15 at 13:26
  • @MaartenBodewes-owlstead: I take the liberty to conclude from your profile and distribution of votes you achieve for your activities on SO that you are an excellent expert when it comes to cryptography. However, for PDF you do not seem to have much merits. So why do you dare to deem a question *still* as off topic after two other people *with* some PDF merits provided answers which the OP stated to be useful, one way or another? – Kurt Pfeifle Jan 12 '15 at 00:27
  • @KurtPfeifle I've removed all my involvement. I can see that the OP seems to have tried at least something, and that you are both trying to answer the question. But this comes very close to asking sample code / libraries, both of which are off topic. I don't see why close votes have to come under such scrutiny. – Maarten Bodewes Jan 12 '15 at 00:53
  • @BrunoLowagie thanks for a detailed comment it has helped me a lot to resolve my issue – Bartosz Jun 12 '17 at 20:10
3

You can do it with the command line tool qpdf:

qpdf --password=s3cr3t --decrypt protected.pdf unprotected.pdf

qpdf also provides an API to be used from other programs.

Alternatively, you can also use the command line tool pdftk:

pdftk protected.pdf input_pw s3cr3t output unprotected.pdf
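
If you need to call either tool from a program, one option is simply to launch it as an external process. The sketch below does that from Java with ProcessBuilder; it shells out to the command line rather than using the qpdf API, the class ExternalDecrypt and its methods are hypothetical names, and it assumes the tool is installed and on the PATH.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper that runs the command lines shown above as external processes.
class ExternalDecrypt {

    // Builds: qpdf --password=<password> --decrypt <in> <out>
    static List<String> qpdfCommand(String password, String in, String out) {
        return Arrays.asList("qpdf", "--password=" + password, "--decrypt", in, out);
    }

    // Builds: pdftk <in> input_pw <password> output <out>
    static List<String> pdftkCommand(String password, String in, String out) {
        return Arrays.asList("pdftk", in, "input_pw", password, "output", out);
    }

    // Starts the command, forwards its output to this process, and returns the exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // args: password, input file, output file
        int exitCode = run(qpdfCommand(args[0], args[1], args[2]));
        System.out.println("qpdf exited with code " + exitCode);
    }
}
```

Passing each argument as a separate list element avoids shell quoting problems with file names that contain spaces. Keep in mind that a password given on the command line may be visible to other users of the machine.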
Kurt Pfeifle
  • Thanks, Kurt, I think I can make that work. Unless I get a better offer, I'll accept this answer :). – Daniel Pratt Jan 10 '15 at 01:12
  • @KurtPfeifle: I hope you agree that my answer is more on-topic than yours ;-) I think I'll even add it to [The Best iText Questions on StackOverflow](https://leanpub.com/itext_so) – Bruno Lowagie Jan 10 '15 at 14:30
  • @BrunoLowagie: Of course I agree :-) -- I just don't agree that mine was completely *off-topic*, given the OP asked for a solution *'preferably with itextsharp'* (doesn't rule out other solutions), and indicating as scope of the problem only *'some PDF documents'*. – Kurt Pfeifle Jan 10 '15 at 14:49
  • The question wasn't phrased very well. The OP said he wanted to decrypt an encrypted PDF *with* an owner password, but he meant to say *without* a password. He talked about a *recent* change referring to a change more than 2 years ago,... – Bruno Lowagie Jan 10 '15 at 14:53
  • Well, that is your interpretation (which may be correct). My answer assumed otherwise, and it only works for cases where the user ***IS*** in possession of the owner password... – Kurt Pfeifle Jan 10 '15 at 14:58