I'd like to remove a watermark from a PDF file. It was probably created with Acrobat.

The book belongs to me. It is available to anyone who has access to the academic service EBSCO. Many academic libraries subscribe to it, including mine. I downloaded the book and want to print part of it without the annoying watermarks.

"ADBE_CompoundType" Editable watermarks (headers, footers, stamps) created by Acrobat Information taken from here.

I used the PdfContentStreamEditor class for PDFBox, created by mkl and published on SO as an answer to a question. I overrode one method. Here it is:

@Override
protected void write(final ContentStreamWriter contentStreamWriter,
    final Operator operator,
    final List<COSBase> operands) throws IOException {

    if (isWatermark(operator, operands)) {

        final COSName xObjectName = COSName.getPDFName("Fm0");
        final PDXObject fm0 = page.getResources().getXObject(xObjectName);
        if (fm0 != null) {
            final COSObject pieceInfo = fm0.getCOSObject()
                .getCOSObject(COSName.getPDFName("PieceInfo"));
            if (pieceInfo != null) {
                final COSBase adbeCompoundType = pieceInfo.getDictionaryObject(
                    COSName.getPDFName("ADBE_CompoundType"));
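                // instanceof guards avoid a ClassCastException or NullPointerException
                // when the entry is missing or has an unexpected type
                if (adbeCompoundType instanceof COSDictionary) {
                    final COSBase privateKey = ((COSDictionary) adbeCompoundType)
                        .getDictionaryObject("Private");
                    if (privateKey instanceof COSName
                            && "Watermark".equals(((COSName) privateKey).getName())) {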
                        final PDResources resources = page.getResources();
                        resources.getCOSObject().removeItem(xObjectName);
                        page.getResources().getCOSObject().setNeedToBeUpdated(true);
                        return;
                    }
                }
            }
        }
    }
    super.write(contentStreamWriter, operator, operands);
}

And helper method:

private boolean isWatermark(final Operator operator,
    final List<COSBase> operands) {
    final String operatorString = operator.getName();
    return operatorString.equals("Do") &&
        operands.size() == 1 && ((COSName) operands.get(0)).getName().equals("Fm0");
}

The code seems to work fine: no watermark is shown on any page. However, I cannot get rid of the watermark object itself. I tried to remove it with the following lines of code, but the object is not removed.

final PDResources resources = page.getResources();
resources.getCOSObject().removeItem(xObjectName);
page.getResources().getCOSObject().setNeedToBeUpdated(true);

Here's a screenshot from pdfdebugger with watermark object:

enter image description here

And here's the watermark text. I couldn't find out how to check whether a watermark object contains this text, and I'd like to know how to do that.

enter image description here

And here's one page of the pdf file: link1 and link2

E_net4
menteith
  • Zippyshare blocks my region. Can you use a different file sharing platform? – mkl Jan 05 '23 at 14:27
  • @mkl I've uploaded the file to another file sharing service. – menteith Jan 05 '23 at 20:50
  • @menteith both those PDF files appear to be completely empty to me. – Mavaddat Javid Jan 12 '23 at 15:55
  • @MavaddatJavid There is one blank page as I managed to remove watermark. Now I want to remove watermark object itself from pdf. It is still present despite not being shown on page. – menteith Jan 12 '23 at 17:38
  • What you could do is to create an uncompressed version with the WriteDecodedDoc tool, use notepad++ and blank-overwrite the part between BT and ET or just the TJ line (without changing the size!), save it, then reopen with Adobe Reader, and save it. – Tilman Hausherr Jan 13 '23 at 09:13
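
The "without changing the size" condition in the last comment matters because a PDF's cross-reference table and stream lengths refer to byte offsets, so an in-place edit has to keep every byte position intact. Below is a minimal plain-Java sketch of such a same-length overwrite. The class and method names are hypothetical, and the token search is deliberately naive: it assumes an already decoded content stream and would also match the letters BT or ET inside other data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox: blanks a text block without changing its length.
final class BlankTextBlock {

    // Overwrites everything between the first "BT" and the next "ET" with spaces.
    static byte[] blankFirstTextBlock(byte[] content) {
        // ISO-8859-1 maps each byte to exactly one char, so string indices equal byte offsets.
        String text = new String(content, StandardCharsets.ISO_8859_1);
        int begin = text.indexOf("BT");
        int end = begin < 0 ? -1 : text.indexOf("ET", begin + 2);
        byte[] result = Arrays.copyOf(content, content.length);
        if (end >= 0) {
            // Keep the BT and ET operators themselves; blank only what lies between them.
            Arrays.fill(result, begin + 2, end, (byte) ' ');
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] in = "q BT /F1 12 Tf (Watermark) Tj ET Q".getBytes(StandardCharsets.ISO_8859_1);
        byte[] out = blankFirstTextBlock(in);
        System.out.println(new String(out, StandardCharsets.ISO_8859_1));
        System.out.println(in.length == out.length); // true
    }
}
```

The sketch only illustrates the byte-preserving idea; for a real file, the comment's workflow (decode with WriteDecodedDoc, edit, then re-save in a viewer) still applies.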

2 Answers

You try to remove the XObject Fm0 from the resources like this:

final PDResources resources = page.getResources();
resources.getCOSObject().removeItem(xObjectName);

I.e. you fetch the COS (dictionary) object of the resources and try to remove the Fm0 (in xObjectName) entry.

If you look closely at your screenshot, though, you'll see that the Fm0 entry is not directly in the Resources dictionary. Instead, the Resources dictionary contains a nested XObject dictionary, and the Fm0 entry sits inside that.

Thus, the following should work:

final PDResources resources = page.getResources();
COSDictionary dict = (COSDictionary) (resources.getCOSObject().getDictionaryObject(COSName.XOBJECT));
dict.removeItem(xObjectName);

PDResources has some helper methods, so the following should also work:

page.getResources().put(xObjectName, (PDXObject)null);
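
To illustrate why the original removeItem call had no effect, here is a small plain-Java model of the nesting. It is not PDFBox code: ordinary maps stand in for COSDictionary, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the nesting, NOT PDFBox code: maps stand in for COSDictionary.
final class NestedResourcesModel {

    // Builds a stand-in for: /Resources << /XObject << /Fm0 ... >> >>
    static Map<String, Object> buildResources() {
        Map<String, Object> xObjects = new HashMap<>();
        xObjects.put("Fm0", "watermark form stream");
        Map<String, Object> resources = new HashMap<>();
        resources.put("XObject", xObjects);
        return resources;
    }

    // Mirrors the attempt from the question: looks for Fm0 one level too high.
    static boolean removeFromTopLevel(Map<String, Object> resources, String name) {
        return resources.remove(name) != null;
    }

    // Mirrors the fix: descend into the XObject sub-dictionary first.
    @SuppressWarnings("unchecked")
    static boolean removeFromXObjects(Map<String, Object> resources, String name) {
        Object sub = resources.get("XObject");
        if (!(sub instanceof Map)) {
            return false; // no XObject sub-dictionary, nothing to remove
        }
        return ((Map<String, Object>) sub).remove(name) != null;
    }

    public static void main(String[] args) {
        Map<String, Object> resources = buildResources();
        System.out.println(removeFromTopLevel(resources, "Fm0")); // false
        System.out.println(removeFromXObjects(resources, "Fm0")); // true
    }
}
```

The first call silently does nothing because the top-level dictionary has no Fm0 key, which matches the behaviour described in the question.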

You mention that the book belongs to you and you, therefore, are entitled to remove the watermark. That is not automatically the case. Depending on the laws (global and local) and the contracts applicable you may only have acquired the right to use the book in its current form, including the watermark. Please make sure you understand the restrictions under which you may use the book.

Also I wonder why you want to get rid of that XObject if the watermark does not show anymore and you merely wanted to change the file to print without the watermark...

mkl
  • *"Furthermore, I've noticed that pdfbox may sometimes create output files which qpdf treats as invalid"* - hhmmm, that might be a pdfbox bug which should be analyzed and fixed. – mkl Jan 13 '23 at 17:35
  • I will create a sample pdf as well as the code changing it, and then create a Jira ticket. – menteith Jan 13 '23 at 17:48

Although mkl has answered this question, I'd like to share a solution using the iText library, even though I prefer PDFBox because it is provided free of charge. The iText code is less verbose than the PDFBox code because, once the watermark object is removed, it is no longer shown on any page.

for (int i = 1; i <= document.getNumberOfPages(); i++) {
    final PdfPage page = document.getPage(i);
    final PdfDictionary xObject = page.getResources().getResource(PdfName.XObject);
    if (xObject != null) {
        final PdfStream fm0 = xObject.getAsStream(new PdfName("Fm0"));
        if (fm0 != null) {
            final PdfDictionary pieceInfo = fm0.getAsDictionary(new PdfName("PieceInfo"));
            if (pieceInfo != null) {
                final PdfDictionary adbeCompoundType = pieceInfo.getAsDictionary(
                    new PdfName("ADBE_CompoundType"));
                if (adbeCompoundType != null) {
                    final PdfName privateKey = adbeCompoundType.getAsName(PdfName.Private);
                    if (privateKey != null) {
                        if ("Watermark".equals(privateKey.getValue())) {
                            xObject.remove(new PdfName("Fm0"));
                        }
                    }
                }
            }
        }
    }
}
menteith