I am migrating our PDF library from iText to PDFBox (2.0.16). I am trying to remove OCGs (optional content groups) by following "How to delete an optional content group alongwith its content from pdf using pdfbox?", but I have not been able to get it working.

The PDF file was generated by iText.

int ocgToDelete = -1;
List<Object> objects = Lists.newArrayList();

PDPage pdPage = doc.getDocumentCatalog().getPages().get(0);
PDResources resources = pdPage.getResources();
PDFStreamParser parser = new PDFStreamParser(pdPage);
parser.parse();
objects = parser.getTokens();
int index = -1;
/* Looping through Tokens */
for (Object obj: objects) {
    index++;
    if (obj instanceof COSName) {
        PDPropertyList prop = resources
            .getProperties((COSName) obj);
        if (prop != null &&
            prop instanceof PDOptionalContentGroup) {
            String ocgName = ((PDOptionalContentGroup) prop)
                .getName();
            if (StringUtils.equals(ocgName, OCName)) {
                /*Found OCG to Delete */
                ocgToDelete = index;
                break;
            }

        }
    }
}

/* Generating new tokens */
List<Object> newTokens = Lists.newArrayList();
for (int i = 0; i < parser.getTokens().size(); i++) {
    /* Skipping this token */
    if (i == ocgToDelete) {
        continue;
    }
    newTokens.add(parser.getTokens().get(i));
}
/*Updating page contents */
ByteArrayOutputStream pdfOutStream = new ByteArrayOutputStream();
PDStream newContents = new PDStream(doc);
OutputStream output = newContents
    .createOutputStream();
ContentStreamWriter writer = new ContentStreamWriter(output);
writer.writeTokens(newTokens);
output.close();
pdPage.setContents(newContents);
doc.save(pdfOutStream);
doc.close();

FileOutputStream out = new FileOutputStream(
    new File("test/emitted/OCGResult.pdf"));
out.write(pdfOutStream.toByteArray());
out.close();

Opening the resulting PDF file produces an error. Any help is greatly appreciated.

gopi
  • Please share the PDF. – Tilman Hausherr Jul 10 '19 at 08:45
  • @Tilman, unfortunately the PDF file is confidential. I made some progress: I am able to find the OCG that I want to remove. I tried removing that token and writing the updated contents back to the page, but I get an error on opening the resulting PDF file. Any kind of help is greatly appreciated – gopi Jul 10 '19 at 16:51
  • "but getting error on opening the resultant Pdf file" - what error? – Tilman Hausherr Jul 10 '19 at 17:23
  • "An error exists on this page. Acrobat may not display the page correctly. Please contact the person who created the PDF document to correct the problem". I get this error with Adobe Acrobat Reader – gopi Jul 10 '19 at 18:33
  • OK, please try to reproduce this with the file I mentioned in the other issue, and upload the result somewhere. However I see one huge problem: "if (i == ocgToDelete)". This makes no sense. Please look at the content stream - the idea is to delete everything between the start and the end marker (I think usually BDC and EMC but there could be others, and it could be nested). You should really look at it with PDFDebugger. This is some pretty advanced stuff. – Tilman Hausherr Jul 10 '19 at 18:49
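
The idea in the last comment (drop everything from the start marker through its matching end marker, including nested sections) can be sketched as below. This is only an illustration under stated assumptions, not PDFBox code: the content stream is modelled as a plain list of strings, whereas `PDFStreamParser.getTokens()` returns objects such as `COSName` and `Operator`. The class `MarkedContentFilter`, the method `removeOptionalContent`, and the resource names `/MC0` and `/MC1` are hypothetical. It also assumes the section is opened with `/OC /name BDC` and closed with `EMC`, which the comment notes is usual but not guaranteed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: models content-stream tokens as plain strings.
final class MarkedContentFilter {

    // Returns a copy of tokens without the section opened by
    // "/OC <propertyName> BDC", up to and including its matching EMC.
    static List<String> removeOptionalContent(List<String> tokens, String propertyName) {
        List<String> kept = new ArrayList<>();
        int depth = 0; // greater than 0 while inside the section being dropped

        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);

            if (depth > 0) {
                // Track nested marked-content sections so that only the
                // matching EMC ends the skip.
                if (token.equals("BDC") || token.equals("BMC")) {
                    depth++;
                } else if (token.equals("EMC")) {
                    depth--;
                }
                continue; // drop every token inside the section, EMC included
            }

            boolean opensTarget = token.equals("BDC")
                    && i >= 2
                    && tokens.get(i - 2).equals("/OC")
                    && tokens.get(i - 1).equals(propertyName);
            if (opensTarget) {
                // The two operands precede the operator and were already
                // copied, so take them back out before skipping.
                kept.remove(kept.size() - 1);
                kept.remove(kept.size() - 1);
                depth = 1;
                continue;
            }

            kept.add(token);
        }
        return kept;
    }
}
```

Calling `MarkedContentFilter.removeOptionalContent(tokens, "/MC0")` on a stream with two optional-content sections would remove only the one tagged `/MC0`. With real PDFBox tokens, the string comparisons would presumably become `instanceof` checks on the token objects, and the property name would be resolved through the page resources as in the question's first loop.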

0 Answers