I am trying to render and attach a new page to an existing PDF:

PDDocument withNewPage = renderNewPage();
withNewPage.getPages().forEach(page -> {
  markNeedToBeUpdated( // https://stackoverflow.com/a/56038907/1189885
      page,
      page.getResources(),
      page.getResources().getCOSObject().getDictionary(COSName.XOBJECT)
  );

  withNewPage.removePage(page);
  original.addPage(page);
});

This code works, but I get a finalizer warning because I forgot to close withNewPage. I added withNewPage.close() after this snippet, but now when I try to export the original to bytes I get the following exception:

java.io.IOException: COSStream has been closed and cannot be read. Perhaps its enclosing PDDocument has been closed?
    at org.apache.pdfbox.cos.COSStream.checkClosed(COSStream.java:83)
    at org.apache.pdfbox.cos.COSStream.createRawInputStream(COSStream.java:133)
    at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1288)
    at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:416)
    at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:570)
    at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObjects(COSWriter.java:496)
    at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:480)
    at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1182)
    at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:452)
    at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1455)
    at org.apache.pdfbox.pdmodel.PDDocument.saveIncremental(PDDocument.java:1421)

I've debugged the close call, and the new page itself does not appear in the list of items closed by withNewPage.close(), though some COSStream objects do.

Why does calling removePage not prevent the page (or its stream?) from being closed along with the document it no longer belongs to, and how can I get this merge accomplished cleanly?

I have seen this question regarding this error, but I expected my code to work because I am removing the page from the rendered document. Additionally, the documentation for importPage says specifically that it should not be applied to generated documents such as this one. I tried it anyway, and I still got the same exception.

chrylis -cautiouslyoptimistic-
  • Maybe the removed page had resources that are also used by other pages, e.g. fonts, images, etc. – Tilman Hausherr May 31 '23 at 08:59
  • The page is still connected to the parent document so you need to detach it before adding it to the original document. Call: `page.setDocument(null); withNewPage.removePage(page);` – Lonzak May 31 '23 at 10:59
  • @Lonzak PDPage doesn't have a setDocument() method – Tilman Hausherr May 31 '23 at 11:04
  • @TilmanHausherr You are right. Couldn't one create a new page by copying the data over from the old page: `PDPage pageNew = new PDPage(); pageNew.setContents(page.getContents()); pageNew.setResources(page.getResources()); pageNew.setAnnotations(page.getAnnotations()); pageNew.set...(); withNewPage.removePage(page); original.addPage(pageNew);` – Lonzak May 31 '23 at 13:06
  • You're still copying pointers to resources. Only cloning would solve this, but this would bloat the result files (I think we had this for a while) if several pages share the same resources. The best is not to overthink this, and close all files only when all is done and not early. – Tilman Hausherr May 31 '23 at 13:34
  • @TilmanHausherr That's what I'm doing now, but it makes it difficult-to-impractical to use a functional approach to composing the documents. – chrylis -cautiouslyoptimistic- May 31 '23 at 15:10
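
One way to keep the "close everything only when all is done" advice compatible with a more functional composition style is to hand every intermediate document to a single owner that closes them together at the end. The sketch below is only an illustration of that idea: `DeferredCloser` and `register` are hypothetical names, not part of PDFBox, and the class relies on nothing but the fact that `PDDocument` implements `java.io.Closeable`.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper (not a PDFBox class): collects resources and closes them together. */
final class DeferredCloser implements Closeable {
    private final Deque<Closeable> resources = new ArrayDeque<>();

    /** Registers a resource and returns it, so the call can be used inline. */
    <T extends Closeable> T register(T resource) {
        resources.push(resource);
        return resource;
    }

    /** Closes all registered resources in reverse registration order. */
    @Override
    public void close() throws IOException {
        IOException first = null;
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
            } catch (IOException e) {
                // Keep the first failure and attach later ones as suppressed.
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}

// Assumed usage with the names from the question:
//
// try (DeferredCloser closer = new DeferredCloser()) {
//     PDDocument withNewPage = closer.register(renderNewPage());
//     // ... move pages into `original` and save it here ...
// } // withNewPage is closed only after `original` has been written
```

With this arrangement the helper functions that render or merge documents never call `close()` themselves; they only register what they create, and the save of `original` has to happen inside the `try` block so the shared streams are still open.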
