
I am using PDFBox to generate a batch of invoices in a loop. This works in general, but from time to time the following exception is thrown inside the loop. Re-running the generation once or twice for the failed invoices eventually produces all of them.

java.io.IOException: COSStream has been closed and cannot be read. Perhaps its enclosing PDDocument has been closed?
at org.apache.pdfbox.cos.COSStream.checkClosed(COSStream.java:83)
at org.apache.pdfbox.cos.COSStream.createRawInputStream(COSStream.java:133)
at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1202)
at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:400)
at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:521)
at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObjects(COSWriter.java:459)
at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:443)
at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1096)
at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:417)
at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1369)
at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1256)
at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1279)
at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1250)
at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1238)
at de.xx.xxx.CreateLandscapePDF.createPdf(CreateLandscapePDF.java:37)
at de.xx.xxx.CreateInvoiceAsPDF.createPdf(CreateInvoiceAsPDF.java:172)
...

I have already looked at similar questions, such as "PDFbox saying PDDocument closed when its not", and I suspect it has something to do with objects being freed by the garbage collector, but I cannot see the fault in my code.

To create the PDF itself I largely follow the Apache PDFBox Cookbook at https://pdfbox.apache.org/1.8/cookbook/documentcreation.html. I only add more content: an image, some text blocks, a table and so on.

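import java.io.IOException;
import java.util.ArrayList;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;

public class CreateLandscapePDF {

    private ArrayList<ContentBlock> content;
    private PDRectangle pageDIN;
    private PDDocument doc;

    public CreateLandscapePDF(ArrayList<ContentBlock> content, PDRectangle pageDIN) {
        this.content = content;
        this.pageDIN = pageDIN;
    }

    public void createPdf(String pdfFileName) throws IOException {
        doc = new PDDocument();
        try {
            PDPage page = new PDPage(pageDIN);
            doc.addPage(page);
            PDPageContentStream contentStream = new PDPageContentStream(doc, page, PDPageContentStream.AppendMode.OVERWRITE, false);
            try {
                for (ContentBlock contentBlock : content) {
                    contentBlock.getContentHelper().writeContentToPDF(contentStream);
                    contentStream.moveTo(0, 0);
                }
            } finally {
                // Close the content stream even if a helper throws (Java 6 has no try-with-resources).
                contentStream.close();
            }
            doc.save(pdfFileName);
        } finally {
            // Always release the document's resources, including on failure.
            doc.close();
        }
    }

}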

The loop lives in the CreateInvoiceAsPDF.createPdf method. On each iteration I create a new CreateLandscapePDF object.

CreateLandscapePDF pdf = new CreateLandscapePDF(contentList, PDRectangle.A4);
pdf.createPdf(TEMP_FILEPATH_NAME + pdfFileName);

The writeContentToPDF method only places the individual pieces of content (text, images and lines) at defined positions, in pixel units, on the page. As an example, here is the code from my TextContentHelper:

public void writeContentToPDF(PDPageContentStream contentStream) throws IOException {
    float maxTextWidth = 1;
    contentStream.beginText();
    float fontSize = content.getFontSize();
    PDFont font = content.getFont();
    contentStream.setFont(font, fontSize);
    contentStream.setLeading(content.getLineSpace() * fontSize);
    float xPos = 0;
    for (Object text : content.getContent()) {
        if (text instanceof String) {
            float textWidth = UnitTranslator.getPixUnitFromTextLength(font, fontSize, (String) text);
            switch (content.getAlignment()) {
            case CENTER:
                xPos = 0.5f*(content.getXEndPosition()+content.getXPosition()-textWidth);
                contentStream.newLineAtOffset(xPos, content.getYPosition());
                break;
            case RIGHT:
                xPos = content.getXEndPosition()-textWidth;
                contentStream.newLineAtOffset(xPos, content.getYPosition());
                break;
            default:
                xPos = content.getXPosition();
                contentStream.newLineAtOffset(xPos, content.getYPosition());
                break;
            }
            contentStream.showText((String) text);
            contentStream.newLine();
            contentStream.newLineAtOffset(-xPos, -content.getYPosition());
            if (textWidth > maxTextWidth) {
                maxTextWidth = textWidth;
            }
        }
    }
    contentStream.endText();
    if (content.isBorder()) {
        createTextBlockBorder(contentStream, maxTextWidth, fontSize);

    }
}

I appreciate any hint to solve this annoying problem!

Roland
  • The exception usually comes if you've closed the COSStream before, e.g. because it was part of another PDDocument. So I wonder what else is done in `writeContentToPDF`. Please do also make sure you're using the latest PDFBox version (2.0.13) and the latest java. That is 1.8.202 (or 201) or 11.0.2. – Tilman Hausherr Jan 30 '19 at 11:18
  • Tilman, thanks for your reply! At the moment I am using pdfbox 2.0.9. I already tried version 2.0.13, but saw no difference. Updating Java is not an option, because my code runs in a Lotus Notes environment, which uses Java 1.6.0 :-( The writeContentToPDF method doesn't do any magic. It just takes the content (text, images, lines) and places it at a specific position in pixel units. I will add an example from my TextContentHelper above. – Roland Jan 31 '19 at 07:56
  • Where does `content.getFont()` come from? Was the font object generated for THAT PDDocument? Or is it global for all PDFs or for a group of PDFs? (Which won't work) – Tilman Hausherr Jan 31 '19 at 10:17
  • Tip for debugging: look at the destination file with an editor like NOTEPAD++, you'll see an incomplete stream at the bottom. Post the last few lines, starting from the last line that has "number 0 obj". That will indicate what kind of COSStream is in trouble. – Tilman Hausherr Jan 31 '19 at 10:23
  • Also look at the log outputs... if my theory is correct, you'll find some warnings that you have unclosed documents. This could happen if you passed `new PDDocument` to a font creation method, and that PDDocument is of course unreferenced, so it would be closed automatically at some later time. – Tilman Hausherr Jan 31 '19 at 10:43
  • Ok, yes, I create the PDFont only once as `private static final PDFont FONT = PDType1Font.HELVETICA;` in my `CreateInvoiceAsPDF` class, but for every PDF document I create a new instance. And yes, I get this warning `WARNING: Warning: You did not close a PDF Document` several times. But in my code I use `new PDDocument` only once, in the place you see in my code snippet above. I also had a look at the erroneous PDF file. At the end I see the following lines: `Line 559: 6 0 obj Line 562: /Type /XObject`. What can I read from this? – Roland Jan 31 '19 at 14:47
  • Maybe it has something to do with the image I insert, because the last lines in the faulty PDF files are always the same: `/Length 56960 /Type /XObject /Subtype /Image /Filter /FlateDecode /BitsPerComponent 8 /Width 501 /Height 508 /ColorSpace /DeviceRGB /SMask 8 0 R` Is there something special to consider? – Roland Jan 31 '19 at 15:34
  • OK, so it's not the font, it's an image. How did you create this image? – Tilman Hausherr Jan 31 '19 at 16:39
  • First I read the image into an InputStream, then I convert it into a byte array and the last step is to create this object `PDImageXObject pdImage = PDImageXObject.createFromByteArray(new PDDocument(), byteArray, null);`. (And here I see another `new PDDocument`, uhh, I hate the code search in Lotus Notes, you cannot trust it...). Ok, this object will be placed later on with the `drawImage` method of `PDPageContentStream` into the PDF. The `PDImageXObject` object is only used in one PDF and will be re-created for every new PDF file. – Roland Feb 04 '19 at 08:03
  • So is it correct when I assume that I need to use my one and only PDDocument to create the `PDImageXObject` instead of using `new PDDocument` in the `createFromByteArray` method to solve my problem? – Roland Feb 04 '19 at 08:09
  • Yes! I'll write an answer. – Tilman Hausherr Feb 04 '19 at 08:12

1 Answer

1) The "COSStream has been closed and cannot be read" exception when saving is best analysed by looking at the end of the partially saved file. Open it with an editor such as Notepad++; you'll see an incomplete stream at the bottom. Post the last few lines, starting from the last line of the form "number 0 obj". That indicates which kind of COSStream is in trouble.
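If opening the file in an editor is inconvenient, the same inspection can be approximated in code. The following is only a sketch using the Java standard library (it needs Java 7 or later for `java.nio.file`, so it would have to run outside the Java 1.6 environment mentioned in the comments). The class name `LastPdfObject`, the method `tailFromLastObject` and the 400-character limit are made up for illustration. It assumes object headers sit at the start of a line, and binary stream data could in rare cases produce a false match.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastPdfObject {

    // Matches an object header such as "6 0 obj" at the start of a line.
    private static final Pattern OBJ_HEADER = Pattern.compile("(?m)^\\d+ \\d+ obj\\b");

    /** Returns up to maxChars characters starting at the last object header, or "" if none is found. */
    static String tailFromLastObject(Path pdf, int maxChars) throws IOException {
        // ISO-8859-1 maps every byte to exactly one char, so binary content does not break decoding.
        String text = new String(Files.readAllBytes(pdf), StandardCharsets.ISO_8859_1);
        Matcher matcher = OBJ_HEADER.matcher(text);
        int start = -1;
        while (matcher.find()) {
            start = matcher.start(); // remember the most recent header
        }
        if (start < 0) {
            return "";
        }
        int end = Math.min(text.length(), start + maxChars);
        return text.substring(start, end);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LastPdfObject <partially-saved.pdf>");
            return;
        }
        System.out.println(tailFromLastObject(Paths.get(args[0]), 400));
    }
}
```

The printed dictionary entries (for example /Type and /Subtype) then show what kind of object was being written when saving failed.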

2) Your file showed an image XObject ("/Type /XObject /Subtype /Image").

3) Further research showed that you created your image with

PDImageXObject pdImage = PDImageXObject.createFromByteArray(new PDDocument(), ...);

and you sporadically also got the warning "Warning: You did not close a PDF Document".

This is because your new PDDocument() object is passed to the createFromByteArray method but is not kept. PDFBox needs it only for that PDDocument's memory management (its "scratch file"). Later, when garbage collection runs, this unreferenced PDDocument is finalized and closes all related streams, including the image stream you created. Because the timing of garbage collection is unpredictable, the failure is intermittent.
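The ownership problem can be illustrated without PDFBox. The sketch below is a deliberately simplified model, not PDFBox's actual implementation: `Owner` stands in for a document that owns the storage behind its streams, `OwnedStream` stands in for a stream created through it, and the explicit `temp.close()` call stands in for what the finalizer does at some unpredictable later time. All names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class OwnerClosesStreams {

    /** Simplified stand-in for a document that owns the storage of its streams. */
    static class Owner implements Closeable {
        private final List<OwnedStream> streams = new ArrayList<OwnedStream>();

        OwnedStream createStream(byte[] data) {
            OwnedStream stream = new OwnedStream(data);
            streams.add(stream);
            return stream;
        }

        @Override
        public void close() {
            // Closing the owner invalidates every stream created through it.
            for (OwnedStream stream : streams) {
                stream.closed = true;
            }
        }
    }

    /** Simplified stand-in for a stream whose storage belongs to an Owner. */
    static class OwnedStream {
        private final byte[] data;
        private boolean closed;

        OwnedStream(byte[] data) {
            this.data = data;
        }

        InputStream open() throws IOException {
            if (closed) {
                throw new IOException("stream has been closed; was its owner closed?");
            }
            return new ByteArrayInputStream(data);
        }
    }

    public static void main(String[] args) throws IOException {
        Owner realDocument = new Owner();

        // Problem case: the stream is tied to a throwaway owner.
        Owner temp = new Owner();
        OwnedStream image = temp.createStream(new byte[] {1, 2, 3});
        temp.close(); // stands in for finalization of the unreferenced owner
        try {
            image.open();
            System.out.println("throwaway owner: read succeeded");
        } catch (IOException e) {
            System.out.println("throwaway owner: " + e.getMessage());
        }

        // Fixed case: the stream is tied to the owner that is still in use.
        OwnedStream goodImage = realDocument.createStream(new byte[] {1, 2, 3});
        System.out.println("real owner: read " + goodImage.open().available() + " bytes");
        realDocument.close();
    }
}
```

In the model, the read fails only because the throwaway owner was closed first. In the real program that moment depends on the garbage collector, which would explain why a retry often succeeds.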

So the solution is to pass the PDDocument you are actually building and saving, not a temporary object.

4) Note that this also applies to fonts, so don't pass new PDDocument() to a font creation method (not applicable to you, but maybe to people in the future).
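A minimal sketch of what that might look like with PDFBox 2.0.x, assuming the image bytes are already in memory. The class name `InvoiceImageExample`, the method `createPdfWithImage`, the image name "logo" and the coordinates are hypothetical. try/finally is used instead of try-with-resources because the question mentions Java 1.6.

```java
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

public class InvoiceImageExample {

    static void createPdfWithImage(byte[] imageBytes, String pdfFileName) throws IOException {
        PDDocument doc = new PDDocument();
        try {
            PDPage page = new PDPage(PDRectangle.A4);
            doc.addPage(page);

            // Tie the image to the document that will be saved,
            // not to a throwaway new PDDocument().
            PDImageXObject image = PDImageXObject.createFromByteArray(doc, imageBytes, "logo");

            PDPageContentStream contentStream = new PDPageContentStream(doc, page);
            try {
                // Position in user space units; the values are arbitrary for this sketch.
                contentStream.drawImage(image, 50, 700);
            } finally {
                contentStream.close();
            }
            doc.save(pdfFileName);
        } finally {
            // The image stream stays valid until this point because doc owns it.
            doc.close();
        }
    }
}
```

In the asker's code this would mean making the PDDocument created in `createPdf` available to whichever helper builds the `PDImageXObject`, so that the image is created only after (and with) that document.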

Tilman Hausherr