I'm having trouble with PDFBox. I have a blank page in a PDF and I want to insert images into it. Because I also work with signed PDFs, all changes have to be saved as "saveIncremental".

When I insert only one image, everything is fine (the image is inserted). When I try to insert another image into this PDF, it is not inserted, and when the file is opened in Adobe Acrobat Reader it says "An error exists on this page. Adobe may not display the page correctly ...".

The weird thing: when the PDF is not just a blank page but, e.g., a blank page with an image, everything is fine (both the first and the second image are inserted correctly with saveIncremental).

Code for inserting the image and saving:

PDImageXObject pdImage = PDImageXObject.createFromFile(tmpSig.getFileName(), doc);
PDPageContentStream contentStream = new PDPageContentStream(doc, tmpPage, PDPageContentStream.AppendMode.APPEND, true, true);
contentStream.drawImage(pdImage, finalX, (finalPageHeight - finalY - finalHeight), finalWidth, finalHeight);
contentStream.close();

// update before save
tmpPage.getCOSObject().setNeedToBeUpdated(true);
tmpPage.getResources().getCOSObject().setNeedToBeUpdated(true);
doc.getDocumentCatalog().getPages().getCOSObject().setNeedToBeUpdated(true);
doc.getDocumentCatalog().getCOSObject().setNeedToBeUpdated(true);

// save
doc.saveIncremental(new FileOutputStream(pdfFile));

All files available here

I'm using PDFBox version 2.0.7; I also tried the newest (2.0.15), but it didn't help.

Thanks for all ideas!


EDIT: I tried to mark the XObjects and the resources as updated like this (I added this code under the comment "update before save"):

pdImage.getCOSObject().setNeedToBeUpdated(true);
PDResources pdResources = tmpPage.getResources();
for (COSName name : pdResources.getXObjectNames()) {
    pdResources.getXObject(name).getCOSObject().setNeedToBeUpdated(true);
}

The problem still remains; nothing changed.

  • You only show the code for adding a single image. How do you add two images? My assumption would be that the problem is that you don't mark the specific **XObject** resources dictionary as updated, merely the generic **Resources** dictionary. – mkl May 07 '19 at 15:56
  • Yes. Opening `blank-inserted-one-saveIncremental-OK-inserted-second-PROBLEM.pdf` with PDFDebugger shows a log message "Missing XObject: Im2". In 2.0.15, please take the time to read the javadoc of saveIncremental(), if you haven't done so. – Tilman Hausherr May 07 '19 at 16:04
  • @mkl I'm not inserting two images at the same time. I insert one image and save the document (using code above, everything is alright). Then I use this document and add another image inside it (using code above but this document is not OK, the second image is missing). – user11465050 May 07 '19 at 21:10
  • @TilmanHausherr I realized that the image is missing completely in PDF file but why?? Why when for the first time, first image is inserted correctly but when I try to add second image, it is missing? What am I doing wrong when I am actually using the same code? And why it works when the original PDF is not blank but has an image in it? – user11465050 May 07 '19 at 21:21
  • Probably because the first time, the image dictionary was new. Please try what mkl wrote. – Tilman Hausherr May 08 '19 at 03:59
  • @TilmanHausherr I updated the question but I'm not quite sure if that's what you meant... Anyway, problem still unsolved. – user11465050 May 08 '19 at 09:38
  • In your edit you mark each specific XObject as updated. That's not what I meant, please mark the dictionary containing all those specific objects, i.e. `tmpPage.getResources().getCOSObject().getDictionaryObject(COSName.XOBJECT).setNeedToBeUpdated(true)` – mkl May 08 '19 at 09:45
  • @mkl Your code is invalid for PDFBox-2.0.7 and also 2.0.15 - there is no such method "setNeedToBeUpdated" for that... I tried modification of your code as this: `tmpPage.getResources().getCOSObject().getCOSObject(COSName.XOBJECT).setNeedToBeUpdated(true);` but that collapses on "NullPointerException"... – user11465050 May 08 '19 at 09:57
  • You need to check for null and cast if it is a dictionary. – Tilman Hausherr May 08 '19 at 10:11
  • try `tmpPage.getResources().getCOSObject().getCOSDictionary(COSName.XOBJECT)`, then you only need to check that one for null. – Tilman Hausherr May 08 '19 at 10:15
  • Yes, instead of `getDictionaryObject` use `getCOSDictionary`. – mkl May 08 '19 at 10:39

1 Answer

In addition to the dictionaries you already marked as updated

tmpPage.getCOSObject().setNeedToBeUpdated(true);
tmpPage.getResources().getCOSObject().setNeedToBeUpdated(true);
doc.getDocumentCatalog().getPages().getCOSObject().setNeedToBeUpdated(true);
doc.getDocumentCatalog().getCOSObject().setNeedToBeUpdated(true);

please also mark the XObject entry in the resources dictionary as updated:

tmpPage.getResources().getCOSObject().getCOSDictionary(COSName.XOBJECT).setNeedToBeUpdated(true);

You wonder why you didn't need to do so when adding the first image?

In the original PDF there is no XObject entry in the resources dictionary yet. Thus, it's generated anew and, therefore, implicitly marked updated.

You wonder why you didn't need to do so when adding to the file which already had images?

In that other file the XObject entry in the resources dictionary is a direct object, i.e. it is immediately contained in the resources dictionary.

4 0 obj
<<
  /Type /Page
  /Resources <<
    /ProcSets [/PDF /Text /ImageB /ImageC /ImageI]
    /ExtGState <</G3 5 0 R /gs2 6 0 R /gs3 7 0 R>>
    /XObject <</Im1 8 0 R /Im2 9 0 R>>
  >>
  /MediaBox [0 0 611.03998 864.95996]
  /Contents [10 0 R 11 0 R 12 0 R 13 0 R 14 0 R]
  /StructParents 0
  /Parent 2 0 R
>> 
endobj

Thus, whenever a new copy of the resources dictionary is written, implicitly a new copy of the XObject entry is written, too.

In the file in which PDFBox itself created the XObject entry of the resources dictionary, though, it created that entry as an indirect object, i.e. in the resources dictionary XObject maps only to a reference to an object number, and the actual dictionary is found in the object with that number.

2 0 obj
<<
  /Type /Page
  /Resources <<
    /ProcSets [/PDF /Text /ImageB /ImageC /ImageI]
    /ExtGState <</G3 3 0 R>>
    /XObject 7 0 R
  >>
  /MediaBox [0 0 611.03998 864.95996]
  /Contents [8 0 R 4 0 R 9 0 R]
  /StructParents 0
  /Parent 5 0 R
>>
endobj
7 0 obj
<<
  /Im1 10 0 R
>> 
endobj

So when a new copy of the resources dictionary is written, no new copy of the XObject dictionary is implicitly written in this case. The entry for the second image is therefore lost unless that dictionary is itself marked as updated.
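The difference between the two layouts can be illustrated with a small, self-contained toy model. This is only a sketch of the idea and not PDFBox's actual writer: the `Dict` class, the `needsUpdate` flag, and the method names are all made up for illustration, and the object numbers are borrowed loosely from the dumps above. It assumes that an incremental save appends only the indirect objects flagged as updated and that direct dictionaries are serialized inline with their parent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IncrementalSaveSketch {

    /** Toy stand-in for a PDF dictionary (hypothetical, not a PDFBox class). */
    static final class Dict {
        final Map<String, Object> entries = new LinkedHashMap<>();
        final int objectNumber;   // 0 means "direct": stored inline in its parent
        boolean needsUpdate;      // toy counterpart of setNeedToBeUpdated(true)

        Dict(int objectNumber) { this.objectNumber = objectNumber; }
        boolean isIndirect() { return objectNumber > 0; }
    }

    /** Direct children are written inline, indirect ones only as "N 0 R" references. */
    static String serialize(Dict d) {
        StringBuilder sb = new StringBuilder("<<");
        for (Map.Entry<String, Object> e : d.entries.entrySet()) {
            sb.append('/').append(e.getKey()).append(' ');
            Object v = e.getValue();
            if (v instanceof Dict) {
                Dict child = (Dict) v;
                sb.append(child.isIndirect() ? child.objectNumber + " 0 R" : serialize(child));
            } else {
                sb.append(v);
            }
            sb.append(' ');
        }
        return sb.append(">>").toString();
    }

    /** Appends only those indirect objects that are flagged as updated. */
    static String incrementalUpdate(List<Dict> indirectObjects) {
        StringBuilder sb = new StringBuilder();
        for (Dict d : indirectObjects) {
            if (d.needsUpdate) {
                sb.append(d.objectNumber).append(" 0 obj ")
                  .append(serialize(d)).append(" endobj\n");
            }
        }
        return sb.toString();
    }

    /** Builds a page that already has Im1, adds Im2, and returns the appended bytes as text. */
    static String addSecondImage(boolean xObjectIsIndirect, boolean flagXObject) {
        Dict xObject = new Dict(xObjectIsIndirect ? 7 : 0);
        xObject.entries.put("Im1", "10 0 R");
        Dict resources = new Dict(0);            // direct, as in both dumps above
        resources.entries.put("XObject", xObject);
        Dict page = new Dict(2);
        page.entries.put("Resources", resources);

        xObject.entries.put("Im2", "11 0 R");    // the second image
        page.needsUpdate = true;                 // the page (and its inline resources) is flagged
        xObject.needsUpdate = flagXObject;       // the extra flag discussed in this answer

        List<Dict> indirect = new ArrayList<>();
        indirect.add(page);
        if (xObjectIsIndirect) {
            indirect.add(xObject);
        }
        return incrementalUpdate(indirect);
    }

    public static void main(String[] args) {
        System.out.println("direct XObject dictionary:\n" + addSecondImage(false, false));
        System.out.println("indirect, not flagged:\n" + addSecondImage(true, false));
        System.out.println("indirect, flagged:\n" + addSecondImage(true, true));
    }
}
```

In this model the `Im2` entry reaches the appended section in the first and third case only. In the second case the rewritten page still points at object 7, but the old copy of object 7 without `Im2` remains the current one, which is the toy equivalent of the "Missing XObject: Im2" message mentioned in the comments.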


As an aside, your current approach won't help you with the task you described:

Because I also work with signed PDFs, all changes have to be saved as "saveIncremental".

Adding images to the page content is not an allowed change to a signed PDF, so Adobe Reader will still indicate your signature is invalid. For a summary of the allowed and disallowed changes after signing, have a look at this answer and documents referenced from it.

You should instead try adding images in annotations.

  • Yes, that solved my problem, thank you very much!! This problem has been driving me crazy for couple of days... Also thanks for the whole explanation! – user11465050 May 08 '19 at 11:04