Flattening Form fields removes content

Question

I am trying to flatten form fields (PDAcroForm.flatten()) in a PDF that was filled from an .xfdf file in the previous step. The expected result is that the editable boxes are replaced with just the text.

However, starting from the PDF where the text is filled into the form (output02.pdf), all the added text is completely gone after flattening, so I get blank spaces instead of the form values (output03.pdf).

I put a complete example on GitHub, containing the PDF files (input and the generated output), but here is just the flattening part:

// in Main.java, function flatten()

PDDocument pdf_document = PDDocument.load(new File("output02.pdf"));  //from step before, merged & filled pdf files.

List<PDField> the_fields = new ArrayList<PDField>();
for (PDField field: pdf_document.getDocumentCatalog().getAcroForm().getFieldTree()) {
    the_fields.add(field);
}
System.out.println("Flattening fields: " + the_fields.stream().map(PDField::getFullyQualifiedName).collect(Collectors.joining(", ", "[", "]")));
pdf_document.getDocumentCatalog().getAcroForm().flatten(the_fields, true);
pdf_document.save(new File("output03.pdf"));

The filled-in text is gone, too.

Edit:
I created those form elements with Adobe Acrobat Pro 10.1.1 on existing PDFs, via the form menu, and simply saved the PDFs as sample5.pdf and test.pdf.

One way would be to manually write the text from the form field to the PDF's content stream, calculating the placement ourselves from the existing x,y,w,h box and font size, but that would break for fields that have the size set to `Auto`. – luckydonald, Mar 15 '19 at 15:36
The /V entries (field values) are names but should be strings. I hope that this isn't a PDFBox bug... – Tilman Hausherr, Mar 15 '19 at 17:35
@TilmanHausherr Created those form elements with `Adobe Acrobat Pro` `10.1.1` on an existing PDF, via the form menu, and simply saved the pdfs as `sample5.pdf` and `test.pdf`. — luckydonald, Mar 15 '19 at 17:38
In the meantime I ran a part of your code (output01.pdf with test.xfdf) and it worked fine. What PDFBox version did you use? The /V entries are the field values. — Tilman Hausherr, Mar 15 '19 at 17:58
`pdfbox` Version is `2.0.1`, using all the dependencies like specified in the `pom.xml` file. — luckydonald, Mar 15 '19 at 18:01
Please retry with the current version. Why use 2.0.1 which is several years old? — Tilman Hausherr, Mar 15 '19 at 18:01
You hit a bug that was fixed two years ago: https://issues.apache.org/jira/browse/PDFBOX-3596 — Tilman Hausherr, Mar 15 '19 at 18:05
I was under the impression that I had the latest version. In fact, that solved it. Wanna make an answer I can accept? – luckydonald, Mar 15 '19 at 18:14

score 2 · Accepted Answer · answered Mar 15 '19 at 18:31

This is a bug that has been fixed since 2.0.5, two years ago. Due to that bug, the field values in the xfdf file were assigned as names instead of as strings in the /V entry (for the value) of the field dictionary. Because of that, there is nothing to show in the appearance stream of the field, and thus nothing is left after flattening.

Always use the latest version of PDFBox. I use the maven versions plugin in all my projects.
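
As a rough illustration of the version point, here is a minimal sketch of a guard that refuses to run with a PDFBox version older than 2.0.5, the version named above as containing the fix. The class `PdfBoxVersionCheck` and its methods are hypothetical names made up for this example, not part of PDFBox. The version string would have to come from your build or, I believe, from `org.apache.pdfbox.util.Version.getVersion()`; it is passed in as a plain argument here so the sketch has no PDFBox dependency.

```java
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox: compares dotted version strings numerically.
class PdfBoxVersionCheck {

    // Parses "2.0.5" into {2, 0, 5}; a qualifier such as "-SNAPSHOT" is ignored.
    static int[] parse(String version) {
        String numeric = version.split("-", 2)[0];
        return Arrays.stream(numeric.split("\\.")).mapToInt(Integer::parseInt).toArray();
    }

    // Returns a negative, zero, or positive value, like Comparator.compare.
    // Missing components count as 0, so "2.0" equals "2.0.0".
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    static boolean isAtLeast(String actual, String required) {
        return compare(actual, required) >= 0;
    }

    public static void main(String[] args) {
        // Assumption: the version in use is supplied by the caller; 2.0.1 is the one from the question.
        String inUse = args.length > 0 ? args[0] : "2.0.1";
        if (!isAtLeast(inUse, "2.0.5")) {
            System.out.println("PDFBox " + inUse + " predates 2.0.5; flattening XFDF-filled fields may lose the values.");
        }
    }
}
```

The comparison is numeric per component rather than lexical, so 2.0.13 correctly counts as newer than 2.0.5.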

Tilman Hausherr
The maven versions plugin is really a great tool. Thanks! – luckydonald Mar 21 '19 at 15:35