I can fill a text box annotation with the following code, but the text does not appear in some readers such as Adobe Acrobat, although it does appear in Chrome and other WebKit-based browsers. The PDFs I'm trying to fill do not use AcroForms or FDF. I'm using Apache PDFBox, but I don't believe PDF libraries differ much here, even across languages and platforms.

// edited for brevity
PDAnnotation annotation = doc.getPages().get(0).getAnnotations().get(0);
COSDictionary cosObject = annotation.getCOSObject();
cosObject.setString(COSName.V, content);

An example document is IRS form W-4.

What I've tried so far

I've tried comparing my PDF output against a document filled in Chrome, but the only difference I see is in the default appearance (DA) property. I've tried to set the default appearance text content like this, but to no avail:

COSString defaultAppearance = (COSString)cosObject.getItem(COSName.DA);
COSString newAppearance = new COSString(defaultAppearance.getString() + "0 0 Td (" + value + ") Tj");
cosObject.setItem(COSName.DA, newAppearance);

I've also messed around with a few flags that sounded promising:

int FLAG_PRINT = 4;
int FLAG_READ_ONLY = 64;
annotation.setAnnotationFlags(annotation.getAnnotationFlags() | FLAG_PRINT | FLAG_READ_ONLY);

I've also tried other properties:

cosObject.setString(COSName.CONTENTS, content);

I believe the relevant section in the PDF 1.7 spec is 12.7.4.3.


What am I missing?
Tilman Hausherr
Andrew
  • Your form is a hybrid XFA/AcroForm form, i.e. it contains form descriptions both as AcroForm PDF objects and as XFA XML streams. (XFA riding piggyback in PDFs was deprecated in 2017, but some organizations, government ones in particular, appear to be a bit slow in recognizing this.) Viewers that support XFA (e.g. Adobe Reader) use the XFA description; others use the AcroForm description. PDFBox only manipulates the AcroForm description. Thus, you should remove the XFA description to get a pure AcroForm form that every viewer handles identically (or nearly so). – mkl Aug 20 '20 at 18:30
  • Sounds good, do you know how to remove it with PDFBox? I didn't find anything obvious in a quick scan of the docs. – Andrew Aug 20 '20 at 20:46
  • You may be interested in [this answer](https://stackoverflow.com/a/24187719/1729265). – mkl Aug 20 '20 at 21:52

1 Answer

Your PDF does use AcroForm fields. The widget annotations are only the visual representation of a field. What you want to do is set the value of the field itself. Here's the SetField.java example from the source code download. Call it with these parameters: the filename, the field name (the first field is named "topmostSubform[0].Page1[0].Step1a[0].f1_01[0]") and a value.

To get the field names, download PDFDebugger and hover the mouse over the fields you would like to set.

[Screenshot: how the field looks after being set]

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.pdfbox.examples.interactive.form;

import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDCheckBox;
import org.apache.pdfbox.pdmodel.interactive.form.PDComboBox;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.pdmodel.interactive.form.PDListBox;
import org.apache.pdfbox.pdmodel.interactive.form.PDRadioButton;
import org.apache.pdfbox.pdmodel.interactive.form.PDTextField;

/**
 * This example will take a PDF document and set a form field in it.
 *
 * @author Ben Litchfield
 *
 */
public class SetField
{
    /**
     * This will set a single field in the document.
     *
     * @param pdfDocument The PDF to set the field in.
     * @param name The name of the field to set.
     * @param value The new value of the field.
     *
     * @throws IOException If there is an error setting the field.
     */
    public void setField(PDDocument pdfDocument, String name, String value) throws IOException
    {
        PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
        PDAcroForm acroForm = docCatalog.getAcroForm();
        PDField field = acroForm.getField(name);
        if (field != null)
        {
            if (field instanceof PDCheckBox)
            {
                if (value.isEmpty())
                    ((PDCheckBox) field).unCheck();
                else
                    ((PDCheckBox) field).check();
            }
            else if (field instanceof PDComboBox)
            {
                field.setValue(value);
            }
            else if (field instanceof PDListBox)
            {
                field.setValue(value);
            }
            else if (field instanceof PDRadioButton)
            {
                field.setValue(value);
            }
            else if (field instanceof PDTextField)
            {
                field.setValue(value);
            } 
        }
        else
        {
            System.err.println("No field found with name: " + name);
        }
    }

    /**
     * This will read a PDF file, set a field and then write the PDF out
     * again. <br>
     * see usage() for commandline
     *
     * @param args command line arguments
     *
     * @throws IOException If there is an error loading, modifying or saving the document.
     */
    public static void main(String[] args) throws IOException
    {
        SetField setter = new SetField();
        setter.setField(args);
    }

    private void setField(String[] args) throws IOException
    {
        PDDocument pdf = null;
        try
        {
            if (args.length != 3)
            {
                usage();
            }
            else
            {
                SetField example = new SetField();
                pdf = PDDocument.load(new File(args[0]));
                example.setField(pdf, args[1], args[2]);
                pdf.save(args[0]);
            }
        }
        finally
        {
            if (pdf != null)
            {
                pdf.close();
            }
        }
    }

    /**
     * This will print out a message telling how to use this example.
     */
    private static void usage()
    {
        System.err.println("usage: org.apache.pdfbox.examples.interactive.form.SetField <pdf-file> <field-name> <field-value>");
    }
}
Tilman Hausherr
  • Excellent, thank you! Looks like I misunderstood the PDF contents. I was hoping this would solve another issue where the field values don't appear in mobile print previews, but this helps a lot and I think that's a separate question. Thanks again – Andrew Aug 20 '20 at 20:21
  • It may be worth noting that it appears checkbox values can vary wildly from "Yes". Depending on the form, values may be on a per-checkbox basis, ranging from "1" or "Y", to "R", "P", "BD", " APT ", "N", etc. – Andrew Aug 25 '20 at 15:00
  • Thank you. I've improved the answer and will change the example too. – Tilman Hausherr Aug 25 '20 at 15:46
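
To illustrate the checkbox comment above: the value that checks a box is whichever appearance state name of its widget is not "Off", so it is safer to look it up than to assume "Yes". The following is a minimal, library-free sketch of that lookup. The class and method names (CheckBoxOnValue, onValue) are hypothetical, and the set of state names is assumed to have been read from the form already; with PDFBox 2.x that would likely come from something like PDCheckBox.getOnValue(), which is not used here so the sketch stays self-contained.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper, not part of PDFBox.
class CheckBoxOnValue {
    /** Name of the "unchecked" appearance state defined by the PDF specification. */
    static final String OFF = "Off";

    /**
     * Picks the value that checks a box, given the appearance state names
     * of its widget. Returns an empty Optional if only "Off" is present.
     */
    static Optional<String> onValue(Set<String> appearanceStates) {
        return appearanceStates.stream()
                .filter(state -> !OFF.equals(state))
                .findFirst();
    }

    public static void main(String[] args) {
        // Assumed input: state names as they might appear on one W-4 checkbox.
        Set<String> states = new LinkedHashSet<>();
        states.add("Off");
        states.add("BD");
        System.out.println(onValue(states).orElse("<none>")); // prints "BD"
    }
}
```

The SetField example sidesteps this lookup by calling check() and unCheck() on the PDCheckBox, which is the simpler route when the field object is available.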