Python library "pdfrw" writes to annotations that remain invisible until the field is clicked

Question

I am following the instructions in this article for writing information to annotations in a PDF document.

The script in that article does run. However, when the output file is opened afterwards, the filled-in fields appear empty. Clicking on an annotation makes the text added by the script appear, but as soon as I click elsewhere in the document, that text disappears again.

Is there some sort of flag that needs to be triggered, to inform the PDF reader that the fields have been filled?

EDIT:

The script given in the article is probably not quite correct.

When reading the first annotation of the unedited PDF, I get the following:

{'/T': '(business_name_1)', '/AA': {'/F': (113, 0)}, '/MK': {}, '/F': '4', '/Rect': ['77.433', '639.425', '538.174', '663.305'], '/Type': '/Annot', '/FT': '/Tx', '/AP': {'/N': (12, 0)}, '/DA': '(/Helv 0 Tf 1 1 1 rg)', '/Subtype': '/Widget', '/TU': '([Business Name])', '/Q': '1', '/P': (11, 0)}

When I fill in a field manually with a PDF reader, save the file, and then read that PDF again, a '/V' entry has been added to the annotation, i.e. the first annotation now reads:

{'/V': '(Bostata)', '/T': '(business_name_1)', '/AA': {'/F': (113, 0)}, '/MK': {}, '/F': '4', '/Rect': ['77.433', '639.425', '538.174', '663.305'], '/Type': '/Annot', '/FT': '/Tx', '/AP': {'/N': (12, 0)}, '/DA': '(/Helv 0 Tf 1 1 1 rg)', '/Subtype': '/Widget', '/TU': '([Business Name])', '/Q': '1', '/P': (11, 0)}
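The difference between the two dumps can also be checked mechanically. This is a minimal sketch, assuming the dumps are copied by hand into plain Python dicts (trimmed to a few keys here); `diff_annotation` is just an illustrative helper, not part of pdfrw:

```python
# Trimmed copies of the two dumps above, written as plain Python dicts.
before = {'/T': '(business_name_1)', '/FT': '/Tx', '/Subtype': '/Widget'}
after = {'/V': '(Bostata)', '/T': '(business_name_1)', '/FT': '/Tx', '/Subtype': '/Widget'}


def diff_annotation(old, new):
    """Return (added, removed, changed) entries between two annotation dumps."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {k: (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]}
    return added, removed, changed


added, removed, changed = diff_annotation(before, after)
print(added)  # the only new entry is '/V'
```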

However, after the script has added the value to the annotation, a whole bunch of other data is added as well (10k+ characters, so I'm not going to paste it here).

Can someone please spot the error in the script given in this article?

EDIT2:

Here I found a partial answer.

Altering the code to:

annotation.update( pdfrw.PdfDict(AP=data_dict[key], V=data_dict[key]) )

When I open the PDF with Adobe Reader in Google Chrome, it works fine. It also works fine when I open the file with PDF-XChange.

However, when I open the PDF file in Adobe Acrobat installed on my Windows 10 machine, I have the same problem: the field is empty.

score 4 · Accepted Answer · answered Feb 08 '20 at 04:20

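You need to set the /NeedAppearances flag in the document's /AcroForm dictionary to true, so that the viewer regenerates the field appearances from the stored values.

See this comment on the pdfrw issue tracker: https://github.com/pmaupin/pdfrw/issues/84#issuecomment-463493521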

answered Feb 08 '20 at 04:20

Zoie


Thank you. This partially worked. Some of the fields are now visible and some are not (in Adobe Acrobat; in Google Chrome I can see all fields). Strange. However, it only seems to help with text fields. If I specify `checkbox = Yes/Off`, I only see the checked box when I open the PDF with Google Chrome. When I open the PDF with Adobe Acrobat, no checkboxes are checked. – tomatoeshift Feb 10 '20 at 09:54

@tomatoeshift Did you try setting AS for the checkbox fields? Try - annotation.update(pdfrw.PdfDict(AS=pdfrw.PdfName('Yes'))) – Zoie Feb 10 '20 at 15:45
Yes, I have set AS – tomatoeshift Feb 10 '20 at 15:56
Oh I just now noticed the `pdfrw.PdfName` part. Now it totally works – tomatoeshift Apr 14 '20 at 12:46

score 3 · Answer 2 · answered May 28 '20 at 07:58

I thought I'd share an answer with fully working code:

import pdfrw

def write_fillable_pdf(input_pdf_path, output_pdf_path, data_dict):
    """Fill the form fields named in data_dict; return True if anything was written."""
    ANNOT_KEY = '/Annots'
    ANNOT_FIELD_KEY = '/T'          # field name
    ANNOT_FORM_TYPE = '/FT'         # field type (e.g. text/button)
    ANNOT_FORM_BUTTON = '/Btn'      # button field, e.g. a checkbox
    ANNOT_FORM_TEXT = '/Tx'         # text field
    SUBTYPE_KEY = '/Subtype'
    WIDGET_SUBTYPE_KEY = '/Widget'
    try:
        template_pdf = pdfrw.PdfReader(input_pdf_path)
        filled = 0  # number of fields that were updated
        for page in template_pdf.pages:
            if page[ANNOT_KEY]:
                for annotation in page[ANNOT_KEY]:
                    if annotation[ANNOT_FIELD_KEY] and annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
                        key = annotation[ANNOT_FIELD_KEY][1:-1]  # remove the parentheses around the name
                        if key in data_dict:
                            filled += 1
                            if annotation[ANNOT_FORM_TYPE] == ANNOT_FORM_BUTTON:
                                # Checkbox states must be PDF names (/Yes, /Off), not strings
                                annotation.update(pdfrw.PdfDict(V=pdfrw.PdfName(data_dict[key]), AS=pdfrw.PdfName(data_dict[key])))
                            elif annotation[ANNOT_FORM_TYPE] == ANNOT_FORM_TEXT:
                                annotation.update(pdfrw.PdfDict(V=data_dict[key], AP=data_dict[key]))
        if filled > 0:
            # Ask the viewer to regenerate field appearances from the new values
            template_pdf.Root.AcroForm.update(pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject('true')))
            pdfrw.PdfWriter().write(output_pdf_path, template_pdf)
            return True
        return False  # nothing matched, so no output file is written
    except Exception as ex:
        print(ex)
        return False

if __name__ == '__main__':
    data_dict = {
        'item_1': 'soap',
        'ManufacturingNo': '98765',
        'ceck_T102': 'Yes',
        'ceck_T104': 'Off',
        'ceck_T001': 'Yes',
    }
    INVOICE_TEMPLATE_PATH = 'TemplateFile.pdf'
    INVOICE_OUTPUT_PATH = 'OutputFile.pdf'

    if write_fillable_pdf(INVOICE_TEMPLATE_PATH, INVOICE_OUTPUT_PATH, data_dict):
        print("Success")
