I'm trying to parse PDF forms on the server side. I've tried several Node.js modules, such as pdf2json, hummus and node-pdftk, and I can read all the text fields but not whether the checkboxes are checked.

I've been testing with different files (like this one). pdf2json always returns an empty string as the value of every checkbox, while hummus gives true. With pdftk I get the FieldValue and FieldStateOption fields and compare them, as I read in this answer, but that result doesn't seem to be correct either.

Can anybody give me some advice, please?

virgiliogm

1 Answer

See if using the pdffiller package works. The generateFDFTemplate method should do the trick. As per the README:

var pdfFiller = require('pdffiller');

var sourcePDF = "test/test.pdf";

// Override the default field name regex. Default: /FieldName: ([^\n]*)/
var nameRegex = null;  

var FDF_data = pdfFiller.generateFDFTemplate( sourcePDF, nameRegex, function(err, fdfData) {
    if (err) throw err;
    console.log(fdfData);
});

will print out:

{
    "last_name" : "",
    "first_name" : "",
    "date" : "",
    "football" : "",
    "baseball" : "",
    "basketball" : "",
    "hockey" : "",
    "nascar" : ""
};
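Note that the template above only lists field names with empty values. For the checkbox state itself, here is a rough sketch of the FieldValue / FieldStateOption comparison mentioned in the question. It assumes you already have the text printed by `pdftk form.pdf dump_data_fields` as a string. The helper names `parseDataFields` and `isChecked` are made up for this example, the sample data is invented, and the rule "checked means FieldValue is one of the state options and is not Off" is an assumption that may not hold for every PDF:

```javascript
// Sketch only: parseDataFields and isChecked are hypothetical helpers,
// not part of pdftk, node-pdftk or pdffiller.

// Split the text output of `pdftk <file> dump_data_fields` into field objects.
// Records are separated by lines containing only "---".
function parseDataFields(dump) {
  return dump
    .split(/^---[ \t]*$/m)
    .map((block) => block.trim())
    .filter(Boolean)
    .map((block) => {
      const field = { states: [] };
      for (const line of block.split(/\r?\n/)) {
        const sep = line.indexOf(':');
        if (sep === -1) continue; // skip lines that are not "Key: value"
        const key = line.slice(0, sep).trim();
        const value = line.slice(sep + 1).trim();
        // FieldStateOption repeats once per possible state, so collect them all.
        if (key === 'FieldStateOption') field.states.push(value);
        else field[key] = value;
      }
      return field;
    });
}

// Assumption: a Button field is "checked" when its FieldValue is one of its
// declared states and that state is not "Off". A missing FieldValue is
// treated as unchecked.
function isChecked(field) {
  const value = field.FieldValue;
  return (
    field.FieldType === 'Button' &&
    value !== undefined &&
    value !== 'Off' &&
    field.states.includes(value)
  );
}

// Invented sample in the shape pdftk prints; real output comes from your PDF.
const sample = [
  '---',
  'FieldType: Button',
  'FieldName: football',
  'FieldFlags: 0',
  'FieldValue: Yes',
  'FieldJustification: Left',
  'FieldStateOption: Off',
  'FieldStateOption: Yes',
  '---',
  'FieldType: Button',
  'FieldName: baseball',
  'FieldFlags: 0',
  'FieldJustification: Left',
  'FieldStateOption: Off',
  'FieldStateOption: Yes',
  '---',
  'FieldType: Text',
  'FieldName: last_name',
  'FieldFlags: 0',
  'FieldValue: Smith',
  'FieldJustification: Left',
].join('\n');

const checked = {};
for (const f of parseDataFields(sample)) {
  if (f.FieldType === 'Button') checked[f.FieldName] = isChecked(f);
}
console.log(checked);
```

With the sample above, football comes out as checked and baseball as unchecked. Radio groups and push buttons are also reported as Button fields, so you may need to filter those out by name or flags, and the on-state is not always called Yes, which is why the sketch compares against the listed state options instead of a fixed string.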

Hope this helps :)

samzmann