
I've got data that customers enter into a website form, and I need to transfer it to the PDF form at http://www.revenue.ie/en/tax/it/forms/form11.pdf. Is there a way to automate this?

  • How many such forms would you need to fill out? – Max Wyss Nov 15 '15 at 17:41
  • You need to be more specific. Please explain at least which programming language you want to use. Which tool do you want to use? Asking us to recommend a tool is not allowed on SO, see [item 4 of "What topics can I ask about here?" in the FAQ](http://stackoverflow.com/help/on-topic). – Bruno Lowagie Nov 16 '15 at 08:53

1 Answer


I have bad news, good news, bad news and finally some good news.

The bad news (part 1) is that the PDF document you refer to is not a form.

It looks like a form to humans, but it's not a form when a machine looks at it:

enter image description here

The above screen shot is a view inside the PDF using iText RUPS. RUPS takes a look at the root dictionary of the PDF (that's where a PDF viewer starts its search for objects in the file after reading the trailer and the cross-reference table).

To a machine, the rectangles and the labels preceding these rectangles are nothing but a bunch of arbitrary lines and characters drawn on a page.
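In PDF terms, an interactive form is announced by an /AcroForm entry in that root dictionary (the catalog). As a rough illustration of what "not a form to a machine" means, here is a minimal sketch that scans the raw bytes of a file for that key. The function name is hypothetical, and the check is deliberately crude: it can miss a form when the catalog is stored in a compressed object stream, so a tool such as RUPS or a real PDF library remains the reliable way to look.

```python
import re


def looks_like_interactive_form(pdf_bytes: bytes) -> bool:
    """Crude heuristic: does the raw file mention an /AcroForm key?

    Assumption: the catalog is stored uncompressed. If it lives in a
    compressed object stream, this returns False even for a real form.
    """
    # Require a delimiter or whitespace after the name so that a longer
    # name such as /AcroFormX does not count as a match.
    return re.search(rb"/AcroForm(?=[\s/<\[(]|$)", pdf_bytes) is not None
```

To try it on a downloaded copy: `looks_like_interactive_form(open("form11.pdf", "rb").read())`.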

The good news (part 1) is that you can add content to those pages.

You can do this the hard way by taking a ruler, measuring the position of every "field", and adding content at those absolute positions. That's neither trivial nor efficient. Should the form change (for instance, if the tax authority moves things around in the next fiscal year), you have to start coding all over again.
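To give an idea of what the ruler approach involves, here is a minimal sketch of the bookkeeping only. It assumes A4 portrait pages whose coordinate system starts at the bottom-left corner (the PDF default), and the field name and measurements are invented for illustration. Drawing the text at those positions would still need a PDF library.

```python
from dataclasses import dataclass

MM_TO_PT = 72 / 25.4            # PDF user space: 1 pt = 1/72 inch
A4_HEIGHT_PT = 297 * MM_TO_PT   # assumption: A4 portrait pages


@dataclass(frozen=True)
class Placement:
    page: int
    x_pt: float
    y_pt: float


def from_ruler(page, left_mm, top_mm, page_height_pt=A4_HEIGHT_PT):
    """Convert a ruler measurement (mm from the top-left corner of the
    page) into PDF coordinates (points from the bottom-left corner)."""
    return Placement(page, left_mm * MM_TO_PT, page_height_pt - top_mm * MM_TO_PT)


# Hypothetical measurements: every "field" needs an entry like this one,
# and every entry has to be re-measured whenever the layout changes.
POSITIONS = {
    "surname": from_ruler(page=1, left_mm=40.0, top_mm=62.5),
}
```

The table of positions is exactly the part that breaks when the layout changes, which is why the approach does not scale.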

You can also do this in a better way: just turn the "flat" PDF into a "smart" PDF by making it an interactive form.

I've opened the document in Acrobat and I asked Acrobat to Create a form automatically:

enter image description here

enter image description here

enter image description here

You then let Acrobat do its magic (it can take a while) and this will result in a file that looks like this one: form11.pdf.

The result is rather bad because I've asked a machine to do the work:

enter image description here

As you can see, plenty of fields have names such as "undefined_x". Acrobat thinks that the page number is a field, check box fields are treated as text fields, fields that expect amounts are split into separate fields because of the , and . separators, and so on. Turning the flat PDF into a real form requires human intervention. I made one change manually (I'm not going to do your work for you). There are two fields on page 2 asking whether you're single that, to the human eye, look like check boxes: one under 2. and one under 3.

I removed the text field under 2. and made it a check box named "manually_edited_single":

enter image description here

You can see that it is now indeed a check box with possible values Off and Yes. This is very different from what we see under 3:

enter image description here

This field is named "Single", and anyone familiar with PDF can see that the form field / widget annotation (*) shown in the screen shot above is not a check box.

(*) the field dictionary and annotation dictionary are joined in this case.

The bad news (part 2) is that "fixing" the form is a manual process.

It's going to take you an hour or so to join different fields that belong together, to replace text fields with check boxes (or radio fields) where necessary, to remove fields that aren't actually fields, and to add fields that were missed by Adobe Acrobat.

The good news (part 2) is that filling out the resulting form is a no-brainer.

That question has been answered many times before, see for instance: How to fill out a pdf file programatically?
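As a rough sketch of the data side of that step: assuming the HTML field names match the PDF field names, the submitted values can be turned into a field/value map and serialized as XFDF, an XML form-data format that Acrobat can import. This is only one possible route; the linked answer may use a different one. The check box name and its Off/Yes values come from the example above, while "Surname" and the function names are hypothetical.

```python
from urllib.parse import parse_qs
import xml.etree.ElementTree as ET

XFDF_NS = "http://ns.adobe.com/xfdf/"

# Check box taken from the manually edited form; its values are Off and Yes.
CHECK_BOXES = {"manually_edited_single"}


def pdf_field_values(query_string, check_boxes=CHECK_BOXES):
    """Map a URL-encoded form submission to PDF field values."""
    submitted = {name: vals[-1] for name, vals in parse_qs(query_string).items()}
    values = dict(submitted)
    for name in check_boxes:
        # Browsers leave unchecked boxes out of the submission entirely.
        values[name] = "Yes" if name in submitted else "Off"
    return values


def to_xfdf(values):
    """Serialize field values as an XFDF document (returned as a string)."""
    ET.register_namespace("", XFDF_NS)
    root = ET.Element(f"{{{XFDF_NS}}}xfdf")
    fields = ET.SubElement(root, f"{{{XFDF_NS}}}fields")
    for name, value in sorted(values.items()):
        field = ET.SubElement(fields, f"{{{XFDF_NS}}}field", {"name": name})
        ET.SubElement(field, f"{{{XFDF_NS}}}value").text = value
    return ET.tostring(root, encoding="unicode")
```

For example, `to_xfdf(pdf_field_values("Surname=Murphy&manually_edited_single=on"))` yields a document that sets the text field and ticks the check box. Merging it into the PDF is then a job for Acrobat or a PDF library.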

Bruno Lowagie
  • I've added an extra screen shot to my answer. Under Tools, you should open Forms and click on Create. Then select "From Existing Document" and "Current Document" for Acrobat to start doing the work. – Bruno Lowagie Nov 16 '15 at 10:30
  • I don't have a 'Forms' option under tools, all I have is 'Select and Zoom', 'Analysis', and 'Customize toolbar'. – Eoin McQuinn Nov 16 '15 at 11:33
  • Maybe you are confusing Adobe Reader (available for free) with Adobe Acrobat (which you need to purchase). You say that you are using Acrobat, but maybe you're wrong. Maybe you're using Reader... – Bruno Lowagie Nov 16 '15 at 11:55
  • In that case, you need to upgrade to Acrobat, or hire someone who has Acrobat to do the work for you, or ask the Irish Government to provide real PDF forms (I'd vote for the latter; the Irish Government should take a lead in this matter). – Bruno Lowagie Nov 16 '15 at 13:20
  • So... can this question be closed or do you expect another answer? Your question was a *Yes or No* question, and the answer was *Yes, but...* I think my answer explained the *but...* in detail, but since you haven't accepted the answer yet, you seem to disagree. – Bruno Lowagie Nov 16 '15 at 13:29
  • How do I get the data from the HTML forms to the PDF? – Eoin McQuinn Nov 16 '15 at 18:11
  • BTW, I've looked for an Adobe Acrobat alternative for Linux and there doesn't seem to be one. – Eoin McQuinn Nov 16 '15 at 18:37
  • Please don't abuse the comments to ask additional questions. You are responsible for creating the HTML form. If you're smart, you define the HTML form in such a way that its field names correspond with the field names you defined in the PDF form. Retrieving the field values on the server and putting them in the PDF is trivial. Any developer can do this. As for an alternative to Acrobat: you're right, you'll have to invest in an Acrobat license. – Bruno Lowagie Nov 16 '15 at 19:04
  • Well, seeing as my question has not been answered, we cannot close this question. My question is: how do I complete a PDF from website form data? You have not answered this question, Bruno. – Eoin McQuinn Nov 16 '15 at 19:46
  • I don't agree, but who am I? I only have a 30K+ reputation on SO and you've posted a question that doesn't even qualify as a question if you read the SO FAQ... Too bad you just shut the door on the only person who was willing to help you... – Bruno Lowagie Nov 16 '15 at 20:02
  • I haven't shut the door on anyone. My question is clear and straightforward. – Eoin McQuinn Nov 16 '15 at 20:43
  • What is most clear about your question is the fact that it's off-topic with respect to StackOverflow. It's also clear that you don't value the expertise of the only person who bothered to answer your question. Case closed. Have a nice day. – Bruno Lowagie Nov 16 '15 at 23:39