I'm working with a PDF and need to extract its content: text, images, and checkboxes. I can extract the text and the images, but I can't extract the checkbox data. I have tried iTextSharp and another open-source tool, but I was unable to get the check status (true or false).
-
PDFBox is Java-based. The C# port is hopelessly outdated and not supported. Regarding iText, how about this? https://stackoverflow.com/questions/29647712/how-to-check-checkbox-on-itextpdf – Tilman Hausherr Sep 14 '22 at 08:51
-
You can use the method [GetValueAsString()](https://api.itextpdf.com/iText7/dotnet/7.2.3/classi_text_1_1_forms_1_1_fields_1_1_pdf_form_field.html#a36c164a7c1abedfdf567478737def985) to see what is assigned to the checkbox. Typically it will be "Yes" if the box is checked (and "Off" or empty if unchecked). You can use [GetAppearanceStates()](https://api.itextpdf.com/iText7/dotnet/7.2.3/classi_text_1_1_forms_1_1_fields_1_1_pdf_form_field.html#adbd9a77b83f868b81affd09201167d71) to check the possible values. – André Lemos Sep 14 '22 at 11:21
1 Answer
My C# is rusty, but using the latest version of iText, it should be something like this:
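    using System;
    using System.Collections.Generic;
    using iText.Forms;
    using iText.Forms.Fields;
    using iText.Kernel.Pdf;

    PdfDocument doc = new PdfDocument(new PdfReader(@"c:\temp\form.pdf"));
    // Returns null when the document has no AcroForm (false = do not create one).
    PdfAcroForm form = PdfAcroForm.GetAcroForm(doc, false);
    if (form != null)
    {
        foreach (KeyValuePair<string, PdfFormField> entry in form.GetFormFields())
        {
            PdfFormField field = entry.Value;
            // PdfButtonFormField also covers radio buttons and push buttons.
            if (field is PdfButtonFormField)
            {
                Console.WriteLine(entry.Key + " has " + field.GetValueAsString());
            }
        }
    }
    doc.Close();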
Here GetValueAsString() will typically return "Yes" for a checked box, and "Off" or an empty string for an unchecked one.
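Since the question asks for a true/false status, the string value can be mapped to a bool. The following is only a minimal sketch: `CheckBoxValue` and `IsChecked` are hypothetical names, not part of iText, and it assumes that any value other than "Off", null, or an empty string means the box is checked (the on-state name is often "Yes", but it can differ from form to form).

```csharp
using System;

public static class CheckBoxValue
{
    // Hypothetical helper, not an iText API.
    // Assumption: "Off", null, or an empty string means unchecked;
    // any other value (commonly "Yes") is treated as checked.
    public static bool IsChecked(string value)
    {
        return !string.IsNullOrEmpty(value)
            && !string.Equals(value, "Off", StringComparison.Ordinal);
    }
}
```

With the loop above, it could be called as `CheckBoxValue.IsChecked(field.GetValueAsString())`. If a form uses an unusual on-state name, GetAppearanceStates() shows which values to expect.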

André Lemos
-
(This clearly assumes you mean an AcroForm, and not a "paper form" or a flattened version of an AcroForm.) – André Lemos Sep 14 '22 at 12:29