
I have an existing PDF file, and I want to use iTextSharp to test whether it is PDF/A compliant.

I don't want to convert or create a file, just read one and check whether it is PDF/A.

I have not tried anything because I did not find any methods or properties of the iTextSharp class PdfReader that indicate whether the PDF is PDF/A. For now, it would be enough to know how to verify that the document claims to be PDF/A compatible.

Thanks, Antonio

Joris Schellekens
Antonio
  • Please describe what you have currently tried. – Sled Dec 20 '12 at 17:45
  • It is fairly easy to check whether the document *claims* to be PDF/An-m compliant using any PDF library including iText and iTextSharp. It is way more difficult to check whether the document actually *is* PDF/An-m compliant, and iText(Sharp) does not (yet) have convenience methods for that test. Which variant do you need? – mkl Dec 21 '12 at 00:26
  • iText(Sharp) doesn't do preflighting. What you're asking for is beyond the current scope of iText(Sharp) and it's not on our Technical Roadmap either. – Bruno Lowagie Dec 22 '12 at 13:51
  • I have not tried anything because I did not find any methods or properties of the iTextSharp class PdfReader that indicate whether the PDF is PDF/A. For now, it would be enough to know how to verify that the document claims to be PDF/A compatible. – Antonio Dec 28 '12 at 08:59

1 Answer

After a long search I tried the following approach, and it seems to work:

    ' Open the PDF and read its raw XMP metadata stream (Nothing if absent).
    Dim reader As iTextSharp.text.pdf.PdfReader = New iTextSharp.text.pdf.PdfReader(sFilePdf)
    Dim yMetadata As Byte() = reader.Metadata
    Dim bPDFA As Boolean = False

    If Not yMetadata Is Nothing Then
        ' XMP packets are normally UTF-8 encoded.
        Dim sXmlMetadata As String = System.Text.Encoding.UTF8.GetString(yMetadata)

        Dim xmlDoc As Xml.XmlDocument = New Xml.XmlDocument()
        xmlDoc.LoadXml(sXmlMetadata)
        ' Look for the claimed PDF/A conformance level; only level "A" is accepted here.
        Dim nodes As Xml.XmlNodeList = xmlDoc.GetElementsByTagName("pdfaid:conformance")
        If nodes.Item(0).FirstChild.Value.ToUpper = "A" Then
            bPDFA = True
        End If
    End If

    reader.Close()
    Return bPDFA

I also found some references to the class XmpReader, but they were not sufficient to do what I wanted.

Antonio
  • You are aware that you only test whether a document **claims to be** PDF/A compliant while you originally asked for a way to test whether it actually **is** compliant, aren't you? Furthermore your XML parsing looks quite optimistic concerning XML generation practices. – mkl Dec 29 '12 at 00:50
  • Yes, in the title I wrote "compliant", but I also wrote in the message: "For now it would be enough to know how to verify that the document claims to be PDF/A compatible". – Antonio Jan 21 '13 at 13:38
  • In your opinion, what is the best way to parse the XML? Thanks! – Antonio Jan 21 '13 at 13:40
  • By using GetElementsByTagName you collect all such elements, wherever they are located. Such finds probably weren't meant to refer to this document after all! And you assume there always is a conformance element (by calling `nodes.Item(0).FirstChild` without prior checks). Furthermore the PDF/A specification indicates that **pdfaid:part** references the *PDF/A version identifier* and **pdfaid:conformance** the *PDF/A conformance level*, **A** or **B** in case of PDF/A-1. Your code, therefore, only recognizes the claims for PDF/A-1a, PDF/A-2a, etc. but not for e.g. PDF/A-1b. – mkl Jan 21 '13 at 15:30
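
The points raised in the last comment could be addressed along the following lines. This is only a minimal sketch, not a validator: it reads the *claim* from the XMP metadata and nothing more. The module name `PdfAClaim` and the functions `GetPdfAClaim` and `ReadProperty` are hypothetical names chosen for illustration. The sketch assumes the PDF/A identification properties live in the namespace `http://www.aiim.org/pdfa/ns/id/`, that they sit on a top-level `rdf:Description` (either as attributes or as child elements), and that a missing conformance level should not prevent the part from being reported.

```vbnet
Imports System.IO
Imports System.Xml

Module PdfAClaim
    Private Const RdfNs As String = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    ' Assumed namespace URI of the pdfaid identification schema.
    Private Const PdfaIdNs As String = "http://www.aiim.org/pdfa/ns/id/"

    ' Returns the claimed flavour, e.g. "PDF/A-1B", or Nothing if no claim is found.
    Function GetPdfAClaim(xmpBytes As Byte()) As String
        If xmpBytes Is Nothing OrElse xmpBytes.Length = 0 Then Return Nothing

        Dim doc As New XmlDocument()
        Try
            Using ms As New MemoryStream(xmpBytes)
                doc.Load(ms) ' the XML parser detects the encoding itself
            End Using
        Catch ex As XmlException
            Return Nothing ' malformed metadata: treat as "no claim"
        End Try

        ' Match by namespace URI rather than by the "pdfaid:" prefix,
        ' and only look at rdf:Description elements directly under rdf:RDF.
        For Each node As XmlNode In doc.GetElementsByTagName("Description", RdfNs)
            Dim parent As XmlNode = node.ParentNode
            If parent Is Nothing OrElse parent.LocalName <> "RDF" OrElse parent.NamespaceURI <> RdfNs Then
                Continue For
            End If

            Dim desc As XmlElement = CType(node, XmlElement)
            Dim part As String = ReadProperty(desc, "part")
            If String.IsNullOrEmpty(part) Then Continue For

            Dim conformance As String = ReadProperty(desc, "conformance")
            Return "PDF/A-" & part & If(conformance, "").ToUpperInvariant()
        Next

        Return Nothing
    End Function

    ' Reads a pdfaid property written either as an attribute or as a child element.
    Private Function ReadProperty(desc As XmlElement, name As String) As String
        If desc.HasAttribute(name, PdfaIdNs) Then
            Return desc.GetAttribute(name, PdfaIdNs).Trim()
        End If
        For Each child As XmlNode In desc.ChildNodes
            If child.NodeType = XmlNodeType.Element AndAlso
               child.LocalName = name AndAlso child.NamespaceURI = PdfaIdNs Then
                Return child.InnerText.Trim()
            End If
        Next
        Return Nothing
    End Function
End Module
```

With the answer's code, this would be called as `GetPdfAClaim(reader.Metadata)`. A result of `Nothing` means only that no claim was found; a non-empty result still says nothing about whether the file actually conforms.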