I'm trying to use iTextSharp to inspect some PDFs and check them for irregularities before they are printed. Part of this is checking the images in each PDF for their DPI, transparency and similar properties.
To do this, I loop through the pages and retrieve PdfObjects, which I cast to PRStream. I then retrieve the PdfName.SUBTYPE entry from the PRStream and check whether it matches PdfName.IMAGE.
This seems like a logical way to check whether the found objects are actually images, but I run into cases where the Subtype is empty. As a result, what appears to be an image in a PDF is not considered an image and is ignored. I have tested several PDFs of my own as well as PDFs found online.
Am I using the library incorrectly?
Code snippet:
// Get the object at index i in the reader's object collection
PdfObject pdfObject = pdfReader.GetPdfObject(i);
// Skip missing objects and objects that are not streams
if (pdfObject == null || !pdfObject.IsStream())
{
    continue;
}
PRStream prStream = (PRStream) pdfObject; // cast the object to a stream
PdfObject type = prStream.Get(PdfName.SUBTYPE); // get the Subtype entry
// Check whether the object is an image
if (type != null && type.ToString().Equals(PdfName.IMAGE.ToString()))
// This condition is false when I expect it to be true
EDIT: As requested, a PDF that I have used. In this case, there are several images on pages 2, 4, 5, 6 and 8. However, the code I run only recognises a single image, on page 5. Objects are found on pages 4 and 8, but the SUBTYPE of these objects is null.
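To narrow this down, a small diagnostic along the following lines might help. It is only a sketch, assuming iTextSharp 5.x: the loop bound (pdfReader.XrefSize), the class name StreamSubtypeDump and the command-line path argument are assumptions for illustration and are not part of the snippet above. It prints the Type and Subtype entries of every stream object, so the objects whose SUBTYPE comes back null can be inspected:

```csharp
using System;
using iTextSharp.text.pdf;

// Hypothetical helper: lists the Type and Subtype of every stream object in a PDF.
class StreamSubtypeDump
{
    static void Main(string[] args)
    {
        // args[0] is assumed to be the path of the PDF to inspect
        PdfReader pdfReader = new PdfReader(args[0]);
        try
        {
            // Assumption: iterate over every entry of the cross-reference table
            for (int i = 0; i < pdfReader.XrefSize; i++)
            {
                PdfObject pdfObject = pdfReader.GetPdfObject(i);
                // Skip missing objects and objects that are not streams
                if (pdfObject == null || !pdfObject.IsStream())
                {
                    continue;
                }

                PRStream prStream = (PRStream) pdfObject;
                PdfObject type = prStream.Get(PdfName.TYPE);
                PdfObject subtype = prStream.Get(PdfName.SUBTYPE);

                // Print "(none)" when an entry is absent from the stream dictionary
                Console.WriteLine("Object {0}: Type={1}, Subtype={2}",
                    i,
                    type == null ? "(none)" : type.ToString(),
                    subtype == null ? "(none)" : subtype.ToString());
            }
        }
        finally
        {
            pdfReader.Close();
        }
    }
}
```

Streams that print "(none)" for Subtype are not necessarily images; the output is only meant to show what the code actually encounters in the file.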