This is a case of OCR gone wrong. I need to remove the hidden text from a PDF and I'm having a hard time figuring out how to do it.
The hidden text resides in an area always named /QuickPDFsomething which is under and /XObject dictionary that resides in the page's /Resources dictionary.
I have tried these two things and neither has worked so I'm clearly doing something wrong.
Option 1 - Kill obj - The PDF won't open in Acrobat and states, 'An error exists on this page. Acrobat may not display the page correctly' but it looks ok. Pitstop pukes with 'Critical parser failure: XObject resource missing'.
PdfReader.KillIndirect(obj);
oPdfFile.GetPdfReader().RemoveUnusedObjects();
var stamper = new PdfStamper(oPdfFile.GetPdfReader(), new FileStream(@"C:\temp.pdf", FileMode.Create));
stamper.Close();
Option 2 - CleanupProcessor - Throws an exception about 'A Graphics object cannot be created from an image that has an indexed pixel format'.
var stamper = new PdfStamper(oPdfFile.GetPdfReader(), new FileStream(@"C:\temp.pdf", FileMode.Create));
var cleanupLocations = new List<PdfCleanUpLocation>();
var pageRect = oPdfFile.GetPdfReader().GetCropBox(1);
cleanupLocations.Add(new PdfCleanUpLocation(1, pageRect));
PdfCleanUpProcessor cleaner = new PdfCleanUpProcessor(cleanupLocations, stamper);
cleaner.CleanUp();
stamper.Close();
I'd like to remove the /QuickPDF object (41 0 R, in this image) as well as remove it from the content stream that calls it with /QuickPDF Do.
Unfortunately I cannot provide the PDF.
Any tips on how to do this?