I am using iText 7 to validate a PDF that a user has downloaded from the internet, and I have run into a problem.

For my test case, I created a text file with garbage in it and saved it with a .pdf extension, so I know it is not a valid PDF.

In the code, I try to open the file with `PdfReader`.

An exception is thrown, which is expected. When debugging, the `reader` variable is null when execution reaches the `finally` block, so `reader.Close()` never runs. I am even copying the file to a temp directory to make sure nothing else is holding it.

After the exception, I cannot delete the file, either in code or manually in File Explorer. Here is some of my code, with everything but the reader part removed. This is the code after I tried a few things, so it shows my attempt with the file copied to a temp file. I attempt to delete the temp file in the `finally` block, and that fails for a corrupt file.

Here are both exceptions thrown when attempting to validate a bad PDF. The first is from the `PdfReader` constructor call:

2021-04-09 13:18:11,079 ERROR GUI.Form1 - PDF header not found.
iText.IO.IOException: PDF header not found.
   at iText.IO.Source.PdfTokenizer.GetHeaderOffset()
   at iText.Kernel.Pdf.PdfReader.GetOffsetTokeniser(IRandomAccessSource byteSource)
   at iText.Kernel.Pdf.PdfReader..ctor(String filename, ReaderProperties properties)
   at iText.Kernel.Pdf.PdfReader..ctor(FileInfo file)
   at GUI.Form1.validatePDF(FileInfo pdfFile, HashSet`1 tmpMd5s)

The second is from the attempt to delete the temp file:

2021-04-09 13:18:11,116 ERROR GUI.Form1 - The process cannot access the file 'C:\Users\ret63\AppData\Local\Temp\tmp27DE.tmp' because it is being used by another process.
System.IO.IOException: The process cannot access the file 'C:\Users\ret63\AppData\Local\Temp\tmp27DE.tmp' because it is being used by another process.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileInfo.Delete()
   at GUI.Form1.validatePDF(FileInfo pdfFile, HashSet`1 tmpMd5s)

PdfDocument pdfDoc = null;
PdfReader reader = null;

try
{
    using (reader = new PdfReader(testFile))
    {
        //pdfDoc = new PdfDocument(reader);
        //pdfDoc = new PdfDocument(new PdfReader(pdfFile.FullName));
        //Console.WriteLine("Number of Pages: " + pdfDoc.GetNumberOfPages());
        //pdfDoc.Close();
    }
}
catch(Exception ex)
{
    log.Error(ex.Message, ex);
    throw new Exception("Invalid PDF File: " + pdfFile.Name);
}
finally
{
    if (reader != null)
    {
        reader.Close();
    }
    if (pdfDoc != null && !pdfDoc.IsClosed())
    {
        pdfDoc.Close();
    }

    try
    {
        if (testFile.Exists)
        {
            testFile.Delete();
        }
    }
    catch (Exception ee)
    {
        Console.WriteLine(ee.Message);
    }
}
KyleMit
EricT
  • If `new PdfReader(testFile)` throws an exception, the assignment to `reader` will never happen, so that explains why it's null. You don't need to worry about that. If the `PdfReader` constructor opened any resources before throwing an exception, it's responsible for closing them. – Kevin Krumwiede Apr 09 '21 at 17:08
  • OK, but why is the file locked if the `PdfReader` object is null? The `PdfReader` call is what locks the file: it is not locked if I do everything the same except call `PdfReader`. – EricT Apr 09 '21 at 17:11
  • That's the interesting part of your question. Can you include the error message from `ee`? – Kevin Krumwiede Apr 09 '21 at 17:13
  • I updated the original post with the exceptions. – EricT Apr 09 '21 at 17:28
  • **See Also**: [iTextSharp exception: PDF header signature not found](https://stackoverflow.com/q/10621936/1366033) – KyleMit Dec 30 '21 at 20:33

1 Answer

This looks like an iText bug. If you trace what the `PdfReader` constructor calls, you see that it creates a `FileStream` that is conditionally locked. The `FileStream` gets wrapped in a `RandomAccessSource`, which is then wrapped in a `PdfTokenizer` in `GetOffsetTokeniser`. If `GetHeaderOffset` throws on line 1433, the `tok` local is never closed, so the file handle it wraps stays open.
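The failure mode described above can be reproduced without iText. The following is a minimal sketch, not iText's actual code: `HeaderCheckingReader`, its `closeOnFailure` flag, and `Demo` are hypothetical names invented for illustration. It shows a constructor that takes ownership of a stream, validates a header, and either leaves the stream open or closes it when validation throws.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for a reader that validates a header in its constructor.
public sealed class HeaderCheckingReader : IDisposable
{
    private readonly Stream _source;

    public HeaderCheckingReader(Stream source, bool closeOnFailure)
    {
        _source = source;
        try
        {
            // Read up to 5 bytes; Stream.Read may return fewer than requested.
            var header = new byte[5];
            int total = 0;
            while (total < header.Length)
            {
                int n = _source.Read(header, total, header.Length - total);
                if (n == 0) break;
                total += n;
            }
            if (total < header.Length || Encoding.ASCII.GetString(header) != "%PDF-")
            {
                throw new IOException("PDF header not found.");
            }
        }
        catch
        {
            // The caller never receives the object when the constructor throws,
            // so only the constructor can release what it already acquired.
            if (closeOnFailure) _source.Dispose();
            throw;
        }
    }

    public void Dispose() => _source.Dispose();
}

public static class Demo
{
    // Reports whether the stream is still open after construction fails.
    public static bool StreamStillOpenAfterFailure(bool closeOnFailure)
    {
        var stream = new MemoryStream(Encoding.ASCII.GetBytes("garbage"));
        try
        {
            using (var reader = new HeaderCheckingReader(stream, closeOnFailure)) { }
        }
        catch (IOException)
        {
            // Expected: the input does not start with "%PDF-".
        }
        return stream.CanRead; // MemoryStream.CanRead is false once disposed
    }
}
```

With `closeOnFailure` set to `false`, the stream outlives the failed constructor, which is the in-memory analogue of the file handle that stays locked. With `true`, the stream is released before the exception propagates.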

Kevin Krumwiede
  • This is what I was thinking as well. I am currently ignoring the problem and leaving the corrupted files in the temp folder. I will look into asking iText to check the bug. – EricT Apr 09 '21 at 18:19
  • @EricT You could use the constructor that takes a `Stream` and close it yourself. – Kevin Krumwiede Apr 09 '21 at 21:18
  • Using a stream worked. Thanks for the suggestion. – EricT Apr 12 '21 at 17:00
  • This should be [fixed in _develop_](https://github.com/itext/itext7-dotnet/commit/f8f21889beda2ee83e3795980e323dd1cb975276#diff-72123e52f0c329bd4de5ab82a6f4ee9adbe15c13c3580425808f4fecfaf387fd). The fix will be included in the next release. – rhens May 06 '21 at 17:33
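The workaround from the comments (pass a `Stream` to `PdfReader` and close it yourself) might look roughly like the sketch below. It assumes the iText 7 for .NET package is referenced. `PdfValidation` and `IsReadablePdf` are hypothetical names, and catching every `Exception` is a simplification because the exact exception types vary between iText versions.

```csharp
using System;
using System.IO;
using iText.Kernel.Pdf;

public static class PdfValidation
{
    // Hypothetical helper: returns true if iText can open the file as a PDF.
    public static bool IsReadablePdf(FileInfo file)
    {
        try
        {
            // This code owns the FileStream, so the handle is released by the
            // using block even when the PdfReader constructor throws.
            using (var stream = new FileStream(file.FullName, FileMode.Open,
                                               FileAccess.Read, FileShare.Read))
            using (var reader = new PdfReader(stream))
            using (var pdfDoc = new PdfDocument(reader))
            {
                return pdfDoc.GetNumberOfPages() > 0;
            }
        }
        catch (Exception)
        {
            // Assumption: any failure to open the file means "not a valid PDF".
            return false;
        }
    }
}
```

Because the `FileStream` is disposed before `IsReadablePdf` returns, a later `testFile.Delete()` should no longer hit the "being used by another process" error, whether or not the installed iText version contains the fix.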