iTextSharp HTMLWorker.Parse error

Question

I have a problem with HTMLWorker.Parse from iTextSharp in a Windows Forms program. Every time I execute the code and it reaches HTMLWorker.Parse, it throws an ObjectDisposedException. The exception says that it cannot access a closed file, but I have checked many times and cannot find which file is closed. Here is the code:

class HtmlToPdfConverter
{
    private iTextSharp.text.Document doc = new iTextSharp.text.Document();

    public HtmlToPdfConverter()
    {
        this.doc.SetPageSize(PageSize.A4);
    }

    public string Run(string html, string pdfName)
    {
        try
        {
            using (doc)
            {
                StyleSheet styles = new StyleSheet();
                using (PdfWriter writer = PdfWriter.GetInstance(this.doc, new FileStream(@"Z:\programs\" + pdfName + ".pdf", FileMode.Create)))
                {
                    this.doc.Open();
                    this.doc.OpenDocument();
                    this.doc.NewPage();
                    if (this.doc.IsOpen() == true)
                    {
                        StringReader reader = new StringReader(html);
                        //XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, reader);
                        this.doc.Add(new Paragraph(" "));
                        HTMLWorker worker = new HTMLWorker(this.doc);
                        worker.Open();
                        worker.StartDocument();
                        worker.NewPage();
                        worker.Parse(reader);
                        worker.SetStyleSheet(styles);

                        List<IElement> ie = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(reader, null);

                        foreach (IElement element in ie)
                        {
                            this.doc.Add((IElement)element);
                        }

                        worker.EndDocument();
                        worker.Close();
                    }
                }
            }
            return string.Empty;
        }
        catch (Exception ex)
        {
            return ex.Message;
        }
    }
}

This is the exception:

System.ObjectDisposedException was caught
  Message=Cannot access a closed file.
  Source=mscorlib
  ObjectName=""
  StackTrace:
       at System.IO.__Error.FileNotOpen()
       at System.IO.FileStream.Write(Byte[] array, Int32 offset, Int32 count)
       at iTextSharp.text.pdf.OutputStreamCounter.Write(Byte[] buffer, Int32 offset, Int32 count)
       at iTextSharp.text.pdf.PdfIndirectObject.WriteTo(Stream os)
       at iTextSharp.text.pdf.PdfWriter.PdfBody.Add(PdfObject objecta, Int32 refNumber, Boolean inObjStm)
       at iTextSharp.text.pdf.PdfWriter.PdfBody.Add(PdfObject objecta, Int32 refNumber)
       at iTextSharp.text.pdf.PdfWriter.PdfBody.Add(PdfObject objecta, PdfIndirectReference refa)
       at iTextSharp.text.pdf.PdfWriter.AddToBody(PdfObject objecta, PdfIndirectReference refa)
       at iTextSharp.text.pdf.Type1Font.WriteFont(PdfWriter writer, PdfIndirectReference piref, Object[] parms)
       at iTextSharp.text.pdf.FontDetails.WriteFont(PdfWriter writer)
       at iTextSharp.text.pdf.PdfWriter.AddSharedObjectsToBody()
       at iTextSharp.text.pdf.PdfWriter.Close()
       at iTextSharp.text.DocWriter.Dispose()
       at WebPageExtraction.HtmlToPdfConverter.Run(String html, String pdfName)
  InnerException:

Better to switch to iText 7 and refer to https://stackoverflow.com/a/57251780/14784590 — Reejesh, Jul 04 '23 at 07:04

Shadow The GPT Wizard · Accepted Answer · 2012-09-03T11:58:19.470

You are calling the close methods after the document has already been disposed.

The using block disposes the object automatically, so just remove these two lines:

doc.CloseDocument();
doc.Close();

If you don't trust the internal dispose code to properly close the document and want to do that yourself anyway, do it inside the using block:

using (doc)
{
    StyleSheet styles = new StyleSheet();
    using (PdfWriter writer = PdfWriter.GetInstance(this.doc, new FileStream(@"Z:\programs\" + pdfName + ".pdf", FileMode.Create)))
    {
        //.....
    }
    doc.CloseDocument();
    doc.Close();
}
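
As a minimal sketch of what "already disposed" means here, the snippet below uses only System.IO (no iTextSharp) and writes to a FileStream after its using block has ended. The class and method names are hypothetical, and the temporary file exists only for the demonstration.

```csharp
using System;
using System.IO;

static class DisposedStreamDemo
{
    // Returns true if writing to the stream after its using block
    // throws ObjectDisposedException.
    public static bool WriteAfterDisposeThrows()
    {
        string path = Path.GetTempFileName(); // throwaway file for the demo
        FileStream stream = new FileStream(path, FileMode.Create);
        try
        {
            using (stream)
            {
                stream.Write(new byte[] { 1 }, 0, 1); // fine: still open
            } // Dispose() runs here and closes the file

            try
            {
                stream.Write(new byte[] { 2 }, 0, 1); // stream is closed now
                return false;
            }
            catch (ObjectDisposedException)
            {
                return true;
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Calling `DisposedStreamDemo.WriteAfterDisposeThrows()` from a Main method shows the same kind of exception as in the question: the object still exists, but the file handle behind it has been released.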

Edit: after trying your code for myself I noticed some more problems and found the real reason for the error you got:

You are closing and disposing the field doc and never creating a new instance, so a second call to Run would work on a document that is already disposed.
You don't dispose of all disposable objects, which might lead to a memory leak or a locked file.
The error occurs because, by default, PdfWriter closes the stream it writes to, and when the writer is disposed it tries to use that stream again. To solve this, close the stream yourself and tell the writer not to close it.
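
The stream-ownership idea in the last point can be sketched without iTextSharp. The snippet below is only an analogy, assuming .NET 4.5 or later: StreamWriter's `leaveOpen` constructor argument plays the role that `CloseStream = false` plays for PdfWriter, so the code that created the stream stays responsible for closing it. The class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

static class StreamOwnershipDemo
{
    // Writes through a StreamWriter that does not own the stream, then keeps
    // using the stream after the writer has been disposed.
    public static long WriteAndKeepStreamOpen()
    {
        using (MemoryStream stream = new MemoryStream())
        {
            // Last argument is leaveOpen: disposing the writer flushes it
            // but leaves the underlying stream open.
            using (StreamWriter writer = new StreamWriter(stream, Encoding.ASCII, 1024, true))
            {
                writer.Write("hello");
            }

            // Still usable here; with leaveOpen = false this call would
            // throw ObjectDisposedException.
            stream.WriteByte((byte)'!');
            return stream.Length;
        } // the outer using block is the single owner that closes the stream
    }
}
```

The design choice is that exactly one using block owns the stream, so nothing can try to write to it after it has been closed.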

Complete fixed code:

Document doc = new Document();
StyleSheet styles = new StyleSheet();
string filePath = @"Z:\programs\" + pdfName + ".pdf";
using (FileStream pdfStream = new FileStream(filePath, FileMode.Create))
{
    using (PdfWriter writer = PdfWriter.GetInstance(doc, pdfStream))
    {
        writer.CloseStream = false; // the using block around pdfStream closes the stream, so the writer must not
        doc.Open();
        doc.OpenDocument();
        doc.NewPage();
        if (doc.IsOpen() == true)
        {
            using (StringReader reader = new StringReader(html))
            {
                //XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, reader);
                doc.Add(new Paragraph(" "));
                using (HTMLWorker worker = new HTMLWorker(doc))
                {
                    worker.Open();
                    worker.StartDocument();
                    worker.NewPage();
                    worker.Parse(reader);
                    worker.SetStyleSheet(styles);
                    List<IElement> ie = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(reader, null);
                    foreach (IElement element in ie)
                    {
                        doc.Add((IElement)element);
                    }
                    worker.EndDocument();
                    worker.Close();
                }
            }
        }
        writer.Close();
    }
}

doc.CloseDocument();
doc.Close();
doc.Dispose();
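
The first point of the edit, a disposable field that is disposed inside the method and never recreated, can be seen in isolation with the sketch below. It uses a MemoryStream as a stand-in for the Document, and both class names are hypothetical.

```csharp
using System;
using System.IO;

// Mirrors the question's shape: one disposable field, disposed inside Run.
class ReusesDisposedField
{
    private readonly MemoryStream buffer = new MemoryStream();

    public string Run()
    {
        try
        {
            using (buffer) { buffer.WriteByte(1); }
            return string.Empty;
        }
        catch (ObjectDisposedException ex)
        {
            return ex.GetType().Name;
        }
    }
}

// Mirrors the fixed code: a fresh instance is created on every call.
class CreatesFreshInstance
{
    public string Run()
    {
        try
        {
            using (MemoryStream buffer = new MemoryStream()) { buffer.WriteByte(1); }
            return string.Empty;
        }
        catch (ObjectDisposedException ex)
        {
            return ex.GetType().Name;
        }
    }
}
```

Calling `Run()` twice on one `ReusesDisposedField` instance succeeds the first time and reports an ObjectDisposedException the second time, while `CreatesFreshInstance` succeeds on every call.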

I added those doc.Close and doc.CloseDocument calls as an extra, to see whether that would work. I have tried your solution, but it still doesn't work. Thank you for helping. — Emon, Sep 03 '12 at 11:04
Yes, found the real reason. See my edit. The critical change is adding `writer.CloseStream = false;` — Shadow The GPT Wizard, Sep 03 '12 at 12:00
Now it gives another exception: a WebException saying that it cannot find the network path. This version also stops at worker.Parse; do you know if there's something wrong with that method in iTextSharp? It doesn't give the other exception anymore. Thank you for helping me. — Emon, Sep 03 '12 at 12:29
Maybe `pdfName` is empty? Try hardcoding a path, e.g. `@"Z:\programs\myfile.pdf"`, and see if it works. — Shadow The GPT Wizard, Sep 03 '12 at 12:33
Sorry, no more ideas - try a different path then, e.g. `C:\Temp\myfile.pdf` — Shadow The GPT Wizard, Sep 03 '12 at 12:37
I have tried that; still, thanks for trying to help me. I will look for another way to convert HTML to PDF in a Windows Forms application. — Emon, Sep 03 '12 at 12:40
Well, it works for me, so try asking about this new error in a new question, and post the new stack trace and everything. — Shadow The GPT Wizard, Sep 03 '12 at 12:50
Wait... if you say it happens in the `.Parse()` it means something in your HTML is wrong (not related at all to the PDF path). Can you post the HTML string as-is? If too long then add this into a new question and let me know. — Shadow The GPT Wizard, Sep 03 '12 at 12:51
