
I am using WebBrowser in multiple threads. However, after some number of iterations (anywhere from 50 to 10000+), I get an AccessViolationException.

The related parts of the code:

Starting thread:

    var thread = new Thread(() =>
    {
        ProcessingThread();
    });
    thread.SetApartmentState(ApartmentState.STA);
    thread.Start();

Processing Thread:

    void ProcessingThread()
    {
        WebBrowser webBrowser = new WebBrowser();
        webBrowser.ScriptErrorsSuppressed = true;
        while (!Shutdown)
        {
            string htmlstring = GetHTMLString();
            webBrowser.DocumentText = htmlstring;
            webBrowser.Document.OpenNew(true);
            webBrowser.Document.Write(htmlstring);
            webBrowser.Refresh();
            HtmlDocument doc = webBrowser.Document;

            //Do Work
        }
    }

There are usually 2 to 8 such threads running at the same time.

The access violation is consistently thrown on this line:

    webBrowser.Document.OpenNew(true);

I have read many similar questions but could not find a solution to my issue.

I want to find out what causes the exception and how to fix it.

As of now, I am using WinForms and Visual Studio 2015 Update 3.

Exception string:

    Exception thrown: 'System.AccessViolationException' in System.Windows.Forms.dll

Another thing I noticed is that the higher the .NET version I target, the fewer iterations the thread above completes before throwing the exception.

For example, across ten runs, it completes 1000 to 10000 iterations on .NET 4.5 (10000 is the absolute maximum across all my tests; the usual maximum is around 5000) and 70 to 1500 iterations on .NET 4.6.1.

I have tried:

  • Enabling the native code debugging option, but the debugger still points to the same line.
  • Changing the Platform Target, without any noticeable result. Currently, it is x86.
  • Turning code optimizations on and off.
  • Running without the debugger.
  • Changing the target framework, to no avail. Currently, it is .NET 4.5.2.
  • Running and debugging the application on another machine.
I also know about WebRequest, WebClient, and HtmlAgilityPack, but I am using WebBrowser for its JavaScript support.

Murumuru
  • Stupid question : why do you need 2-8 browsers in separate threads? I'm genuinely curious. – Noémie Lord Feb 16 '17 at 21:28
  • @FrancisLord, parsing HTML. Some of the pages generate the necessary data using JavaScript, and using WebBrowser is the only way I can think of to get that data. Of course, I could try a different browser (Awesomium, for example), but I do not really want to use 3rd party tools unless necessary for this task, plus I really want to know what is wrong with the current access violation exception. It should also be possible to put multiple WebBrowser controls on a form and use them, but I think it might be worse than what I am doing now. – Murumuru Feb 16 '17 at 21:40
  • Have you tried setting your thread as MTA instead of STA? – Noémie Lord Feb 16 '17 at 21:42
  • From what I quickly searched online, this should basically be either from you breaking some COM contracts of the threads, or from an Access Violation in unmanaged code (which you can't do anything about) – Noémie Lord Feb 16 '17 at 21:50
  • you may then wanna look at what is said here, this might help you a bit : http://stackoverflow.com/a/4156000/4064630 . TLDR : setting a thread as STA means you have to respect some rules set by COM+ otherwise, I believe an AccessViolation is what you get. That or WebBrowser is throwing an error in unmanaged code, which bubbles up as an AccessViolation in .net – Noémie Lord Feb 16 '17 at 21:58
  • @FrancisLord, WebBrowser is an ActiveX control which requires STA, so I have to use STA. As for the latter part, it shouldn't be related to the rules, since I am doing pretty much the same thing as the working examples I have read. I will do some more testing and research on the message pump though. – Murumuru Feb 16 '17 at 22:08
  • that is a lot of code to duplicate the functionality of WebBrowser.DocumentText – Sheng Jiang 蒋晟 Feb 18 '17 at 05:54
  • @ShengJiang蒋晟, true, but I was using that mostly for testing purposes trying to figure out if doing one way or another would or would not cause the exception. I decided to put all the code in the question just in case. Or do you mean something else? – Murumuru Feb 18 '17 at 16:43
  • document.write is known to cause crashes in IE for certain HTML. Better use the built-in interface to make sure this isn't caused by document.write. Also for using STA components you have to run a message pump. – Sheng Jiang 蒋晟 Feb 19 '17 at 00:28
  • @ShengJiang蒋晟, I do have a message pump, and it does not seem to be the issue. It does not seem to be caused by Document.Write either, but I do not understand how I can test that. If you could link to or explain it, I would be grateful. As of now I have found a passable way to make it work, however, I am still testing it. – Murumuru Feb 19 '17 at 16:45
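
For readers unfamiliar with the message pump raised in the comments, here is a minimal sketch of one common way to host a WebBrowser on a dedicated STA thread. The question's code does not show a pump (the asker states that one exists), so this is not the asker's implementation. The names `StaBrowserHost`, `Start`, and `work` are hypothetical, and the sketch assumes DocumentCompleted fires once for the assigned HTML.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

static class StaBrowserHost
{
    // Hypothetical helper: loads html in a WebBrowser on its own STA thread,
    // runs a message loop so the control can process its events, and calls
    // work once the document has finished loading.
    public static Thread Start(string html, Action<HtmlDocument> work)
    {
        var thread = new Thread(() =>
        {
            using (var webBrowser = new WebBrowser { ScriptErrorsSuppressed = true })
            {
                webBrowser.DocumentCompleted += (sender, e) =>
                {
                    try
                    {
                        work(webBrowser.Document);
                    }
                    finally
                    {
                        // Ends the message loop started by Application.Run() below.
                        Application.ExitThread();
                    }
                };

                // Loading is asynchronous; it completes while the loop is pumping.
                webBrowser.DocumentText = html;

                // Starts a standard message loop on this thread (the "pump").
                Application.Run();
            }
        });

        thread.SetApartmentState(ApartmentState.STA); // WebBrowser requires STA
        thread.IsBackground = true;
        thread.Start();
        return thread;
    }
}
```

A call might look like `StaBrowserHost.Start("<html><body>test</body></html>", doc => Console.WriteLine(doc.Body.InnerText));`. Reading the document inside DocumentCompleted, rather than immediately after assigning DocumentText, avoids touching a document that is still loading.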

1 Answer


In my case, a passable solution is to handle the AccessViolationException by applying the System.Runtime.ExceptionServices.HandleProcessCorruptedStateExceptions attribute to the method.

After some testing, I found that simply catching the exception and retrying the method that threw it works: execution continues without any further unexpected exceptions.
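
As a rough illustration of the catch-and-retry approach described above, here is a minimal sketch for .NET Framework 4.x. The class name `BrowserLoader`, the method names, and the `maxAttempts` limit are assumptions made for the example and are not taken from my actual code. The `[SecurityCritical]` attribute is included because it is commonly paired with `[HandleProcessCorruptedStateExceptions]`.

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Security;
using System.Windows.Forms;

static class BrowserLoader
{
    // Hypothetical helper: writes html into the browser's document and
    // retries if an AccessViolationException is thrown.
    // maxAttempts is an assumed safety limit so a persistent failure
    // cannot loop forever.
    public static HtmlDocument LoadWithRetry(WebBrowser webBrowser, string html, int maxAttempts = 3)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (TryLoad(webBrowser, html))
            {
                return webBrowser.Document;
            }
        }

        throw new InvalidOperationException(
            "Document could not be loaded after " + maxAttempts + " attempts.");
    }

    // The attribute must be on the method that contains the catch block;
    // without it, .NET Framework 4+ does not deliver AccessViolationException
    // to managed catch handlers.
    [HandleProcessCorruptedStateExceptions]
    [SecurityCritical]
    private static bool TryLoad(WebBrowser webBrowser, string html)
    {
        try
        {
            webBrowser.Document.OpenNew(true);
            webBrowser.Document.Write(html);
            return true;
        }
        catch (AccessViolationException)
        {
            // Corrupted-state exception raised from unmanaged code;
            // report failure so the caller can retry.
            return false;
        }
    }
}
```

Inside the processing loop, the OpenNew and Write calls would then be replaced by something like `HtmlDocument doc = BrowserLoader.LoadWithRetry(webBrowser, htmlstring);`. Keep in mind that an access violation can mean process memory is already corrupted, so continuing after catching it is a workaround, not a fix.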

I have tested it on more than 100000 executions, so I consider it a pass.

I am posting this because it works for me in this case and might be useful to someone else. However, I am not marking it as the accepted answer because I still do not understand why the exception occurs or whether there is a better way to handle or prevent it.

Murumuru