0

I have been unable to get this working for the past week. I want to click links on multiple web pages using the WebBrowser control in C#. The code is below; please help me with this.

public void DoDelete()
{
    int count = 0;

    if (corruptList.Count > 0)
    {
        foreach (string listItem in corruptList)
        {
            var th = new Thread(() =>
            {
                try
                {
                    WebBrowser webBrowser = new WebBrowser();
                    webBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBroswer_DocumentCompleted);
                    webBrowser.Navigate(listItem);
                    Thread.Sleep(100);
                    webBrowser.Dispose();
                }
                catch (Exception ex)
                {
                    throw ex;
                }
                this.Invoke(new MethodInvoker(delegate
                {
                    dataGridView_CorruptLinks.Rows[count].Cells[2].Value = "Deleted";
                }));
            });
            th.SetApartmentState(ApartmentState.STA);
            th.Start();
            Thread.Sleep(100);
        }
        count++;
    }
}


void webBroswer_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    try
    {
        WebBrowser webBrowser = sender as WebBrowser;
        HtmlElementCollection ec = webBrowser.Document.GetElementsByTagName("a");
        foreach (HtmlElement item in ec)
        {
            if (item.InnerHtml == "Delete this invalid field")
            {
                item.InvokeMember("Click");
                break;
            }
        }
    }
    catch (Exception exp)
    {
    }
}
Suman Banerjee

3 Answers

1

Navigate is asynchronous, and you give it only 1/10 of a second (Thread.Sleep(100)) to complete before you call Dispose on the WebBrowser object. Your navigation and clicks probably take longer than that, so by the time they would run there is no browser left to act on. You're also swallowing all exceptions in the DocumentCompleted handler, which is a very bad thing to do. At the very least, add some debug logging there to help yourself diagnose the problem.

To keep similar logic, you should create a collection of web browsers at class level. Something like:

private List<WebBrowser> _myWebBrowsers;

Then add to this list in your loop but do not call Dispose. You should only dispose of the browser when you're done with it.

That should get you closer, though there are a few other potential issues with your code. You're allocating a browser object and a thread on every pass through the loop, which could quickly become unwieldy. You should use a thread management mechanism to throttle this process.
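As a rough illustration of throttling without extra threads, here is a minimal sketch that reuses a single WebBrowser and only moves on to the next URL once the current page has finished. It assumes it is created and used on the UI thread of a WinForms app, and that clicking the delete link triggers a normal page navigation (so a second DocumentCompleted fires). The class name SequentialDeleter and its members are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical helper: visits URLs one at a time on the UI thread.
class SequentialDeleter : IDisposable
{
    private readonly Queue<string> _pending;
    private readonly WebBrowser _browser = new WebBrowser();
    private bool _waitingForClickResult;

    // Raised once every URL has been processed.
    public event EventHandler Finished;

    public SequentialDeleter(IEnumerable<string> urls)
    {
        _pending = new Queue<string>(urls);
        _browser.DocumentCompleted += OnDocumentCompleted;
    }

    public void Start()
    {
        NavigateNext();
    }

    private void NavigateNext()
    {
        if (_pending.Count == 0)
        {
            Finished?.Invoke(this, EventArgs.Empty);
            return;
        }
        _browser.Navigate(_pending.Dequeue());
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted also fires for frames; only react to the top-level page.
        if (e.Url != _browser.Url) return;

        if (!_waitingForClickResult && TryClickDeleteLink())
        {
            // Assumption: the click navigates, so wait for the next DocumentCompleted.
            _waitingForClickResult = true;
            return;
        }

        _waitingForClickResult = false;
        NavigateNext();
    }

    private bool TryClickDeleteLink()
    {
        if (_browser.Document == null) return false;

        foreach (HtmlElement link in _browser.Document.GetElementsByTagName("a"))
        {
            // Link text taken from the question's handler.
            if (link.InnerHtml == "Delete this invalid field")
            {
                link.InvokeMember("click");
                return true;
            }
        }
        return false;
    }

    public void Dispose()
    {
        _browser.Dispose();
    }
}
```

From a form you would create it with the list of links, call Start() (for example from a button click handler), and dispose it when the form closes. If the delete link works through script instead of a navigation, the wait for the second DocumentCompleted would need to be replaced with something else, such as a timer.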

Simplified class:

class WebRunner
{
    private List<string> _corruptList = new List<string>();
    private List<WebBrowser> _browsers = new List<WebBrowser>();

    public void Run()
    {
        _corruptList.Add("http://google.com");
        _corruptList.Add("http://yahoo.com");
        _corruptList.Add("http://bing.com");

        DoDelete();

        Console.ReadKey();
    }

    public void DoDelete()
    {
        if (_corruptList.Count < 1) return;

        int counter = 1;

        foreach (string listItem in _corruptList)
        {
            WebBrowser webBrowser = new WebBrowser();
            _browsers.Add(webBrowser);
            webBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBroswer_DocumentCompleted);
            webBrowser.Navigated += new WebBrowserNavigatedEventHandler(webBrowser_Navigated);
            webBrowser.Navigate(listItem);
            if (counter % 10 == 0) Thread.Sleep(3000); // let app catch up every so often
            counter++;
        }
    }

    void webBrowser_Navigated(object sender, WebBrowserNavigatedEventArgs e)
    {
        Console.WriteLine("NAVIGATED: " + e.Url);
    }

    void webBroswer_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        Console.WriteLine("COMPLETED!");
        try
        {
            WebBrowser webBrowser = sender as WebBrowser;

            HtmlDocument doc = webBrowser.Document;
            var button = doc.Body.Document.GetElementById("button");
            button.InvokeMember("Click");

            _browsers.Remove(webBrowser);
        }
        catch (Exception exp)
        {
            Console.WriteLine(exp.StackTrace);
            MessageBox.Show(exp.Message);
        }
    }
}
Paul Sasik
  • Many thanks for the answer. I have a separate class for logging; the code above is just to show my problem, which is that the WebBrowser does not correctly call the event handler (webBroswer_DocumentCompleted) when a large number of pages (almost 6000) are requested. It works perfectly when I open pages one by one without the foreach. I hope you understand my scenario. – Muhammad Ali Sep 12 '11 at 12:53
  • Please help me in this regard – Muhammad Ali Sep 12 '11 at 16:07
  • Take a look at the class I added to my answer. It avoids creating threads (the web browser is extensively threaded, no need to add extra threads) and it also uses a throttling technique to keep too many actions from running at once. It's a much simpler implementation of your original concept which ought to be easy to update and use. – Paul Sasik Sep 12 '11 at 16:31
  • Actually, I have put the above code in a BackgroundWorker so that the form application runs smoothly. I tried your code but again had no success. I'm sorry, I guess it's my fault and I may be doing something wrong. – Muhammad Ali Sep 12 '11 at 16:54
  • The only challenge I'm facing is that I can't debug webBroswer_DocumentCompleted; the debugger never enters that method, which is why I'm so worried. What am I doing wrong? :( – Muhammad Ali Sep 12 '11 at 16:56
  • It can't work for the same reason the OP cannot make it work. The events can only be raised when the main thread is pumping a message loop. Sleep() does not pump a message loop. – Hans Passant Sep 12 '11 at 16:57
  • I didn't get you, Hans. What are you saying? – Muhammad Ali Sep 12 '11 at 17:02
  • That's right. I made the assumption that this was a WinForms app with a message pump etc. The class I provided will not work in a console app. What is your app's architecture, Muhammad? – Paul Sasik Sep 12 '11 at 17:05
  • It's a simple utility that reads web links (6000) from a text file and, using WebBrowser, goes to every link (web page) and clicks a delete link on that page. My utility includes a BackgroundWorker so that the main form doesn't hang; the rest I think I have already told you. – Muhammad Ali Sep 12 '11 at 17:10
  • If there is anything else, let me know. I just want to iterate over every page and click a certain link programmatically. This is my only requirement. – Muhammad Ali Sep 12 '11 at 17:11
  • The very basic problem is that every 100 ms you're creating a new thread that creates a very large and complex object, which could run a potentially lengthy process, and you're not giving it much time to work. You need to control this process better. Slow it down. – Paul Sasik Sep 12 '11 at 17:15
  • I have increased the time to 5 sec, but I still can't debug the DocumentCompleted event. – Muhammad Ali Sep 12 '11 at 17:18
  • Moreover, it works perfectly for a single link with no loop, but in a loop it behaves strangely, as I described earlier. I don't know why this is happening :( Any solution? – Muhammad Ali Sep 12 '11 at 17:24
  • You're not going to fix this just by increasing the sleep timer amount. Please take a close look at the class I posted in my answer and what is different about it, such as a List<> of web browsers, no extra threads, and the throttling every 10 iterations. Just make sure to run it from a forms project, and of course you have to change the contents of the links list. – Paul Sasik Sep 12 '11 at 17:54
0

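You can access the WebBrowser document content dynamically using the following (you are missing body, and you need to type the document as dynamic). Note that in WinForms the underlying COM document is exposed through Document.DomDocument; the managed HtmlDocument wrapper itself has no lowercase body member.

// WinForms: DomDocument is the raw COM document, which dynamic can bind against.
dynamic doc = browser.Document.DomDocument;
var button = doc.body.document.getElementById("button");
button.click();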
TheCodeKing
  • This will simplify the code, and I suggest that the OP use the snippet, but there are other critical issues to resolve before the handler code ever gets called. – Paul Sasik Sep 12 '11 at 12:30
0

I found the solution the very next day; sorry for the late post. I fixed it by processing the threads one by one, putting the following statement after Thread.Sleep(): if (th.ThreadState == ThreadState.Aborted || th.ThreadState == ThreadState.Stopped)
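
For anyone landing here later, the message-loop point raised in the comments can be sketched roughly as follows. This is only an illustration under some assumptions: StaBrowserRunner and VisitAndWait are hypothetical names, the onCompleted callback is expected to return true once it is done with the page (for example, false right after clicking a link that navigates, true on the following DocumentCompleted), and there is no timeout, so a page that never finishes loading would block forever.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical helper: loads one URL on a dedicated STA thread that pumps messages.
static class StaBrowserRunner
{
    public static void VisitAndWait(string url, Func<WebBrowser, bool> onCompleted)
    {
        var thread = new Thread(() =>
        {
            using (var browser = new WebBrowser())
            {
                browser.DocumentCompleted += (sender, e) =>
                {
                    // Ignore completions raised for frames inside the page.
                    if (e.Url != browser.Url) return;

                    // The callback decides whether more page loads are expected.
                    if (onCompleted(browser))
                    {
                        Application.ExitThread(); // ends Application.Run() below
                    }
                };

                browser.Navigate(url);

                // Unlike Thread.Sleep, this pumps messages, so DocumentCompleted can fire.
                Application.Run();
            }
        });

        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join(); // process one page at a time
    }
}
```

Because Join blocks the caller, this would be called in a loop from a worker thread such as a BackgroundWorker, not from the UI thread.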