0

I am working with HttpWebRequest and trying to search Google, get the results, and simulate a click on a desired link. Is that possible?

// Needs: System.IO, System.Net, System.Text, System.Web
string raw = "http://www.google.com/search?hl=en&q={0}&aq=f&oq=&aqi=n1g10";
string search = string.Format(raw, HttpUtility.UrlEncode(searchTerm));
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(search);
request.Proxy = prox;
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
    using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.ASCII))
    {
        // Read the stream once; a second ReadToEnd() would return an empty string.
        string html = reader.ReadToEnd();
        browserA = html;
        this.Invoke(new EventHandler(IE1));
    }
}
Afnan Bashir

2 Answers

1

You could parse the page with HTML Agility Pack (http://htmlagilitypack.codeplex.com/) or LINQ to HTML (http://www.justagile.com/linq-to-html.aspx), using regular expressions alongside these tools if needed, to find the elements you want to "click", and then issue a new HttpWebRequest for the URL each element points to. This is called web scraping (http://en.wikipedia.org/wiki/Web_scraping).
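As a rough illustration of that approach, here is a minimal sketch that pulls the links out of a downloaded page with HTML Agility Pack and then "clicks" one by requesting its URL. The class and method names (LinkFollower, ExtractLinks, Download) are made up for this example, and the XPath simply grabs every anchor with an href; the markup Google actually uses for result links is not covered here, so you would need to inspect the page and narrow the selection yourself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using HtmlAgilityPack; // third-party package, not part of the .NET Framework

// Hypothetical helper for illustration only.
static class LinkFollower
{
    // Returns the absolute http/https URLs of all <a href="..."> elements in the HTML.
    public static List<Uri> ExtractLinks(string html, Uri baseUri)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<Uri>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return links;

        foreach (HtmlNode anchor in anchors)
        {
            // Attribute values are returned as written in the markup, so decode
            // entities such as &amp; before treating the value as a URL.
            string href = WebUtility.HtmlDecode(anchor.GetAttributeValue("href", ""));

            // Resolve relative links against the page URL.
            Uri absolute;
            if (Uri.TryCreate(baseUri, href, out absolute) &&
                (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps))
            {
                links.Add(absolute);
            }
        }
        return links;
    }

    // "Clicking" a link over HTTP is just another GET request for its URL.
    public static string Download(Uri url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With the html string from the question's code, usage would look something like `var links = LinkFollower.ExtractLinks(html, new Uri(search));` followed by `LinkFollower.Download(link)` for whichever link you decide is the desired one. How you pick that link (by host name, by link text, by position) depends on what you are looking for.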

Also remember that the site you are scraping may ban your IP address if too many requests come from it. To avoid that, consider spreading the requests across a list of proxy servers.

angularrocks.com
  • I didn't mean to advise parsing a complete HTML page with Regex, but it is possible to use Regex, e.g. in conjunction with HtmlAgilityPack, if needed and depending on the situation. Anyway, I edited my answer for clarity. – angularrocks.com Jan 09 '11 at 15:59
1

A better option is to use one of Google's APIs.

There is a list of all of them here: Google APIs

Here is another on CodePlex: Google Dot Net

These services let applications use Google freely. Most of them provide WSDL files that you can add in Visual Studio via "Add Web Reference".

Regex and HTML Agility Pack should only be a last resort, for when a website does not expose public services (I had to use them recently for something I'm writing to integrate with uTorrent and BtJunkie). Google obviously wants people to build against its sites through these APIs instead.

jonathanpeppers