
I need to write a program that sends around 200 search queries in parallel to Bing.com as efficiently as possible. How should I best implement it, considering thread blocking and server errors?

For example: send the search http://www.bing.com/search?q=.net, but the keyword is different for each search.

Update

Currently, I use the HttpClient and Task classes to send a request, wait/block, and get the result. I am wondering if your solution will be better.
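For reference, a minimal sketch of that kind of HttpClient/Task approach, with a SemaphoreSlim added as a throttle so all 200 requests are not in flight at once; the keyword list, the limit of 10 concurrent requests, and the server-error handling are assumptions, not the actual code in question:

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class BingQueries
{
    private static readonly HttpClient Client = new HttpClient();

    static void Main()
    {
        RunAsync().GetAwaiter().GetResult();
    }

    static async Task RunAsync()
    {
        var keywords = Enumerable.Range(0, 200).Select(i => "keyword" + i).ToList(); // placeholder keywords
        var throttle = new SemaphoreSlim(10); // assumption: at most 10 requests in flight

        var tasks = keywords.Select(async keyword =>
        {
            await throttle.WaitAsync();
            try
            {
                string url = "http://www.bing.com/search?q=" + Uri.EscapeDataString(keyword);
                using (HttpResponseMessage response = await Client.GetAsync(url))
                {
                    // Non-success (e.g. 5xx) is treated as a failed query here;
                    // real code would retry with a back-off instead.
                    if (!response.IsSuccessStatusCode)
                        return null;

                    return await response.Content.ReadAsStringAsync();
                }
            }
            finally
            {
                throttle.Release();
            }
        }).ToList();

        // Await all downloads without blocking a thread while requests are pending.
        string[] pages = await Task.WhenAll(tasks);
        Console.WriteLine("{0} of {1} queries returned a page.", pages.Count(p => p != null), pages.Length);
    }
}
```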

Any ideas or links would be very much appreciated!

Update

As suggested, I should use the official API with an API key.

Pingpong
  • What are you trying to do with the search? This is probably pretty easily accomplished with the [Task Parallel Library](http://msdn.microsoft.com/en-us/library/dd460717.aspx) – Prescott Mar 09 '13 at 22:46
  • Yes, I am using the Task class. I need to download all the results. – Pingpong Mar 09 '13 at 22:48

1 Answer


Search engines don't like being scraped for content. It's against their TOS, and they aggressively block it.

Unless you have an agreement (and thus an API key) that allows this, it's going to be hard.

The code would simply be async web requests, or (easier, but perhaps less efficient) sync web requests run in parallel.

However, you would need access to a considerable number of proxies to avoid the inevitable IP ban. I would not suggest you try to do this.
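To make the simpler option concrete, here is a minimal sketch of the "sync web requests in parallel" variant, assuming WebClient inside Parallel.ForEach with MaxDegreeOfParallelism as the throttle; the keyword list, the limit of 10, and the error handling are placeholders:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

class ParallelScrape
{
    static void Main()
    {
        var keywords = Enumerable.Range(0, 200).Select(i => "keyword" + i).ToList(); // placeholder keywords
        var results = new ConcurrentDictionary<string, string>();

        // .NET defaults to 2 concurrent connections per host; raise it to match the loop.
        ServicePointManager.DefaultConnectionLimit = 10;

        var options = new ParallelOptions { MaxDegreeOfParallelism = 10 }; // throttle the loop

        Parallel.ForEach(keywords, options, keyword =>
        {
            try
            {
                using (var client = new WebClient())
                {
                    string url = "http://www.bing.com/search?q=" + Uri.EscapeDataString(keyword);
                    results[keyword] = client.DownloadString(url); // blocks a worker thread per request
                }
            }
            catch (WebException ex)
            {
                // Server errors, timeouts and blocks land here; real code would log and retry.
                Console.WriteLine("{0} failed: {1}", keyword, ex.Message);
            }
        });

        Console.WriteLine("Downloaded {0} of {1} result pages.", results.Count, keywords.Count);
    }
}
```

Each request here blocks a worker thread while it waits, which is why the async version scales better; for around 200 queries, though, the throttled parallel loop is usually simple enough.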

Steve
  • Thanks! I will change to use an API key. In terms of programming logic, please see my update. – Pingpong Mar 09 '13 at 22:44
  • As mentioned, the simplest option would be web requests in a parallel loop; see here for an example of mine (VB.NET, but very easy to translate to C#): http://stackoverflow.com/questions/13842201/how-to-throttle-concurrent-async-webrequests – Steve Mar 09 '13 at 22:52