First of all, I have read the similar questions, but none of them gives me a coherent explanation. I use a BlockingCollection<WebClient> ClientQueue
to provide WebClient instances. I attach a handler function to each one and start the async scraping:
// Create queue of WebClient instances
BlockingCollection<WebClient> ClientQueue = new BlockingCollection<WebClient>();
for (int i = 0; i < 10; i++)
{
    ClientQueue.Add(new WebClient());
}

// Trigger the async calls
foreach (var item in source)
{
    var worker = ClientQueue.Take();
    worker.DownloadStringCompleted += (sender, e) => HandleJson(sender, e, ClientQueue, item);
    worker.DownloadStringAsync(uri);
}

public static void HandleJson(object sender, EventArgs e, BlockingCollection<WebClient> ClientQueue, string item)
{
    var res = (DownloadStringCompletedEventArgs) e;
    var jsonData = res.Result;
    var worker = (WebClient) sender;
    var root = JsonConvert.DeserializeObject<RootObject>(jsonData);
    // Record the data
    while (worker.IsBusy) Thread.Sleep(5); // wait for the WebClient to be free
    ClientQueue.Add(worker);
}
I get this error message:
WebClient does not support concurrent I/O operations.
Other threads:
Here the answer suggests that the fix is to wait until WebClient.IsBusy is false, but I am already doing this before putting the WebClient back in the queue. I don't understand why the client cannot perform a new request after its IsBusy has become false: https://stackoverflow.com/a/9765812/7111121
Here it suggests recycling WebClient instances to optimize the process: https://stackoverflow.com/a/7474959/2132352
Here it suggests instantiating a new WebClient (an easy solution, of course, but I don't want something that hides how these objects actually work). It also suggests cancelling the operation, but this has not helped.
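One thing that may be worth ruling out is handler accumulation. In the loop above, every reuse of a WebClient adds another DownloadStringCompleted handler with += and none is ever removed, so a recycled client could run HandleJson several times per download and be added to the queue more than once. Two Take() calls could then return the same instance. The sketch below only demonstrates the event behaviour, with no network I/O. FakeClient, Completed, Complete and CallsOnLastCompletion are hypothetical names made up for illustration, not part of WebClient.

```csharp
using System;

// Hypothetical stand-in for WebClient: a single event, similar in spirit
// to DownloadStringCompleted. Not a real framework type.
class FakeClient
{
    public event EventHandler Completed;

    // Simulates the end of an async operation by raising the event.
    public void Complete() => Completed?.Invoke(this, EventArgs.Empty);
}

static class Demo
{
    // Reuses one client for 'rounds' operations, subscribing a new handler
    // before each one, and returns how many handlers ran on the last completion.
    public static int CallsOnLastCompletion(int rounds, bool unsubscribe)
    {
        var client = new FakeClient();
        int calls = 0;
        for (int i = 0; i < rounds; i++)
        {
            calls = 0;
            EventHandler handler = null;
            handler = (s, e) =>
            {
                calls++;
                // Optionally detach this handler so it cannot fire again on reuse.
                if (unsubscribe) ((FakeClient)s).Completed -= handler;
            };
            client.Completed += handler;
            client.Complete();
        }
        return calls;
    }

    static void Main()
    {
        Console.WriteLine(CallsOnLastCompletion(3, false)); // prints 3
        Console.WriteLine(CallsOnLastCompletion(3, true));  // prints 1
    }
}
```

If the same thing happens with the real WebClient, each extra handler invocation would call ClientQueue.Add(worker) again, which would be consistent with the concurrent I/O error even though IsBusy is checked first. Detaching the handler inside HandleJson, or subscribing once when the clients are created, would be one way to test that assumption.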