I need to use proxies to download a forum. The problem with my code is that it takes only 10% of my internet bandwidth. Also I have read that I need to use a single HttpClient
instance, but with multiple proxies I don't know how to do it. Changing MaxDegreeOfParallelism
doesn't change anything.
public static IAsyncEnumerable<IFetchResult> FetchInParallelAsync(
this IEnumerable<Url> urls, FetchContext context)
{
var fetchBlcock = new TransformBlock<Url, IFetchResult>(
transform: url => url.FetchAsync(context),
dataflowBlockOptions: new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 128
}
);
foreach(var url in urls)
fetchBlcock.Post(url);
fetchBlcock.Complete();
var result = fetchBlcock.ToAsyncEnumerable();
return result;
}
Every call to FetchAsync
will create or reuse a HttpClient
with a WebProxy
.
public static async Task<IFetchResult> FetchAsync(this Url url, FetchContext context)
{
var httpClient = context.ProxyPool.Rent();
var result = await url.FetchAsync(httpClient, context.Observer, context.Delay,
context.isReloadWithCookie);
context.ProxyPool.Return(httpClient);
return result;
}
public HttpClient Rent()
{
lock(_lockObject)
{
if (_uninitiliazedDatacenterProxiesAddresses.Count != 0)
{
var proxyAddress = _uninitiliazedDatacenterProxiesAddresses.Pop();
return proxyAddress.GetWebProxy(DataCenterProxiesCredentials).GetHttpClient();
}
return _proxiesQueue.Dequeue();
}
}
I am a novice at software developing, but the task of downloading using hundreds or thousands of proxies asynchronously looks like a trivial task that many should have been faced with and found a correct way to do it. So far I was unable to find any solutions to my problem on the internet. Any thoughts of how to achieve maximum download speed?