
I have a console application that makes multiple API requests over HTTPS.
When running on a single thread, it can make at most about 8 API requests per second.

The server receiving the API calls has plenty of free resources, so it should be able to handle many more than 8 per second.

Also, when I run multiple instances of the application, each instance can still make 8 requests per second.

I tried the following code to parallelize the requests, but it still appears to run synchronously:

var taskList = new List<Task<string>>();

for (int i = 0; i < 10000; i++)
{
    string threadNumber = i.ToString();
    Task<string> task = Task<string>.Factory.StartNew(() => apiRequest(requestData));
    taskList.Add(task);
}

foreach (var task in taskList)
{
    Console.WriteLine(task.Result);
}

What am I doing wrong here?

EDIT: My mistake was iterating over the tasks and reading task.Result, which blocked the main thread and made me think the requests were running synchronously.

Code I ended up using instead of foreach (var task in taskList):

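while (taskList.Count > 0)
{
    // Pass the pending tasks; blocks until at least one of them finishes
    Task.WaitAny(taskList.ToArray());
    // Gets tasks in RanToCompletion or Faulted state
    var finishedTasks = GetFinishedTasks(taskList);
    foreach (Task<string> finishedTask in finishedTasks)
    {
        // Note: Result rethrows (as AggregateException) if the task faulted
        Console.WriteLine(finishedTask.Result);
        taskList.Remove(finishedTask);
    }
}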
stkxchng
  • You should be using `Task.Run()` instead of `Task.Factory.StartNew()`, but other than that, there's nothing wrong with your approach. We don't know what's inside `apiRequest` though. – dcastro Aug 27 '15 at 14:42
  • But you do understand that the foreach is waiting for each task to finish before moving on to the next one, right? – rbm Aug 27 '15 at 14:50
  • Should he put a `Task.WaitAll` in there? I'm not sure you are exactly correct on the foreach; it would eventually wait for all tasks to complete, but those that have completed would return immediately. – theMayer Aug 27 '15 at 14:54
  • See [Task.Result](https://msdn.microsoft.com/en-us/library/dd321468%28v=vs.110%29.aspx) - it waits for the task to complete on the calling thread. But this is for one task, not all tasks. And if they were truly being executed in parallel, your method should still take less time. – theMayer Aug 27 '15 at 14:56
  • Per your update, you should use Task.WaitAll(taskList.ToArray()); instead of what you're doing with the finishedTasks collection and whatnot. – Steven Evers Aug 28 '15 at 18:18

2 Answers


There could be a couple of things going on. First, the .NET ServicePoint class allows a maximum of 2 concurrent connections per host by default (see ServicePointManager.DefaultConnectionLimit). See this Stack Overflow question/answer.

Second, your server might theoretically be able to handle more than 8/sec, but there could be resource constraints or other issues preventing that on the server side. I have run into issues with API calls which theoretically should be able to handle much more than they do, but for whatever reason were designed or implemented improperly.
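For the first point, a minimal sketch of raising the per-host connection limit, assuming a .NET Framework console application whose requests go through HttpWebRequest/ServicePoint. The value 20 and the class name are arbitrary choices for illustration, not something from the question.

```csharp
using System;
using System.Net;

class ConnectionLimitSketch
{
    // Raises the default per-host limit and returns the value now in effect.
    public static int RaiseLimit(int limit)
    {
        // Should run before the first request to a host: ServicePoint objects
        // that already exist keep the limit they were created with.
        ServicePointManager.DefaultConnectionLimit = limit;
        return ServicePointManager.DefaultConnectionLimit;
    }

    static void Main()
    {
        Console.WriteLine(RaiseLimit(20)); // 20 is an arbitrary example value
    }
}
```

Whether this helps depends on how apiRequest issues its calls, which the question does not show.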

theMayer

@theMayer is kinda-sorta correct. It's possible that your call to apiRequest is what's blocking and making the whole expression seem synchronous...

However... you're iterating over each task and calling task.Result, which will block until the task completes in order to print it to the screen. So, for example, all tasks except the first could be complete, but you won't print them until the first one completes, and you will continue printing them in order.
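To illustrate the difference, here is a rough sketch that collects results in the order the tasks finish rather than the order they were started. FakeApiRequest is a hypothetical stand-in for apiRequest that just sleeps, and the delays are made up; the actual completion order depends on thread pool scheduling.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class CompletionOrderSketch
{
    // Hypothetical stand-in for apiRequest: sleeps instead of calling an API.
    static string FakeApiRequest(int id, int delayMs)
    {
        Thread.Sleep(delayMs);
        return "request " + id + " done";
    }

    // Returns results in the order the tasks finished.
    public static List<string> CollectAsCompleted(int[] delaysMs)
    {
        var pending = new List<Task<string>>();
        for (int i = 0; i < delaysMs.Length; i++)
        {
            int id = i;              // copy so each lambda captures its own value
            int delay = delaysMs[i];
            pending.Add(Task.Run(() => FakeApiRequest(id, delay)));
        }

        var results = new List<string>();
        while (pending.Count > 0)
        {
            // Blocks until at least one pending task has finished.
            int index = Task.WaitAny(pending.ToArray());
            results.Add(pending[index].Result);
            pending.RemoveAt(index);
        }
        return results;
    }

    static void Main()
    {
        // The first task is the slowest here, so a plain foreach over .Result
        // would hold back every line until it finished.
        foreach (string line in CollectAsCompleted(new[] { 600, 200, 400 }))
        {
            Console.WriteLine(line);
        }
    }
}
```

Either way the total wall-clock time is about the same; only the moment each result becomes visible changes.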

On a slightly different note, you could rewrite this a little more succinctly like so:

var screenLock = new object();
var results = Enumerable.Range(1, 10000)
        .AsParallel()
        .Select(i => {
            // I wouldn't actually use this printing, but it should help you understand your example a bit better
            lock (screenLock) {
                Console.WriteLine("Task " + i);
            }
            return apiRequest(requestData);
        });

Without the printing, it looks like this:

var results = Enumerable.Range(1, 10000)
        .AsParallel()
        .Select(i => apiRequest(requestData));
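Two things to keep in mind with the PLINQ version: the query is deferred, so nothing runs until the results are enumerated, and by default PLINQ sizes its parallelism to the number of processor cores, which can be low for I/O-bound calls. A hedged sketch of both points follows; FakeApiRequest is a hypothetical stand-in for apiRequest, and the degree of 8 is an arbitrary example value.

```csharp
using System;
using System.Linq;
using System.Threading;

class PlinqSketch
{
    // Hypothetical stand-in for apiRequest: sleeps instead of calling an API.
    static string FakeApiRequest(int i)
    {
        Thread.Sleep(50);
        return "response " + i;
    }

    public static string[] RunAll(int count, int maxParallel)
    {
        return Enumerable.Range(1, count)
            .AsParallel()
            .WithDegreeOfParallelism(maxParallel) // upper bound on concurrent work
            .Select(i => FakeApiRequest(i))
            .ToArray();                           // forces the query to execute
    }

    static void Main()
    {
        string[] results = RunAll(40, 8);
        Console.WriteLine(results.Length);
    }
}
```

Without AsOrdered(), the results are not guaranteed to come back in input order.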
Steven Evers
  • I don't think this really addresses the issue. We don't have any details as to how OP is measuring the time to complete the tasks. Moreover, I believe there is an implicit assumption that the tasks take roughly equal time to complete, which would mean parallelization results in less time overall. Waiting for any incomplete tasks to be complete before continuing actually expresses the OP's intent, which is for *all* tasks to be completed in a specified timeframe. – theMayer Aug 27 '15 at 15:42
  • Thank you, marking as answer. The problem was in my iteration. See my edit for the code used. – stkxchng Aug 28 '15 at 08:19