
.NET makes it easy to write asynchronous and parallel code, and there are some good articles about it.

But which approach is best, and in which context? I created a simple console application to try to figure that out. It looks like async/await is faster, but it consumes more memory. These were the average results of repeating the test three times on the same machine (CPU: AMD FX-8120 eight-core processor @ 3.1 GHz, RAM: 16 GB):


Asynchronous: Elapsed Time (s) = 23.0841498667 | Memory (MB) = 62154


Parallel: Elapsed Time (s) = 107.9682892667 | Memory (MB) = 27828


The code is on GitHub and is very simple.

The code requests web pages 250 times.

The asynchronous test is this:

    public static async Task AsynchronousTest()
    {
        List<Task<string>> taskList = new List<Task<string>>();

        for (int i = 0; i < TOTAL_REQUEST; i++)
        {
            Task<string> taskGetHtmlAsync = GetHtmlAsync(i);
            taskList.Add(taskGetHtmlAsync);
        }

        await Task.WhenAll(taskList);

        //Trying to free memory
        taskList.ForEach(t => t.Dispose());
        taskList.Clear();
        taskList = null;
        GC.Collect();
    }

    public static async Task<string> GetHtmlAsync(int i)
    {
        string url = GetUrl(i);

        using (HttpClient client = new HttpClient())
        {
            string html;

            html = await client.GetStringAsync(url);

            Trace.WriteLine(string.Format("{0} - OK (Html Length {1})", i + 1, html.Length));
            return html;
        }
    }

And the parallel test is this:

    public static void ParallelTest()
    {
        Parallel.For(0, TOTAL_REQUEST, i =>
        {
            string html = GetHtml(i);
        });
    }

    public static string GetHtml(int i)
    {
        string html = null;

        string url = GetUrl(i);

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            using (Stream receiveStream = response.GetResponseStream())
            {
                using (StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8))
                {
                    html = readStream.ReadToEnd();
                }
            }
        }

        Trace.WriteLine(string.Format("{0} - OK (Html Length {1})", i + 1, html.Length));

        return html;
    }

So, is there any way to improve the memory performance of the async/await method?

mqueirozcorreia
  • Your parallel test probably shouldn't be using async methods - that seems wrong. I would imagine a proper parallel test would be faster than async (if only slightly). The tasks test is using more memory because you're storing 250 tasks until they all complete, including the downloaded HTML. The parallel test grabs the HTML and then throws it away immediately, and will only spin up a few threads at a time (typically the number of cores on your computer). – Rob Jan 08 '16 at 01:27
  • The parallel test is really not using async; it is meant to simulate synchronous parallel programming versus asynchronous programming. – mqueirozcorreia Jan 08 '16 at 01:34
  • Your testing technique is plain wrong: you can't expect the server to allow you to connect 50 times (for example) in parallel. You should only test against local resources that can be handled by your computer alone, and even then it will still be wrong. You can't really say "parallel is faster" (or slower), because that depends entirely on each case. – Camilo Terevinto Jan 08 '16 at 01:36
  • It actually *is* using the `async` framework, though. You're spinning up tasks and blocking on them, needlessly. To properly compare threads vs `async`, you should not force the parallel test to use `async` (*even* if you immediately block on it). – Rob Jan 08 '16 at 01:36
  • The asynchronous test needs to store the tasks in a list in order to call `await Task.WhenAll(taskList);` and wait for all tasks to finish. Do you have a better idea? Anyway, I did a bunch of things to dispose all the tasks, clear the list, set the list to null and call GC.Collect (I tried really hard). But you made a good point; I'll inspect memory in the diagnostics. – mqueirozcorreia Jan 08 '16 at 01:37
  • @mqueirozcorreia It depends on what you want to do next. If you want to wait until all requests complete and *then* process them, then you will need to store all the results. If you want to process them as they're coming in (say, save them to a file), then you can do this, for example, `var html = await GetHtmlAsync(i); SaveDataToFile(html);`. You don't really need to store the tasks now, or use `WhenAll`. – Rob Jan 08 '16 at 01:40
  • @Rob, just to make myself clear, I have created a synchronous test too. The synchronous elapsed time is 5 minutes 29.6450018 s and the memory is 26936 MB, so async and parallel both work and are much faster than the synchronous version. Here is the code for the synchronous test: `public static void SynchronousTest() { for (int i = 0; i < TOTAL_REQUEST; i++) { string html = GetHtmlAsync(i).Result; } }` – mqueirozcorreia Jan 08 '16 at 01:49
  • @mqueirozcorreia I understand that, but my point is that even though it is *synchronous*, you are creating superfluous `Task`s which you are then blocking on. A *true* synchronous test would be using `WebRequest` (for example) instead of `HttpClient` so that you do *not* need to create a task which is immediately blocked. Since you are benchmarking performance *and* memory, you will get skewed results as you are leveraging the `async` framework, which has overhead. Your parallel/sync tests should *not* be calling `GetHtmlAsync(i)`, but `GetHtml(i)` which has a non-async implementation. – Rob Jan 08 '16 at 01:52
  • @Rob, I just compiled the code using "var html = await GetHtmlAsync(i);" and it works just like the synchronous. If you want to see, I can commit the source at github – mqueirozcorreia Jan 08 '16 at 01:53
  • @cFrozenDeath the idea behind the request is to simulate a resource that freezes the main thread. If you want, I can share a new version using `await Task.Delay(2000);`. – mqueirozcorreia Jan 08 '16 at 02:02
  • @Rob, as you suggested, I just created GetHtml(i) using WebRequest. I'll run the program three times and update the question. It took more time and consumed less memory. As soon as I update GitHub, I'll let you know. – mqueirozcorreia Jan 08 '16 at 02:14
  • @Rob, I updated the question and the code on GitHub. I appreciate your help because this is my first question here. Is there anything else you see that I need to do to ensure that my question is well asked and formed? – mqueirozcorreia Jan 08 '16 at 02:27
  • What happens if you wrap the tasks in using statements instead of explicitly disposing of them after? – D. Ben Knoble Jan 08 '16 at 02:49
  • You should take this advice from Eric Lippert: [If you have two horses and you want to know which of the two is the faster then race your horses](http://ericlippert.com/2012/12/17/performance-rant/). – Enigmativity Jan 13 '16 at 00:29
  • @Enigmativity, that is really nice advice. The main reason for this post is that today I have an app that uses parallel with a lot of web requests inside. I was really trying to find out which has better performance in my environment, and I found the HttpClient memory leak. Anyway, I hope the code helps somebody else too. – mqueirozcorreia Jan 13 '16 at 01:36

2 Answers


It's using more memory because you're keeping the tasks and results in memory, while in the parallel test, you're not.

To be more equivalent, the parallel test might look like this:

public static void ParallelTest()
{
    List<string> results = new List<string>();
    Parallel.For(0, TOTAL_REQUEST, i =>
    {
        string html = GetHtml(i);
        lock(results)
            results.Add(html);
    });
}

I would expect the parallel test to still consume less memory, but not by as much as you currently have.

If you want to download the results, and process them only after all requests have completed, then there isn't much you can do to speed it up.

However, if you're able to immediately process results, you're in a much better position. Take this for example:

public static async Task AsynchronousTest()
{
    for (int i = 0; i < TOTAL_REQUEST; i++)
    {
        var result = await GetHtmlAsync(i);
        File.WriteAllText($"SomeFile {i}.txt", result);
    }
}

Your memory consumption will be drastically reduced, as you're not keeping all the requests in memory any more; you're processing them and discarding them as you go.

Of course, the parallel equivalent might be:

public static void ParallelTest()
{
    Parallel.For(0, TOTAL_REQUEST, i =>
    {
        string html = GetHtml(i);
        File.WriteAllText($"SomeFile {i}.txt", html);
    });
}
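A middle ground (my sketch, not code from the post) keeps all requests running concurrently but still discards each result as soon as it arrives, by consuming the tasks with `Task.WhenAny` as they complete; `ProcessHtml` here is a hypothetical placeholder:

```csharp
public static async Task AsynchronousStreamingTest()
{
    var pending = new List<Task<string>>();

    for (int i = 0; i < TOTAL_REQUEST; i++)
        pending.Add(GetHtmlAsync(i));

    while (pending.Count > 0)
    {
        // Whichever request finishes first is handled and released immediately,
        // so completed HTML strings don't accumulate until the very end.
        Task<string> finished = await Task.WhenAny(pending);
        pending.Remove(finished);

        string html = await finished;
        ProcessHtml(html); // hypothetical: write to a file, parse, etc.
    }
}
```

(`Task.WhenAny` in a loop is O(n²) in the number of tasks, which is fine for 250 requests.)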

Onto the reported times you're seeing. I find it very surprising that async is much faster than a threaded approach. Turns out, it's because of caching. Let's turn that off:

GetHtmlAsync

using (HttpClient client = new HttpClient(new WebRequestHandler { CachePolicy = new HttpRequestCachePolicy(HttpRequestCacheLevel.BypassCache) }))

GetHtml

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.CachePolicy = new RequestCachePolicy(RequestCacheLevel.BypassCache);

Threading is 2-3x faster (for 10 items). That sounds a bit more reasonable. The thing is, async is typically used to run work in the background so as not to block the UI thread. Parallel.For, however, is smarter about the threads it spins up, typically using about the number of cores on your computer (give or take, depending on other factors).
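For reference (my addition, not part of the original answer), `Parallel.For` also lets you cap the worker count explicitly via `ParallelOptions`, which makes the comparison more controlled:

```csharp
// Limit the parallel test to one worker per core; this reuses the
// question's GetHtml and TOTAL_REQUEST.
Parallel.For(0, TOTAL_REQUEST,
    new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
    i =>
    {
        string html = GetHtml(i);
    });
```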

In conclusion:

Threading is probably the better approach here. It's the better tool for the job (where you know you're mainly limited by network I/O rather than CPU). You're seeing a higher memory count because of the async overhead, as well as because you're storing the results, as mentioned above.

Rob
  • Thanks for your answer. Anyway, the problem with the code `var result = await GetHtmlAsync(i);` is that it works differently from queuing the tasks in a list. When the `for` loop reaches the suggested line (with await), it stops until the `GetHtmlAsync` method completes. With the list of tasks, all the tasks are started by the main thread with no stop. Anyway, I have changed the code to turn caching off and I still got almost the same result: Async: Elapsed Time: 00:00:21.7974182, Memory: 51460 MB; Parallel: Elapsed Time: 00:02:19.9143067, Memory: 35272 MB. Code is updated on GitHub. – mqueirozcorreia Jan 08 '16 at 22:17
  • I agree with you that creating the list of tasks will consume more memory. But how do I free the memory after the tasks have finished their job? – mqueirozcorreia Jan 08 '16 at 22:19
  • 1
    I have been searching for the answers and I think I got it all! Thanks for you help and I have post the solution here if it may help you too. – mqueirozcorreia Jan 13 '16 at 00:22

I have found this link showing that a short-lived HttpClient object leaks memory, so that was the problem with creating an HttpClient inside a for loop.
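The usual fix, sketched here following that guidance rather than taken verbatim from the repository, is to share one `HttpClient` for all requests instead of constructing one per call (`HttpClient` is safe for concurrent `GetStringAsync` calls):

```csharp
// One shared instance for the whole test run instead of one per request.
private static readonly HttpClient client = new HttpClient();

public static async Task<string> GetHtmlAsync(int i)
{
    string url = GetUrl(i);
    string html = await client.GetStringAsync(url);
    Trace.WriteLine(string.Format("{0} - OK (Html Length {1})", i + 1, html.Length));
    return html;
}
```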

I have also improved the test to use a local ASP.NET 5 Web API async method to answer the requests (that way the test is not affected by external web servers for any reason):

[Route("api/[controller]")]
public class ValuesController : Controller
{
    [HttpGet]
    public async Task<IEnumerable<string>> Get()
    {
        await Task.Delay(1000);

        return new string[] { "value1", "value2" };
    }
}

I have run the test in several different contexts, and the results are in this table:

+---------------------------------------+------------+------------------+------------------+
|                                       | Requests   | 100              | 500              |
+---------------------------------------+------------+------------------+------------------+
| synchronous                           | Elapsed(s) | 101.5830302      | 507.6880514      |
|                                       | Memory(MB) | 15390.6666666667 | 18693.3333333333 |
|                                       | Threads    | 6                | 6.6666666667     |
+---------------------------------------+------------+------------------+------------------+
| asynchronous (WebRequest)             | Elapsed(s) | 9.0738118667     | 25.3127221       |
|                                       | Memory(MB) | 18530.6666666667 | 20276            |
|                                       | Threads    | 27               | 42.6666666667    |
+---------------------------------------+------------+------------------+------------------+
| asynchronous (HttpClient)             | Elapsed(s) | 1.2210176        | 1.6143776333     |
|                                       | Memory(MB) | 20880            | 30293.3333333333 |
|                                       | Threads    | 28               | 29               |
+---------------------------------------+------------+------------------+------------------+
| asynchronous (HttpClient Memory Leak) | Elapsed(s) | 1.4515700667     | 1.4114106        |
|                                       | Memory(MB) | 21334.6666666667 | 32070.6666666667 |
|                                       | Threads    | 28.3333333333    | 29               |
+---------------------------------------+------------+------------------+------------------+
| parallel (WebRequest)                 | Elapsed(s) | 12.1617405333    | 31.6620354333    |
|                                       | Memory(MB) | 18441.3333333333 | 20004            |
|                                       | Threads    | 25               | 37               |
+---------------------------------------+------------+------------------+------------------+
| parallel (HttpClient)                 | Elapsed(s) | 15.9617157667    | 31.8756134       |
|                                       | Memory(MB) | 19612            | 14834.6666666667 |
|                                       | Threads    | 36.3333333333    | 41.6666666667    |
+---------------------------------------+------------+------------------+------------------+

The fastest result was asynchronous with HttpClient. The best balance of speed and memory was parallel with HttpClient.
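Putting the pieces together, a possible compromise (my sketch, with an arbitrary concurrency cap, not one of the measured variants above) is asynchronous requests over a shared `HttpClient`, throttled with a `SemaphoreSlim` so memory stays bounded:

```csharp
private static readonly HttpClient sharedClient = new HttpClient();
// 8 concurrent requests is an arbitrary starting point; tune per environment.
private static readonly SemaphoreSlim throttle = new SemaphoreSlim(8);

public static async Task<string> GetHtmlThrottledAsync(int i)
{
    await throttle.WaitAsync();
    try
    {
        return await sharedClient.GetStringAsync(GetUrl(i));
    }
    finally
    {
        throttle.Release();
    }
}
```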

mqueirozcorreia
  • Are you sure the requests are not being cached? The results show 1.45 seconds for 100 requests and 1.41 seconds for 500 requests (with HttpClient & async). Parallel should vastly out-perform async for 500 full requests (uncached), as async is still one at a time, but doesn't block the UI. Parallel could have up to roughly NumCores (for example, 8) running at a time. 16 seconds for 100 requests seems far more likely than 1.5 seconds for 100 requests. – Rob Jan 13 '16 at 12:52