0

I'd like to spawn some threads and in each thread sequentially make calls to an API and aggregate the results (some sort of stress testing). Here is my attempt:

private async Task DoWork()
{
    var allResponses = new List<int>();
    for (int i = 0; i < 10; i++)
    {
        await Task.Run(() =>
        {
            var responses = Enumerable.Range(0, 50).Select(i => CallApiAndGetStatusCode());
            allResponses.AddRange(responses);
        });
    }

    // do more work
}


private int CallApiAndGetStatusCode()
{
    try
    {
        var request = new HttpRequestMessage(httpMethod.Get, "some url");

        var responseResult = httpClient.SendAsync(request).Result;
        return (int)responseResult.StatusCode;
    }
    catch (Exception e)
    {
        logger.LogError(e, "Calling API failed");
    }
}

However, this code always ends up the catch with the inner exception being {"A task was canceled."}. What am I doing wrong here?

havij
  • 1,030
  • 14
  • 29
  • 5
    I can't explain your concrete exception, but here are two observations: (1) your code is not thread safe (you should not `.AddRange` from potentially different threads) and (2) since you already use async/await, then why `CallApiAndGetStatusCode` is not an async method? In fact you already use `SendAsync` inside, just await it. Accessing `.Result` instead can deadlock, since you are already on the threadpool (maybe that's why it gets cancelled). – freakish Feb 04 '23 at 07:28
  • 1
    @freakish AFAICS the `allResponses.AddRange` is called in a sequential fashion. I can't see any thread-safety issues. The code is a bit hairy and could certainly be improved though. – Theodor Zoulias Feb 04 '23 at 08:58
  • 1
    @havij you need to refactor your code first. Instead of wrapping your calls in `Task.Run()` your `CallApiAndGetStatusCode()` should return `async Task` and call HttpClient with `await`, instead of using `.Result` – OlegI Feb 04 '23 at 09:11
  • @TheodorZoulias The documentation states `Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe. `. This means that `AddRange` is not thread safe. You need a `ConcurrentBag` for that. I wouldn't use it within an await just to be sure. You are probably right and nothing will happen, but still –  Feb 04 '23 at 09:37
  • 1
    @TheodorZoulias even though it is called in sequential fashion, it is called from different threads. In particular, I think it is possible for CPU cores to cache different pieces of data (for example before and after internal resize). As a rule of thumb: always use thread safe structures when sharing between threads, regardless of access patterns. – freakish Feb 04 '23 at 15:28
  • @freakish check out [this](https://stackoverflow.com/questions/28871878/are-you-there-asynchronously-written-value/28872058#28872058 "Are you there, asynchronously written value?") question. Quote: *"the TPL includes the appropriate barriers when tasks are queued, and at the beginning/end of task execution"*. If that wasn't the case, nothing would work correctly in async/await. – Theodor Zoulias Feb 04 '23 at 15:54
  • @TheodorZoulias do memory barriers imply cache consistency? Or thread safety in general? Aren't memory barriers only about instructions reordering? I fail to see how this is related. Async/await still suffers from most (if not all) thread safety issues, I'm not sure what you mean by "nothing would work in async/await". – freakish Feb 04 '23 at 16:24
  • @freakish yes, the memory barriers imply cache consistency. If that wasn't the case, nothing would work correctly in multithreading in general. The `lock` ensures consistent memory access by adding barriers. There is no other magic going on. Take a look at this: [Memory barrier generators](https://stackoverflow.com/questions/6581848/memory-barrier-generators). Or [this](https://stackoverflow.com/questions/22457501/thread-memorybarrier-and-lock-difference-for-a-simple-property). – Theodor Zoulias Feb 04 '23 at 16:32
  • @freakish regarding the memory-related mechanics, I would suggest to read posts written by Peter Cordes who is a hardware expert. Like [this post](https://stackoverflow.com/questions/66488734/should-interlocked-compareexchange-also-a-volatile-variable/66490395#66490395), which is super informational and deep. (probably a bit *too* informational!) – Theodor Zoulias Feb 04 '23 at 16:40
  • @TheodorZoulias i don't know why you keep claiming that memory barriers are necessary for things to work. There are multiple other thread primitives out there. Also none of the links answers my specific question (about cache-memory consistency). After googling a bit, I think that whether a memory barrier enforces cache consistency is actually a hardware detail (it does seem to be true for x64). I don't know whether C# standard guarantees that. Either way, there are too many unknowns here for me to risk it. – freakish Feb 04 '23 at 17:06
  • In particular I don't see a reason for a separate "make caches consistent" instruction on a CPU to exist and be used by atomics, locks, etc. but not necessarily by MemoryBarrier() call. The question is really about guarantees. – freakish Feb 04 '23 at 17:16
  • @freakish I've read a large part of the .NET source code in the `System.Threading`, `System.Collections.Concurrent` and related namespaces. That's where my confidence comes from, that if memory barriers weren't enough for the threads to have a consistent view of the memory, nothing would work. I understand perfectly well your skepticism, because I have passed from exactly this stage a few years earlier when I was learning multithreading. But seeing with my own eyes how some of the best software engineers in the world write code, helped immensely at evaporating these fears and uncertainties. – Theodor Zoulias Feb 04 '23 at 17:25
  • 2
    No, allResponses.AddRange() is not thread-safe. But not likely the cause of the exception, threading race bugs are far more random. Use the debugger to find out where the exception is thrown. Debug > Windows > Exception Settings, tick the checkbox for CLR Exceptions. – Hans Passant Feb 04 '23 at 17:31
  • @freakish you might be interested in a question that I posted on GitHub last year: [Confusion regarding Volatile.Read guarantees, and example in the volatile docs](https://github.com/dotnet/runtime/issues/67330). – Theodor Zoulias Feb 04 '23 at 17:31
  • 1
    @TheodorZoulias yeah, well, that's not an easy topic. Thanks, I will definitely read everything you've linked. Although I will likely remain skeptical. :) Still, it doesn't really matter for me, I would just use thread safe list an be done with. – freakish Feb 04 '23 at 17:40

2 Answers2

2

There is no benefit to using either Enumerable.Range or .AddRange in your example, since you do not need the seeded number. Your code must be converted to async/await to avoid deadlocks and in doing so, you can simply loop inside of each task and avoid any odd interactions between Enumerable.Select and await:

private async Task DoWork()
{
    var allTasks = new List<Task>(10);
    var allResponses = new List<int>();
    
    for (int i = 0; i < 10; i++)
    {
        allTasks.Add(Task.Run(async () =>
        {
            var tempResults = new List<int>();
            
            for (int i = 0; i < 50; i++)
            {
                var result = await CallApiAndGetStatusCode();
                if (result > 0) tempResults.Add(result);
            }
            
            if (tempResults.Count > 0)
            {
                lock (allResponses)
                {
                    allResponses.AddRange(tempResults);
                }
            }
        }));        
    }

    await Task.WhenAll(allTasks);

    // do more work
}


private async Task<int> CallApiAndGetStatusCode()
{
    try
    {
        var request = new HttpRequestMessage(HttpMethod.Get, "some url");

        var responseResult = await httpClient.SendAsync(request);
        return (int)responseResult.StatusCode;
    }
    catch (Exception e)
    {
        logger.LogError(e, "Calling API failed");
    }
    
    return -1;
}

Note that this code is overly protective, locking the overall batch before adding the temp results.

David L
  • 32,885
  • 8
  • 62
  • 93
  • The question asks: *"sequentially make calls to an API and aggregate the results"*. I think that your answer makes use of concurrency. Am I wrong? – Theodor Zoulias Feb 04 '23 at 15:57
  • @TheodorZoulias the OP specifically states “spawn some threads _and in each thread sequentially_ make calls.” I view the sequential statement as per thread, not overall. – David L Feb 04 '23 at 18:20
  • Hmm yes, you are probably right. Your answer seems to have the behavior that the OP wants, contrary to the behavior of their actual code. – Theodor Zoulias Feb 04 '23 at 18:25
  • @DavidL thanks this works mostly, but in some runs, tasks again start getting canceled with message `The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.`. Does this have anything to do with how HttpClient is used? – havij Feb 06 '23 at 19:48
  • This is run in an Azure Function, `httpClientFactory` is injected in the ctor of the class containing this code and I get a HttpClient out of it with `CreateClient()`. The class itself is registered with scoped lifetime. – havij Feb 06 '23 at 19:48
  • @havij It is more likely that you’re overwhelming your target service. – David L Feb 06 '23 at 20:31
0

I changed your code to this and work

async Task DoWork()
{
    var allResponses = new List<int>();
    for (int i = 0; i < 10; i++)
    {
        await Task.Run(() =>
        {
            var responses = Enumerable.Range(0, 3).Select(i => CallApiAndGetStatusCodeAsync());
            allResponses.AddRange(responses.Select(x => x.Result));
        });
    }

    // do more work
}

async Task<int> CallApiAndGetStatusCodeAsync()
{
    try
    {
        var request = new HttpRequestMessage(HttpMethod.Get, "http://www.google.com");
        var responseResult = await httpClient.SendAsync(request);
        return (int)responseResult.StatusCode;
    }
    catch (Exception e)
    {
        logger.LogError(e, "Calling API failed");
        return -1;
    }
}
thisisnabi
  • 1,045
  • 9
  • 21
  • Your code does not change much, you still use thread-blocking calls here `responses.Select(x => x.Result)` – OlegI Feb 04 '23 at 09:23
  • Probably because of the timeout that happens on the request. In any case, when you have a async request, it is better if the function is also of async type – thisisnabi Feb 04 '23 at 09:34