I am rewriting an old app and I am trying to use async to speed it up.

The old code was doing something like this:

var value1 = getValue("key1");
var value2 = getValue("key2");
var value3 = getValue("key3");

where the getValue function was managing its own cache in a dictionary, doing something like this:

object getValue(string key) {
  if (cache.ContainsKey(key)) return cache[key];
  var value = callSomeHttpEndPointsAndCalculateTheValue(key);
  cache.Add(key, value);
  return value;
}

If I make getValue async and await every call to it, everything works well. But it is no faster than the old version, because the calls still run one after another, as they used to.

If I remove the await (well, if I postpone it, but that's not the focus of this question), I finally get the slow stuff to run in parallel. But if a second call to getValue("key1") starts before the first call has finished, I end up executing the same slow call twice, and everything is slower than the old version because it doesn't take advantage of the cache.

Is there something like await("key1") that will only await if a previous call with "key1" is still awaiting?

EDIT (follow-up to a comment)

By "speed it up" I mean more responsive.

For example, when the user selects a material in a drop down, I want to update the list of available thicknesses or colors in other drop downs and other material properties in other UI elements. Sometimes this triggers a cascade of events that requires the same getValue("key") to be used more than once.

For example when the material is changed, a few functions may be called: updateThicknesses(), updateHoleOffsets(), updateMaxWindLoad(), updateMaxHoleDistances(), etc. Each function reads the values from the UI elements and decides whether to do its own slow calculations independently from the other functions. Each function can require a few http calls to calculate some parameters, and some of those parameters may be required by several functions.

The old implementation was calling the functions in sequence, so the second function would take advantage of some values cached while processing the first one. The user would see each section of the interface updating in sequence over 5-6 seconds the first time and very quickly the following times, unless the new value required some new http endpoint calls.

The new async implementation calls all the functions at the same time, so every function ends up calling the same http endpoints because their results are not yet cached.

stenci
  • "*I am trying to use async to speed it up*" Normally async/await is used to make an application more responsive or scalable, not faster. Can you show how you intend to use the async version of the `getValue` method (`getValueAsync`?), in order to speed up the application? – Theodor Zoulias Jul 01 '21 at 16:12
  • Download the free small book [Task-based Asynchronous Pattern](https://www.microsoft.com/en-us/download/details.aspx?id=19957) by Stephen Toub. There is an AsyncCache implementation there. – Alexander Petrov Jul 01 '21 at 16:29
  • You might be interested in something like [AsyncLazy](https://devblogs.microsoft.com/pfxteam/asynclazyt/). – John Wu Jul 01 '21 at 16:33
  • A simple method is to cache the task objects instead of the values from them. – Lasse V. Karlsen Jul 01 '21 at 16:41
  • You could use a `ConcurrentDictionary` with a `GetOrAddAsync` implementation that accepts asynchronous delegates (`Func<TKey, Task<TValue>>`). There are some implementations of this method [here](https://stackoverflow.com/questions/54117652/concurrentdictionary-getoradd-async "ConcurrentDictionary GetOrAdd async"). – Theodor Zoulias Jul 01 '21 at 16:47

1 Answer

A simple method is to cache the tasks instead of the values. This way you can await both a pending task and an already completed task to get the value.

If several parallel callers all try to get a value using the same key, only the first one starts the task. The others await that same task.

Here's a simple implementation:

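// Caches the task, not its result, so callers can await pending and completed lookups alike.
private readonly Dictionary<string, Task<object>> cache = new();

public Task<object> getValueAsync(string key)
{
    // The lock makes the check-then-add step atomic; nothing is awaited while it is held.
    lock (cache)
    {
        // Only the first caller for a given key starts the slow call.
        if (!cache.TryGetValue(key, out var result))
            cache[key] = result = callSomeHttpEndPointsAndCalculateTheValueAsync(key);

        return result;
    }
}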
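
To see the effect, here is a minimal, self-contained sketch of the same idea. The names `ValueCache`, `SlowCalculationAsync` and `SlowCalls` are made up for this illustration, and `Task.Delay` stands in for the real http calls, so treat it as a demonstration under those assumptions rather than production code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class ValueCache
{
    private readonly Dictionary<string, Task<object>> cache = new();
    private int slowCalls; // how many times the slow work was actually started

    public int SlowCalls => slowCalls;

    public Task<object> GetValueAsync(string key)
    {
        lock (cache)
        {
            if (!cache.TryGetValue(key, out var result))
                cache[key] = result = SlowCalculationAsync(key);

            return result;
        }
    }

    // Hypothetical stand-in for callSomeHttpEndPointsAndCalculateTheValueAsync.
    private async Task<object> SlowCalculationAsync(string key)
    {
        Interlocked.Increment(ref slowCalls);
        await Task.Delay(200); // simulated network latency
        return "value for " + key;
    }
}

public static class Program
{
    public static async Task Main()
    {
        var values = new ValueCache();

        // Start three lookups without awaiting them; two share a key.
        Task<object> a = values.GetValueAsync("key1");
        Task<object> b = values.GetValueAsync("key2");
        Task<object> c = values.GetValueAsync("key1");

        await Task.WhenAll(a, b, c);

        Console.WriteLine(ReferenceEquals(a, c)); // True: both callers got the same cached task
        Console.WriteLine(values.SlowCalls);      // 2: one slow call per distinct key
    }
}
```

Note that an async method runs synchronously up to its first await, so that part executes while the lock is held. That is harmless when the method reaches its first await quickly, as it does here.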

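Judging by the comments, the following example should probably not be used: GetOrAdd does not guarantee that the delegate is called only once per key, so concurrent callers can still start the same slow call twice.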

Since ConcurrentDictionary has been mentioned, here's a version using that instead:

private ConcurrentDictionary<string, Task<object>> cache = new();
public Task<object> getValueAsync(string key)
{
    return cache.GetOrAdd(key, k => callSomeHttpEndPointsAndCalculateTheValueAsync(k));
}

The method seems simpler, and that alone might be grounds for switching to it, but in my experience ConcurrentDictionary and the other ConcurrentXXX collections seem to have their niche uses and seem somewhat more heavy-handed, and thus slower, for the basic stuff.
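
If ConcurrentDictionary is still preferred, one commonly used workaround for the fact that GetOrAdd may run its delegate more than once is to store a `Lazy<Task<object>>` instead of the task itself. The sketch below assumes the default `Lazy<T>` thread-safety mode (`ExecutionAndPublication`); `LazyValueCache`, `SlowCalculationAsync` and `SlowCalls` are hypothetical names, and `Task.Delay` again stands in for the http calls.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class LazyValueCache
{
    private readonly ConcurrentDictionary<string, Lazy<Task<object>>> cache = new();
    private int slowCalls; // how many times the slow work was actually started

    public int SlowCalls => slowCalls;

    public Task<object> GetValueAsync(string key)
    {
        // Under contention GetOrAdd may construct several Lazy wrappers, but it stores
        // and returns only one of them. Constructing a Lazy does not run its factory;
        // only reading Value does, and Lazy runs the factory at most once.
        return cache.GetOrAdd(
            key,
            k => new Lazy<Task<object>>(() => SlowCalculationAsync(k))).Value;
    }

    // Hypothetical stand-in for callSomeHttpEndPointsAndCalculateTheValueAsync.
    private async Task<object> SlowCalculationAsync(string key)
    {
        Interlocked.Increment(ref slowCalls);
        await Task.Delay(200); // simulated network latency
        return "value for " + key;
    }
}

public static class Program
{
    public static async Task Main()
    {
        var values = new LazyValueCache();

        // Request the same key from 20 thread-pool work items at once.
        var lookups = Enumerable.Range(0, 20)
            .Select(_ => Task.Run(() => values.GetValueAsync("key1")))
            .ToArray();

        await Task.WhenAll(lookups);

        Console.WriteLine(values.SlowCalls); // 1: the slow call started once
    }
}
```

Whether this is faster or slower than the plain lock version is something to measure in the actual application.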

Lasse V. Karlsen
  • The cache dictionary doesn't need to be a ConcurrentDictionary, because it is only used by this function and there is no risk that it is accessed/modified unexpectedly, right? – stenci Jul 01 '21 at 16:51
  • No. I mean, it wouldn't hurt to switch to a ConcurrentDictionary instead, but all the performance measurements I've tried seem to indicate that for simple stuff the basic lock approach performs better. If you'd like, I can post a version using ConcurrentDictionary instead and you can try them both. – Lasse V. Karlsen Jul 01 '21 at 16:52
  • I'm new to C# and the way I understand it, if the `cache` variable is only used by one function and that function has no `await`, then it can be considered thread safe. Is this assumption wrong? (Should this be another post?) – stenci Jul 01 '21 at 16:55
  • @stenci That it's only called from one method isn't an issue, what matters is whether it's ever called from multiple threads at the same time. The one calling method may or may not be called from multiple threads at the same time. – Servy Jul 01 '21 at 17:15
  • GetOrAdd doesn't guarantee that the delegate is only called once per key, and that's precisely the property the question is asking this cache to implement. – Servy Jul 01 '21 at 17:16