
Let's say I have a scoped service called IPackageLoader with one simple method, GetAsync(string packageName), that retrieves a package via a single HTTP GET call.

Since it is an HTTP GET call, it is idempotent, so there's no point in making multiple requests when GetAsync(string packageName) is called multiple times with the same package name.

Example:

packageLoader.GetAsync("dotnet/aspnetcore")

packageLoader.GetAsync("dotnet/aspnetcore")

packageLoader.GetAsync("dotnet/efcore")

Without some locking mechanism here, the second call triggers a new request when it could simply wait for the first one, save the result, and return it. The third call, however, is for a different package, so it could be made without waiting.

I tried to use SemaphoreSlim but failed: I acquired the lock on the first request, and the second one waited for the first to complete, as desired. However, when I tried to get a different package, it also had to wait unnecessarily.
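
Roughly what I tried (reconstructed from memory; the HttpClient, the relative URL "packages/{packageName}", and the string return type are just illustrative):

public class PackageLoader
{
    private readonly HttpClient _httpClient;
    private readonly SemaphoreSlim _semaphore = new SemaphoreSlim(1, 1);
    private readonly Dictionary<string, string> _results = new();

    public PackageLoader(HttpClient httpClient) => _httpClient = httpClient;

    public async Task<string> GetAsync(string packageName)
    {
        // One semaphore for everything: a caller asking for a *different*
        // package still waits here until the current request finishes.
        await _semaphore.WaitAsync();
        try
        {
            if (_results.TryGetValue(packageName, out var cached))
                return cached;

            var result = await _httpClient.GetStringAsync($"packages/{packageName}");
            _results[packageName] = result;
            return result;
        }
        finally
        {
            _semaphore.Release();
        }
    }
}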

Any ideas on how to achieve this behavior? Maybe some class from the framework that allows me to lock based on a key, instead of just locking while any request is in flight?
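
In other words, something with roughly this shape (purely hypothetical; KeyedSemaphore is not an existing class, just the API I have in mind):

// Hypothetical: waits only if another caller currently holds the SAME key.
await keyedSemaphore.WaitAsync("dotnet/aspnetcore");
try
{
    // perform the HTTP GET for dotnet/aspnetcore
}
finally
{
    keyedSemaphore.Release("dotnet/aspnetcore");
}

// A concurrent call keyed on "dotnet/efcore" would not wait for the one above.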

Nelson Sousa
  • Instead of locking on it can you cache it? – ChiefTwoPencils Dec 15 '21 at 21:40
  • Something like this? [Caching in .NET](https://learn.microsoft.com/en-us/dotnet/core/extensions/caching) – Theodor Zoulias Dec 15 '21 at 21:43
  • A cache can help with this, but not completely. The second request will still be made while the first one is still resolving, so it is not a perfect solution. Say I create 100 tasks in parallel to get the same item: I'd end up with 100 requests, because the first one hasn't received the response and saved it to the cache yet (for example). – Nelson Sousa Dec 15 '21 at 21:44
  • Cache the requests (as well as the results). Then check the request cache to see whether a request has already been made. Also, GET being idempotent doesn't mean that the result will never change: a GET on this web page returns something different now that I've added this comment. – SBFrancies Dec 15 '21 at 21:45
  • Yes, but this is a scoped service. The main goal is to avoid the same requests while serving a single request – Nelson Sousa Dec 15 '21 at 21:46
  • These two questions might be relevant: [Async threadsafe Get from MemoryCache](https://stackoverflow.com/questions/31831860/async-threadsafe-get-from-memorycache) and [Stop Reentrancy on MemoryCache Calls](https://stackoverflow.com/questions/65640644/stop-reentrancy-on-memorycache-calls) – Theodor Zoulias Dec 15 '21 at 21:53
  • It may be possible to embed a `Dictionary` (or a concurrent equivalent) registered as a singleton in your IPackageLoader, so that when a package is first requested the dictionary is filled with the requested address and the associated Task, for example. The second request then waits for that task to complete instead of making the call. – Hazrelle Dec 15 '21 at 22:04
  • If you prefer to use a dictionary instead of a full-fledged `MemoryCache`, take a look at this: [ConcurrentDictionary GetOrAdd async](https://stackoverflow.com/questions/54117652/concurrentdictionary-getoradd-async) (a rough sketch of that pattern follows this comment thread) – Theodor Zoulias Dec 15 '21 at 22:11
  • You are overthinking this. Add caching and stop worrying about the same call being called twice simultaneously. In the real world, it will probably never happen unless you are Facebook. – Neil Dec 15 '21 at 22:38
  • "create 100 tasks in parallel" - this is not a real world situation. Even if these requests were separated by 100ms, then the 2nd request would be fulfilled by the cache. – Neil Dec 21 '21 at 14:48

2 Answers


You should cache the request (keyed by packageName) and the actual Task, not the result.

Simplified example:

using System.Collections.Concurrent;

var cache = new CustomCache();
var loader = new PackageLoader(cache);

// simulating concurrent callers: three calls for "lib1", one for "lib2"
var t1 = loader.GetAsync("lib1");
var t2 = loader.GetAsync("lib1");
var t3 = loader.GetAsync("lib2");
var t4 = loader.GetAsync("lib1");

await Task.WhenAll(t1, t2, t3, t4);

// lifetime: scoped
class PackageLoader
{
    private readonly CustomCache _cache;

    public PackageLoader(CustomCache cache)
    {
        _cache = cache;
    }

    public Task GetAsync(string packageName)
    {
        // returns the existing in-flight (or completed) task if the package
        // was already requested; otherwise starts a new HTTP call
        return _cache.Dic.GetOrAdd(packageName, q => HttpCall(q));
    }

    private async Task HttpCall(string packageName)
    {
        var rand = new Random();
        // simulating an HTTP call
        await Task.Delay(rand.Next(1, 3) * 1000);
        Console.WriteLine($"{packageName} is loaded");
    }
}

// lifetime: singleton
class CustomCache
{
    public ConcurrentDictionary<string, Task> Dic = new();

    // some cleanup mechanism, probably run periodically
}

This way you only make one HTTP call per package. Just make sure to clean the cache once a task has completed or failed; see the sketch after the output below.

Output:

lib1 is loaded
lib2 is loaded
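
One possible shape for that cleanup (a sketch only, replacing the GetAsync above): remove an entry when its task faults, so a failed call can be retried, while successful results stay cached.

public Task GetAsync(string packageName)
{
    return _cache.Dic.GetOrAdd(packageName, key =>
    {
        var task = HttpCall(key);

        // Drop the entry if the call fails, so the next caller retries it.
        task.ContinueWith(
            t => _cache.Dic.TryRemove(key, out _),
            TaskContinuationOptions.OnlyOnFaulted);

        return task;
    });
}

Note that ConcurrentDictionary.GetOrAdd may invoke the value factory more than once if two callers race on the same key, so in rare cases two HTTP calls can still start; wrapping the task in a Lazy<Task>, as in the sketch after the question's comments, closes that gap.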
AliReza Sabouri

I think I was able to do it using SemaphoreSlim and a Dictionary (this example uses the .NET 6 Console Application template, so no Main method is needed):

var dataLoader = new DataLoader();

Console.WriteLine("Executing...");

var t1 = dataLoader.LoadAsync("1");
var t2 = dataLoader.LoadAsync("1");
var t3 = dataLoader.LoadAsync("9999");

Console.WriteLine("Done Loading...");

Console.WriteLine(t1 == t2);
Console.WriteLine(t1 == t3);

Console.WriteLine(await t1);
Console.WriteLine(await t2);
Console.WriteLine(await t3);

public class DataLoader
{
    private readonly SemaphoreSlim semaphoreSlim = new SemaphoreSlim(1, 1);

    private Dictionary<string, Task<string>> Data { get; set; }

    public DataLoader()
    {
        this.Data = new Dictionary<string, Task<string>>();
    }

    public Task<string> LoadAsync(string id)
    {
        // The semaphore only guards access to the dictionary; it is released
        // before the returned task completes, so callers for other ids are
        // blocked only for the short time it takes to look up or add a task.
        this.semaphoreSlim.Wait();

        try
        {
            // If a task for this id already exists (in flight or completed),
            // return that same instance instead of starting a new request.
            if (this.Data.TryGetValue(id, out var dataTask))
            {
                return dataTask;
            }

            // Start the request and cache the task itself, not its result.
            dataTask = GetString(id);

            this.Data.Add(id, dataTask);

            return dataTask;
        }
        finally
        {
            this.semaphoreSlim.Release();
        }
    }

    private async Task<string> GetString(string str)
    {
        // Simulates an HTTP call.
        await Task.Delay(5000);

        return str;
    }
}

Here's the result (notice that t1 and t2 are the same Task instance):

Executing...
Done Loading...
True
False
1
1
9999
Nelson Sousa
  • How will this work when your code is running on 2 servers? – Neil Dec 15 '21 at 22:39
  • It is not running on 2 servers. It's one server calling multiple times one API... – Nelson Sousa Dec 15 '21 at 22:40
  • If it's not going to ever run on 2 servers, then you probably won't even get this problem in the first place, outside of testing. It's a non-issue. – Neil Dec 15 '21 at 22:41
  • @Neil: Making a redundant HTTP request isn't a non-issue. – Eric J. Dec 15 '21 at 22:48
  • OP is not asking about redundant HTTP requests, it's about 2 HTTP requests making the same request on the same data source at the exact same time. With caching, it's a non-issue because it will probably never happen outside of testing. – Neil Dec 15 '21 at 22:51
  • No, I am actually asking about redundant HTTP requests. My API receives one request from the client and then performs 2 identical HTTP requests to some other API to get resources; that's what I am trying to avoid. – Nelson Sousa Dec 15 '21 at 22:54
  • And caching the result isn't enough: I'll still end up performing the 2 redundant HTTP requests, because when the second one is fired the first one hasn't completed and been saved to the cache yet. – Nelson Sousa Dec 15 '21 at 22:55