I have one loop per user, each started with Task.Factory.StartNew(). Each pass of the loop does a lightweight job and then sleeps for a few seconds.

Does this kind of code work if thousands or millions of users are working at the same time? Will managing this many threads put a significant load on the server? Is it better to have a single thread doing this job for all users?

This is what the code currently does:

Task.Factory.StartNew(() =>
{
  // begin loop
  // synchronous web call
  // process and update result in DB
  // exit condition
  // sleep for a few minutes
  // end loop
});
Mahesh P.
  • Is there a way to avoid the task sleeping? Could you shift to an asynchronous pattern rather than pausing the thread? Do you need that delay and if so why? A lot of threads sleeping is more viable than one thread with repeated sleeps but it is very unusual to have a situation where sleeping is the best option. For threads, I mean. For me sleeping is a great option. – glenatron Apr 23 '20 at 10:28
  • Can you show an example of what you're currently doing? – WBuck Apr 23 '20 at 10:32
  • @WBuck I have added the pseudo code of what is happening currently – Mahesh P. Apr 23 '20 at 13:14
  • @glenatron the delay is there to get the updates from web call after some time. It is a polling loop. – Mahesh P. Apr 23 '20 at 13:16
  • Short answer - small task per user otherwise it isn't scalable and by that I mean moving its processing to other servers as the load increases. – ChrisBD Apr 23 '20 at 13:17
  • @MaheshP. rather than trying to treat an asynchronous task in a synchronous way, you would almost certainly be better off treating it as asynchronous: https://learn.microsoft.com/en-us/dotnet/standard/asynchronous-programming-patterns/polling-for-the-status-of-an-asynchronous-operation – glenatron Apr 23 '20 at 14:11
  • @glenatron the link is for code that polls and waits for async task to complete. My use case is where I _have_ to poll every n-minutes to get latest data but I am doing it per user. – Mahesh P. Apr 23 '20 at 15:09
  • It seems that you use the terms "task" and "thread" interchangeably. This is quite confusing, because [these things are different](https://stackoverflow.com/questions/13429129/task-vs-thread-differences). – Theodor Zoulias Apr 23 '20 at 18:58

1 Answer

You can have millions of long-running tasks, but you can't have millions of long-running threads (unless you own a machine with terabytes of RAM, since each thread reserves roughly 1 MB for its stack by default). The way to have so many tasks is to make them asynchronous: instead of blocking a thread with Thread.Sleep, each task asynchronously awaits a Task.Delay, so no thread is held while it waits. Here is an example:

var cts = new CancellationTokenSource();
CancellationToken ct = cts.Token;
Task[] tasks = Enumerable.Range(1, 1_000_000).Select(index => Task.Run(async () =>
{
    await Task.Delay(index, ct); // Initial delay to spread things out
    while (true)
    {
        var webResult = await WebCallAsync(index, ct); // asynchronous web call
        await DbUpdateAsync(webResult, ct); // update result in DB
        await Task.Delay(1000 * 60 * 10, ct); // do nothing for 10 minutes
    }
})).ToArray();
Task.WaitAll(tasks);

The purpose of the CancellationTokenSource is to allow cancelling all tasks at any time by calling cts.Cancel(). Combining Task.Delay with cancellation creates an unexpected overhead though, because the cancellation is propagated through OperationCanceledException exceptions, and one million exceptions put considerable stress on the .NET infrastructure. On my PC the overhead is about 50 seconds of 100% CPU consumption. If you do like the idea of using CancellationTokens, a workaround is to use an alternative to Task.Delay that doesn't throw exceptions. Here is an implementation of this idea:

/// <summary>Returns a <see cref="Task"/> that will complete with a result of true
/// if the specified number of milliseconds elapsed successfully, or false
/// if the cancellation token is canceled.</summary>
private static async Task<bool> NonThrowingDelay(int millisecondsDelay,
    CancellationToken cancellationToken = default)
{
    if (cancellationToken.IsCancellationRequested) return false;
    if (millisecondsDelay == 0) return true;
    var tcs = new TaskCompletionSource<bool>();
    using (cancellationToken.Register(() => tcs.TrySetResult(false)))
    using (new Timer(_ => tcs.TrySetResult(true), null, millisecondsDelay, Timeout.Infinite))
        return await tcs.Task.ConfigureAwait(false);
}

And here is how you could use the NonThrowingDelay method for creating 1,000,000 tasks that can be canceled (almost) instantly:

var cts = new CancellationTokenSource();
CancellationToken ct = cts.Token;
Task[] tasks = Enumerable.Range(1, 1_000_000).Select(index => Task.Run(async () =>
{
    if (!await NonThrowingDelay(index, ct)) return; // Initial delay
    while (true)
    {
        var webResult = await WebCallAsync(index, ct); // asynchronous web call
        await DbUpdateAsync(webResult, ct); // update result in DB
        if (!await NonThrowingDelay(1000 * 60 * 10, ct)) break; // 10 minutes
    }
})).ToArray();
Task.WaitAll(tasks);
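
The question also asks whether a single loop for all users would be better. For comparison, here is a minimal sketch of that alternative, not a definitive implementation. The names SinglePoller, PollAllOnceAsync, RunAsync and the maxConcurrency parameter are hypothetical, and WebCallAsync and DbUpdateAsync are stubbed placeholders for the real calls. One loop polls every user per pass, a SemaphoreSlim caps how many calls are in flight, and failures are counted per pass instead of surfacing as one faulted task per user:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SinglePoller
{
    // Hypothetical stand-ins for the real web call and DB update.
    public static Task<string> WebCallAsync(int userId, CancellationToken ct)
        => Task.FromResult("result-" + userId);

    public static Task DbUpdateAsync(string webResult, CancellationToken ct)
        => Task.CompletedTask;

    // Polls every user once, with at most maxConcurrency calls in flight.
    // Returns the number of users whose poll failed in this pass.
    public static async Task<int> PollAllOnceAsync(IEnumerable<int> userIds,
        int maxConcurrency, CancellationToken ct)
    {
        int failures = 0;
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var work = userIds.Select(async userId =>
            {
                await gate.WaitAsync(ct);
                try
                {
                    var webResult = await WebCallAsync(userId, ct);
                    await DbUpdateAsync(webResult, ct);
                }
                catch (Exception ex) when (!(ex is OperationCanceledException))
                {
                    // One counter per pass instead of one faulted task per user.
                    Interlocked.Increment(ref failures);
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();
            await Task.WhenAll(work);
        }
        return failures;
    }

    // The single long-running loop: poll everyone, wait, repeat until canceled.
    public static async Task RunAsync(Func<IEnumerable<int>> getActiveUsers,
        TimeSpan interval, CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            await PollAllOnceAsync(getActiveUsers(), 100, ct);
            await Task.Delay(interval, ct);
        }
    }
}
```

With this shape there is only one Task.Delay to cancel, at the cost of all users sharing the same polling schedule.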
Theodor Zoulias
  • Bravo! This is good code. You have answered most of my questions indirectly. However, after a lot of reading on the internet I have decided to go with a single thread per user for scalability. So the only change I'll make here is to omit the Enumerable.Range.Select loop and add tasks dynamically on user login. +1 on the NonThrowingDelay implementation – Mahesh P. Apr 24 '20 at 10:06
  • Regarding your main question, personally I think I would prefer a single task that does the job for all users, instead of spawning so many tasks (one per user). Although it's possible, it seems a bit chaotic to me. If something goes wrong, suddenly I would have thousands (or millions) of tasks completing with exceptions. It would be a logging nightmare! – Theodor Zoulias Apr 24 '20 at 10:14
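
The comments above mention adding tasks dynamically on user login rather than creating them all upfront. A minimal sketch of that idea follows, under stated assumptions: UserPollingRegistry, Start and Stop are hypothetical names, and the pollOnce delegate stands in for the web call plus DB update. Each user gets an async polling loop with its own CancellationTokenSource, so a single user's loop can be stopped on logout:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class UserPollingRegistry
{
    private readonly ConcurrentDictionary<int, CancellationTokenSource> _running
        = new ConcurrentDictionary<int, CancellationTokenSource>();
    private readonly Func<int, CancellationToken, Task> _pollOnce;
    private readonly TimeSpan _interval;

    // pollOnce is a hypothetical delegate: web call + DB update for one user.
    public UserPollingRegistry(Func<int, CancellationToken, Task> pollOnce,
        TimeSpan interval)
    {
        _pollOnce = pollOnce;
        _interval = interval;
    }

    public int Count => _running.Count;

    // Call on login. Returns false if a loop for this user is already registered.
    public bool Start(int userId)
    {
        var cts = new CancellationTokenSource();
        if (!_running.TryAdd(userId, cts))
        {
            cts.Dispose(); // never used, safe to dispose
            return false;
        }
        _ = Task.Run(() => LoopAsync(userId, cts.Token));
        return true;
    }

    // Call on logout. Returns false if no loop was registered for this user.
    public bool Stop(int userId)
    {
        if (!_running.TryRemove(userId, out var cts)) return false;
        cts.Cancel(); // disposal of the source is omitted for brevity
        return true;
    }

    private async Task LoopAsync(int userId, CancellationToken ct)
    {
        try
        {
            while (true)
            {
                await _pollOnce(userId, ct);
                await Task.Delay(_interval, ct);
            }
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown after Stop(userId).
        }
        // Any other exception ends this user's loop; a real implementation
        // would log it here. The entry stays registered until Stop is called.
    }
}
```

Because each loop awaits instead of sleeping, a registered user costs a small heap allocation rather than a dedicated thread.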