2

I'm using the Parallel class for my tasks and need to implement a progress percentage counter. Tried combining different locks but either get very slow performance or the result is like on the screenshot.

Here is an example of code that works but slow. Because I'm blocking the whole part.

Parallel.ForEach(bossUsers, StandardParallelOptions(), (user) =>
{
    logger.Trace($"Sync new user {user.PID}.");
    while (queue.Count >= MaxDegreeOfParallelism)
    {
        Thread.Sleep(1);
    }

    queue.Enqueue(user);


    CreateOrUpdateUserAsync(
            cardRepository,
            dbScope,
            user,
            logger,
            cancellationToken)
        .ConfigureAwait(false);

    lock (locker)
    {
        var current = (int)(((progress) / (decimal)bossUsers.Count) * 100);
        if (current > 0 && current % 10 == 0)
        {
            if (last != current)
                logger.Info($"Progress: {current}%");
            last = current;
        }
    }

    logger.Trace($"Sync user {user.PID} complete.");
});

And if I remove the lock, I get this many similar lines

progress

Please advise how to make a lock so that it does not critically slow down the service. Thx!

Peter Csala
  • 17,736
  • 16
  • 35
  • 75
  • Something is incrementing `progress`. Why doesn't _it_ do the logging? – mjwills Sep 19 '21 at 13:52
  • `but either get very slow performance` How fast is it without locks? How fast is it with locks? **Be specific**. – mjwills Sep 19 '21 at 13:52
  • Yes progress is incremented inside CreateOrUpdateUserAsync method via Interlocked.Increment(ref progress) – Евгений Sep 19 '21 at 13:54
  • So why doesn't _it_ do the logging? It knows the old progress number and the new one. It knows that it just ticked from 59% to 60%. So it could do the logging _with no extra locking at all_. – mjwills Sep 19 '21 at 13:54
  • @mjwills without lock = 5s and with 22s – Евгений Sep 19 '21 at 13:59
  • Cool - and when you tried my suggestion? – mjwills Sep 19 '21 at 14:00
  • @mjwills The point is that the incoming collection is several thousand and when converting to percentages we get several times 10%.I decided to store the last value, but apparently other threads have time to read it too before it changes. – Евгений Sep 19 '21 at 14:04
  • anyway need to lock( – Евгений Sep 19 '21 at 14:07
  • As a side note, what strikes me as odd in your code is the `Thread.Sleep(1);` in a loop, and the `CreateOrUpdateUserAsync` asynchronous method that is launched in a fire-and-forget fashion. FYI the `Parallel` class is currently (.NET 5) not suitable for asynchronous work. For limiting the concurrency of asynchronous operations you can look [here](https://stackoverflow.com/questions/10806951/how-to-limit-the-amount-of-concurrent-async-i-o-operations/) or [here](https://stackoverflow.com/questions/11564506/nesting-await-in-parallel-foreach). – Theodor Zoulias Sep 19 '21 at 15:14
  • 1
    You don't need to lock. You know the size of the collection. You know the incremented progress. You can calculate the percentage from the current progress indicator and indicator - 1. If the previous values was 59% and the new one is 60% (i.e. a multiple of 10) then hooray - you need to log. **No locking is required**. – mjwills Sep 19 '21 at 23:24
  • 1
    Is `CreateOrUpdateUserAsync` I/O or CPU bound? If former then `Parallel.ForEach` is not your friend, because it was designed for CPU bound operations. If you look at the [overloads](https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.parallel.foreach?view=net-5.0) then you can see none of them anticipates an async function. In .NET 6 there will be a [`Parallel.ForeachAsync`](https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.parallel.foreachasync?view=net-6.0), which can be used for I/O bound operations as well. – Peter Csala Sep 20 '21 at 07:09
  • @PeterCsala No my friend. It just win service directly relation with CPU. And the Parallel as a lifeline for this job. Thx for news about ForeachAsync but its more relevant for the client side i think. – Евгений Sep 20 '21 at 08:29
  • @Евгений Could you please share with us the implementation of the `CreateOrUpdateUserAsync` method? – Peter Csala Sep 20 '21 at 08:36
  • ofc. https://codeshare.io/AdDeOE – Евгений Sep 20 '21 at 10:02
  • @Евгений Sorry mate, but your provided code is I/O bound. `GetGuidByBossObject(dbScope ...`, `UpdateUserAsync(dbScope ...`, `CreateUserAsync(cardRepository, dbScope, ...` these are performing database operations, which are I/O operations. There is no CPU heavy calculation inside the `CreateOrUpdateUserAsync` according to my understanding. – Peter Csala Sep 20 '21 at 15:32
  • @PeterCsala there are no mathematical calculations. But in addition to getting data from the database, there is a hidden layer of platform operation and there is a layer of data processing with sorts, type conversions and checks. In my way, using parallelism on large collections with the option MaxDegreeOfParallelism = Environment.ProcessorCount * 4 gives me a significant performance gain. – Евгений Sep 20 '21 at 16:15
  • Exactly, if it were CPU bound than setting the level of parallelism higher than the number of core will not give any performance boost. Your network driver can handle greater number of concurrent calls than your CPU cores. – Peter Csala Sep 20 '21 at 16:19

1 Answers1

1

So my quickest solution is to block only the assignment of the variable "last". I achieve the desired result without significant performance loss. But still closely watching the log I can see how the threads are slowing down.

I do not block calculations, but only assignment and output to the log if to retrieve the remainder.

var current = (int)((progress / (decimal)syncUsers.Count) * 100);

if (current > 0 && current % 10 == 0)
{
    lock (locker) // lock here only
    {
        if (last != current) logger.Info($"Complete: {current}%");
        last = current;
    }
}
Peter Csala
  • 17,736
  • 16
  • 35
  • 75