I'm working on parallel workloads where each object or Task reports its own individual progress, and I want to report the collective progress of the task as a whole.
For example, imagine I have 10 Work objects which all report individual progress. They contain 0-100 "tasks" that must be completed.
If we were to iterate linearly over each of the Work objects, we could easily report our progress and see output something like this:
Work item #1 of 10 is currently 1 of 100 tasks completed.
Work item #1 of 10 is currently 2 of 100 tasks completed.
...
Work item #10 of 10 is currently 100 of 100 tasks completed.
However, when running in parallel, the output would look something like this:
Work item #1 of 10 is currently 1 of 100 tasks completed.
Work item #4 of 10 is currently 16 of 100 tasks completed.
Work item #7 of 10 is currently 4 of 100 tasks completed.
...
Work item #10 of 10 is currently 100 of 100 tasks completed.
The problem I'm trying to solve is aggregating all progress in parallel loops so that the output to the user is more akin to "1/1000" or "10/1000", representing the total amount of work accomplished, with the numerator updating as work continues.
I would expect there's a solution or pattern that fits regardless of whether I use async/await or the Task-based Asynchronous Pattern (I'm using both), and I'm hoping there are already ways to handle this in the .NET Framework that I haven't discovered.
Using this simple (pseudocode) example based on Parallel.ForEach from the Task Parallel Library (TPL):
Parallel.ForEach(WorkObject, wo =>
{
    // Perhaps each WorkObject has a "ProgressChanged" delegate that fires progress notifications.
    wo.ProgressChanged += delegate (int currentProgress, int totalProgress)
    {
        ReportProgress($"Work item #{wo.ID} of {WorkObject.Count} is currently {currentProgress} of {totalProgress} tasks completed.");
    };
    // Or perhaps using IProgress<T> or Progress<T>?
    // wo.PerformWork(/*IProgress<T> or Progress<T>, etc.*/);
});
We can iterate in parallel, and progress updates/notifications will come in as each thread completes a unit of work.
How can we effectively merge the progress of all of the WorkObjects so that we can report a more uniform "1/1000" completed?
The problem is that each WorkObject could have a varying number of "jobs" to complete, and we could have a varying number of WorkObjects that need to work. If one simply adds up the numerator and denominator from all WorkObjects as each progress notification comes in (assuming they update after each unit of work is completed), then by the end of the parallel workload the progress notification would reflect something like "1000/100,000" instead of "1000/1000".
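To make the arithmetic concrete, here is a minimal sketch of that naive accumulation, assuming 10 work items of 100 tasks each and one notification per completed task. The class and method names (`NaiveAggregation`, `Run`) are made up for illustration, and the loops run sequentially because only the totals matter here.

```csharp
using System;

static class NaiveAggregation
{
    // Simulates adding the numerator and denominator of every notification.
    // workItems and tasksPerItem are illustrative values, not a real API.
    public static (int Completed, int Total) Run(int workItems, int tasksPerItem)
    {
        int completed = 0;
        int total = 0;
        for (int item = 0; item < workItems; item++)
        {
            for (int task = 1; task <= tasksPerItem; task++)
            {
                completed += 1;         // one more unit of work finished
                total += tasksPerItem;  // the flaw: the item's total is re-added on every notification
            }
        }
        return (completed, total);
    }

    static void Main()
    {
        var (completed, total) = Run(10, 100);
        Console.WriteLine($"{completed}/{total}"); // prints 1000/100000
    }
}
```

The denominator is counted once per notification (10 × 100 × 100) rather than once per work item (10 × 100), which is where the inflated "1000/100,000" comes from.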
It seems that we need a way to keep track of current progress, X, as well as total progress, Y, to form a coherent message for the user about the overall progress state (X of Y completed).
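One way this could look, as a rough sketch rather than a known framework feature: compute Y once up front by summing each item's task count, and keep X in a single shared counter that every worker bumps with `Interlocked.Increment`. This assumes each item's task count is known before the loop starts. The names `SharedCounterProgress`, `Run`, `taskCounts`, and `report` are hypothetical stand-ins, with `taskCounts[i]` playing the role of a WorkObject's number of tasks.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class SharedCounterProgress
{
    // taskCounts[i] is the number of tasks in hypothetical work item i.
    // report receives (X, Y) after every completed unit of work.
    public static (int Completed, int Total) Run(int[] taskCounts, Action<int, int> report)
    {
        int total = taskCounts.Sum(); // Y: fixed before any work starts
        int completed = 0;            // X: shared by all workers

        Parallel.ForEach(taskCounts, taskCount =>
        {
            for (int i = 0; i < taskCount; i++)
            {
                // ... one unit of real work would go here ...
                int now = Interlocked.Increment(ref completed); // thread-safe X++
                report(now, total);
            }
        });

        // Parallel.ForEach blocks until every item is done, so this read is safe.
        return (completed, total);
    }

    static void Main()
    {
        int[] counts = Enumerable.Repeat(100, 10).ToArray();
        var (completed, total) = Run(counts, (x, y) =>
        {
            // Called concurrently from worker threads; messages may interleave.
            if (x % 100 == 0) Console.WriteLine($"{x}/{y}");
        });
        Console.WriteLine($"Finished: {completed}/{total}");
    }
}
```

The `report` callback runs on whichever worker thread finished the unit of work, so anything it touches has to be thread-safe. If a `Progress<T>` were used instead, its handler would be posted to the captured synchronization context (or the thread pool when there is none), so updates might not arrive in order.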
Is there an existing model (in the Framework or otherwise) to do this?
My current thought is to create a data structure that records the Thread ID of each thread executing in parallel and tracks each thread's progress in it (as an X/Y value). Then, as each thread posts a progress update, I would iterate over the data structure and sum the X and Y values from every thread to generate a total "X/Y" to display to the user.
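A minimal sketch of that idea might look like the following. Note one deliberate substitution: it keys the structure by work item index rather than Thread ID, because `Parallel.ForEach` and `Parallel.For` can run several items on the same thread. All names here (`PerItemProgress`, `Run`, `taskCounts`, `report`) are hypothetical, and every item is registered before the loop starts so that the summed denominator is stable from the first update.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

static class PerItemProgress
{
    // taskCounts[i] is the number of tasks in hypothetical work item i.
    public static (int Completed, int Total) Run(int[] taskCounts, Action<int, int> report)
    {
        // Key: work item index. Value: that item's own X and Y.
        var progress = new ConcurrentDictionary<int, (int Done, int Total)>();
        for (int id = 0; id < taskCounts.Length; id++)
            progress[id] = (0, taskCounts[id]);

        Parallel.For(0, taskCounts.Length, id =>
        {
            for (int done = 1; done <= taskCounts[id]; done++)
            {
                // ... one unit of real work would go here ...
                progress[id] = (done, taskCounts[id]); // only this worker writes this key

                // Values returns a snapshot copy, so summing it is safe
                // while other workers keep updating their own entries.
                var snapshot = progress.Values;
                report(snapshot.Sum(p => p.Done), snapshot.Sum(p => p.Total));
            }
        });

        var final = progress.Values;
        return (final.Sum(p => p.Done), final.Sum(p => p.Total));
    }

    static void Main()
    {
        var (completed, total) = Run(new[] { 100, 50, 7 }, (x, y) => { });
        Console.WriteLine($"{completed}/{total}"); // prints 157/157
    }
}
```

Every update walks all entries, so the cost per notification grows with the number of work items. In exchange, the per-item X/Y values stay available for a detailed display alongside the overall total.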
But surely this problem is being faced by developers every day—so there must be another way?