Reporting Overall Progress
A TPL Dataflow graph has a head block and a final block, and you know how many items you posted into the head. To report overall progress, count the messages that have reached the final block and compare that count with the number of messages posted into the head.
This works trivially if every block is 1:1, that is, each message in produces exactly one message out. If the graph contains a one-to-many block, you will need to adjust the expected total (and therefore your progress reporting) accordingly.
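As a minimal sketch of the counting approach, assuming a purely 1:1 graph: the class name OverallProgressSketch, the two-block graph, and the doubling transform are all hypothetical stand-ins for your real pipeline. Only the final block touches the counter, and it uses Interlocked.Increment so the count stays correct even if that block is later configured to run in parallel.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class OverallProgressSketch
{
    // Posts every input into a small 1:1 graph and reports the completed
    // fraction each time a message reaches the final block.
    // Returns the number of messages that reached the final block.
    public static async Task<int> RunAsync(int[] inputs, IProgress<double> progress)
    {
        int total = inputs.Length; // messages posted into the head
        int completed = 0;         // messages that have reached the final block

        // Hypothetical 1:1 head block standing in for the real processing steps.
        var head = new TransformBlock<int, int>(n => n * 2);

        var final = new ActionBlock<int>(n =>
        {
            int done = Interlocked.Increment(ref completed);
            progress.Report((double)done / total);
        });

        head.LinkTo(final, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (int input in inputs)
        {
            await head.SendAsync(input);
        }

        head.Complete();
        await final.Completion;
        return completed;
    }
}
```

With a one-to-many block in the graph, total would have to be the expected number of messages arriving at the final block rather than the number posted into the head.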
Reporting Job Stage Progress
If you wish to present the progress of a job as it travels through the graph, you will need to pass job details to each block, not just the data that the block itself needs. A job is a single task that spans all of steps 1-6 listed in your question.
So, for example, step 2 may require image data in order to perform alignment, but it does not care about filenames, how many steps there are in the job, or anything else job related. The block input alone carries too little detail to know the state of the current job, and it makes it difficult to look up the original job. You could refer to some external dictionary, but graphs are best designed when they are isolated and deal only with the data passed into each block.
So a simple example would be to change this minimal code from:
var alignmentBlock = new TransformBlock<Image, Image>(n => { ... });
...to:
var alignmentBlock = new TransformBlock<Job, Job>(job =>
{
    job.Stage = Stages.Aligning;
    // perform alignment here
    job.Aligned = ImageAligner.Align(job.Image, ...);
    // report progress
    job.Stage = Stages.AlignmentComplete;
    // hand the job on to the next block
    return job;
});
...and repeat the process for the other blocks.
The Stage property could raise a PropertyChanged notification or use any other form of notification pattern suitable for your UI.
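A minimal sketch of what a Job with a notifying Stage property might look like, using the standard INotifyPropertyChanged interface. The Queued enum member is an assumption added for illustration, and the image properties are omitted because their types are not given here.

```csharp
using System.ComponentModel;

// Queued is a hypothetical initial stage; the other two names come from the block above.
public enum Stages { Queued, Aligning, AlignmentComplete }

public class Job : INotifyPropertyChanged
{
    private Stages stage = Stages.Queued;

    public event PropertyChangedEventHandler PropertyChanged;

    public Stages Stage
    {
        get { return stage; }
        set
        {
            if (stage == value) return; // no notification when nothing changed
            stage = value;
            PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(nameof(Stage)));
        }
    }
}
```

Note that the event is raised on whichever thread the block happens to run on, so depending on your UI framework you may need to marshal the notification back to the UI thread.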
Notes
You will notice that I introduced a Job class that is passed as the only argument to each block. Job contains the input data for the block and also acts as a container for the block's output.
This will work, but the purist in me feels that it would be better to keep job metadata separate from the TPL block's input and output; otherwise there is potential for state damage from multiple threads.
To get around this, you may want to consider using Tuple<> and passing that into the block, e.g.
var alignmentBlock = new TransformBlock<Tuple<Job, UnalignedImages>,
Tuple<Job, AlignedImages>>(n => { ... });
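A sketch of how the body of such a block might be filled in, under the assumption that the job metadata simply passes through untouched while only the payload changes type. The empty Job, UnalignedImages and AlignedImages classes and the TupleBlockSketch wrapper are hypothetical placeholders, and the alignment itself is replaced by a stand-in.

```csharp
using System;
using System.Threading.Tasks.Dataflow;

// Hypothetical placeholder types; the real ones would carry job metadata and image data.
public class Job { public string Name { get; set; } }
public class UnalignedImages { }
public class AlignedImages { }

public static class TupleBlockSketch
{
    public static TransformBlock<Tuple<Job, UnalignedImages>, Tuple<Job, AlignedImages>> CreateAlignmentBlock()
    {
        return new TransformBlock<Tuple<Job, UnalignedImages>, Tuple<Job, AlignedImages>>(input =>
        {
            // Item1 is the job metadata; this block reads it but does not replace it.
            Job job = input.Item1;

            // Stand-in for the real alignment work, which would consume input.Item2.
            AlignedImages aligned = new AlignedImages();

            // Forward the same job alongside the new payload.
            return Tuple.Create(job, aligned);
        });
    }
}
```

The next block in the graph would then accept Tuple<Job, AlignedImages> as its input, so the job reference travels with the data without the block payload having to live inside Job.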