I'd like to know whether the following approach is a good way to implement a producer/consumer pattern in C# (.NET 4.6.1).
Description of what I want to do:
I want to read files, perform calculations on the data within, and save the results. Each file has an origin (a device, e.g. a data logger), and depending on that origin, different calculations and output formats should be used. A file contains different values, e.g. temperature readings from several sensors. It is important that the calculations have state: for instance, this could be the last value of the previous calculation, e.g. if I want to sum all values from one origin.
I want to parallelize the processing per origin. All files from one origin need to be processed sequentially (or, more specifically, chronologically) and cannot be parallelized.
I think TPL Dataflow might be an appropriate solution for this.
This is the process I came up with:
The reading would be done by a `TransformBlock`. Next, I would create an instance of a class that performs the operations on the data for each origin. These instances get initialized with the necessary parameters, so they know how to process files from their origin. Then I would create a `TransformBlock` for each of these objects (so basically one per origin), and each of these `TransformBlock`s would execute a method of the corresponding object. The `TransformBlock` reading the files would be linked to a `BufferBlock`, which in turn is linked to each per-origin processing `TransformBlock`. The links would be conditional, so that each processing `TransformBlock` only receives the data meant for its origin. The outputs of the processing blocks would be linked to an `ActionBlock` that writes the output files. `MaxDegreeOfParallelism` is set to 1 for every block.
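Here is a minimal sketch of what I have in mind. `FileData`, `Result` and `OriginProcessor` are placeholder types I made up, and deriving the origin from the file name is just an assumption to keep the example self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class FileData { public string Origin; public string Path; public string Content; }
class Result { public string Origin; public string Output; }

// Stateful per-origin calculation; as an example it keeps a running sum
// across all files of its origin.
class OriginProcessor
{
    private double _sum; // state carried from one file to the next
    public string Origin { get; private set; }
    public OriginProcessor(string origin) { Origin = origin; }

    public Result Process(FileData file)
    {
        foreach (var line in file.Content.Split('\n'))
        {
            double value;
            if (double.TryParse(line, out value))
                _sum += value;
        }
        return new Result { Origin = Origin, Output = _sum.ToString() };
    }
}

static class Pipeline
{
    public static async Task RunAsync(IEnumerable<string> paths, string[] origins)
    {
        var oneAtATime = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 };
        var propagate = new DataflowLinkOptions { PropagateCompletion = true };

        // Reading block; deriving the origin from the file name is an assumption.
        var readBlock = new TransformBlock<string, FileData>(path => new FileData
        {
            Origin = Path.GetFileName(path).Split('_')[0],
            Path = path,
            Content = File.ReadAllText(path)
        }, oneAtATime);

        var bufferBlock = new BufferBlock<FileData>();
        readBlock.LinkTo(bufferBlock, propagate);

        // One stateful processing block per origin; MaxDegreeOfParallelism = 1
        // keeps the files of each origin strictly sequential.
        var processBlocks = origins.Select(origin =>
        {
            var processor = new OriginProcessor(origin);
            var block = new TransformBlock<FileData, Result>(
                f => processor.Process(f), oneAtATime);
            // Conditional link: this block only receives files of its origin.
            bufferBlock.LinkTo(block, propagate, f => f.Origin == origin);
            return block;
        }).ToArray();

        // Fallback link so files matching no origin cannot clog the buffer.
        bufferBlock.LinkTo(DataflowBlock.NullTarget<FileData>());

        var writeBlock = new ActionBlock<Result>(
            r => File.WriteAllText(r.Origin + ".out", r.Output), oneAtATime);

        foreach (var block in processBlocks)
            block.LinkTo(writeBlock); // completion is forwarded manually below

        foreach (var path in paths)
            readBlock.Post(path);
        readBlock.Complete();

        // The ActionBlock has several sources, so PropagateCompletion on those
        // links would complete it as soon as the first source finishes.
        await Task.WhenAll(processBlocks.Select(b => b.Completion));
        writeBlock.Complete();
        await writeBlock.Completion;
    }
}
```

One thing I noticed while sketching this: since several processing blocks feed the single `ActionBlock`, I forward completion manually via `Task.WhenAll` instead of setting `PropagateCompletion` on those links, because otherwise the writer would complete as soon as the first origin finishes.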
Is that a viable solution? I also thought about implementing this with Tasks and a `BlockingCollection`, but it seems Dataflow would be the easier approach.
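For comparison, this is roughly how I picture the Tasks-plus-`BlockingCollection` variant (reusing the placeholder types from the sketch above; `ReadFile` stands in for the actual reading code):

```csharp
// Requires: using System.Collections.Concurrent;
// One queue plus one long-running consumer task per origin, so the
// per-origin chronological order is preserved.
var queues = origins.ToDictionary(
    o => o,
    o => new BlockingCollection<FileData>(boundedCapacity: 10));

var consumers = origins.Select(origin => Task.Run(() =>
{
    var processor = new OriginProcessor(origin); // stateful, as above
    foreach (var file in queues[origin].GetConsumingEnumerable())
    {
        var result = processor.Process(file);
        File.WriteAllText(result.Origin + ".out", result.Output);
    }
})).ToArray();

// Producer: read each file and route it to the queue of its origin.
foreach (var path in paths)
{
    var file = ReadFile(path); // hypothetical helper, see the sketch above
    queues[file.Origin].Add(file);
}
foreach (var queue in queues.Values)
    queue.CompleteAdding();
Task.WaitAll(consumers);
```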
Additional Information:
The files being processed may be too large or too numerous to be loaded all at once. Reading and writing should happen concurrently with the processing. Since I/O takes time, and since data needs to be collected after processing to form an output file, buffering is essential.
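That is why I would bound the blocks' capacity, roughly like this (reusing the blocks from the first sketch; the value 10 is just a placeholder to tune against the actual file sizes):

```csharp
// BoundedCapacity caps how many items a block holds in memory at once.
var boundedExec = new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = 1,
    BoundedCapacity = 10
};
var bufferBlock = new BufferBlock<FileData>(
    new DataflowBlockOptions { BoundedCapacity = 10 });

// With bounded blocks, Post() returns false when a block is full, so the
// producer uses SendAsync(), which asynchronously waits for room instead.
foreach (var path in paths)
    await readBlock.SendAsync(path);
```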