I would like some information about the data flow in a Spring Batch processing but fail to find what I am looking for on the Internet (despite some useful questions on this site).
I am trying to establish standards to use Spring Batch in our company and we are wondering how Spring Batch behaves when several processors in a step updates data on different data sources.
This question focuses on a chunked process but feel free to provide information on other modes.
From what I have seen (please correct me if I am wrong), when a line is read, it follows the whole flow (reader, processors, writer) before the next is read (as opposed to a silo-processing where reader would process all lines, send them to the processor, and so on).
In my case, several processors read data (in different databases) and updates them in the process, and finally the writer inserts data into yet another DB. For now, the JobRepository is not linked to a database, but that would be an independent one, making the thing still a bit more complex.
This model cannot be changed since the data belongs to several business areas.
How is the transaction managed in this case? Is the data committed only once the full chunk is processed? And then, is there a 2-phase commit management? How is it ensured? What development or configuration should be made in order to ensure the consistency of data?
More generally, what would your recommendations be in a similar case?