I have a Spring Batch step that reads from a file, processes the records, and writes to a file using chunk processing. The file is expected to contain millions of large records. I read that Spring Batch holds [chunk-size] processed records in memory before passing them to the writer.
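For reference, here is a minimal sketch of how the step is configured, assuming Spring Batch 5's builder API; the reader/processor/writer bean names and the record types are placeholders for my actual components:

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.transaction.PlatformTransactionManager;

@Bean
public Step fileProcessingStep(JobRepository jobRepository,
                               PlatformTransactionManager transactionManager) {
    // chunk(1000) sets both the number of records buffered in memory
    // and the commit interval -- as far as I can tell, they are one knob.
    return new StepBuilder("fileProcessingStep", jobRepository)
            .<InputRecord, OutputRecord>chunk(1000, transactionManager)
            .reader(fileItemReader())
            .processor(recordProcessor())
            .writer(fileItemWriter())
            .build();
}
```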
To limit memory usage I kept the [chunk-size] small. This, however, increases the number of updates the step makes to the BATCH_STEP_EXECUTION metadata table to maintain the read and commit counts.
Given that I am reading from and writing to local files, these updates to a remote database server are relatively expensive. But if I increase the [chunk-size], memory usage goes up.
The commit-frequency doesn't matter much when writing to local files, so it is the metadata updates that are the problem for me. The step is restartable, so technically I have no need to record the intermediate commit counts.
I could just use a Map-based or in-memory database JobRepository, but I need the other information, such as the start/end times, to be persisted, and this concern applies to only a single step.
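For context, this is the alternative I am ruling out, sketched with an embedded H2 database swapped in as the batch DataSource (the schema script is the one shipped in spring-batch-core). It removes the remote round-trips, but the BATCH_* metadata, start/end times included, is lost when the JVM stops, and it affects every step, not just this one:

```java
import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;

@Bean
public DataSource batchDataSource() {
    // In-memory H2 database holding the Spring Batch metadata tables.
    // Fast, but nothing survives a restart, so restartability is lost too.
    return new EmbeddedDatabaseBuilder()
            .setType(EmbeddedDatabaseType.H2)
            .addScript("/org/springframework/batch/core/schema-h2.sql")
            .generateUniqueName(true)
            .build();
}
```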
Is there any configuration parameter that can turn off the intermediate commit-count updates to the job repository, or alternatively flush the chunk records from memory to storage earlier while still committing only at the chunk-size / commit-frequency boundary? Basically, I am asking whether there is something that separates chunk-size from commit-frequency.