Using a JSR-352 batch job with Java EE, I'm trying to process items in chunks from a source, split across partitions. On a retriable exception I want to be able to return to the last checkpoint, so that I can get the items that were already read from the source.

The nature of the source is such that, in a parallel environment, I cannot request the same chunk of items twice. The only feasible way to get exactly the same items on a second read is to restart the whole job.

I need to write a generic, reusable ItemReader that can handle sources of this kind. In other words, I'm looking for a clean design and implementation for such a reader.

What I currently do is fetch the items at the beginning of readItem() if they have not yet been fetched for the current chunk, and then iterate through them one by one. To handle retriable exceptions I'm trying to use the ItemReader's checkpoint (the value returned by checkpointInfo() and passed back to open(...)).

The problem I'm facing is that checkpoints are loaded in the open(...) method, before readItem(), and are saved only after the chunk has completed successfully. As a result, I cannot save the chunk's items into a valid checkpoint before the chunk has to be retried because of a retriable exception.
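
To make the setup concrete, here is a rough sketch of the kind of reader described above. It is not the actual code: `ChunkSource`, `BufferingReader` and `ReaderState` are hypothetical names, and the local `ItemReader` interface is only a stand-in that mirrors the four methods of `javax.batch.api.chunk.ItemReader` so that the sketch compiles without the batch API on the classpath.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Stand-in mirroring javax.batch.api.chunk.ItemReader (assumption: declared
// locally only so this sketch compiles without the batch API available).
interface ItemReader {
    void open(Serializable checkpoint) throws Exception;
    void close() throws Exception;
    Object readItem() throws Exception;
    Serializable checkpointInfo() throws Exception;
}

// Hypothetical source: every call hands out a fresh batch of items,
// and a batch that was handed out cannot be requested again.
interface ChunkSource {
    List<String> fetchNext(int size);
}

// Hypothetical checkpoint payload: the fetched batch plus the read position.
class ReaderState implements Serializable {
    private static final long serialVersionUID = 1L;
    final ArrayList<String> items;
    final int position;

    ReaderState(List<String> items, int position) {
        this.items = new ArrayList<>(items);
        this.position = position;
    }
}

class BufferingReader implements ItemReader {
    private final ChunkSource source;
    private final int chunkSize;
    private List<String> buffer = new ArrayList<>();
    private int position = 0;

    BufferingReader(ChunkSource source, int chunkSize) {
        this.source = source;
        this.chunkSize = chunkSize;
    }

    @Override
    public void open(Serializable checkpoint) {
        // The runtime passes the last committed checkpoint, or null on a first start.
        if (checkpoint != null) {
            ReaderState state = (ReaderState) checkpoint;
            buffer = new ArrayList<>(state.items);
            position = state.position;
        }
    }

    @Override
    public void close() {
        // Nothing to release in this sketch.
    }

    @Override
    public Object readItem() {
        // Fetch a new batch only when the current one is used up.
        if (position >= buffer.size()) {
            buffer = new ArrayList<>(source.fetchNext(chunkSize));
            position = 0;
            if (buffer.isEmpty()) {
                return null; // null signals end of input to the runtime
            }
        }
        return buffer.get(position++);
    }

    @Override
    public Serializable checkpointInfo() {
        // Called by the runtime only when a chunk is about to be committed.
        return new ReaderState(buffer, position);
    }
}
```

With this shape, the state returned by checkpointInfo() after a successful chunk describes a batch that is already fully consumed. If the next chunk is rolled back, open(...) receives that older state, so the batch fetched for the failed chunk is no longer available to the reader, which is exactly the gap I'm trying to close.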

My question: is there a way to change the checkpoint behavior so that a checkpoint is saved after the initial readItem()? Or do you know of any other clean strategy that avoids additional listeners or userTransientData, which would make the reader hard to integrate into other batch job steps with the same read behavior?

  • Hi, I don't quite see what your problem is with using checkpoints together with retryable exceptions. I can see you put a good amount of thought into the question, but it might need some code snippets too. I'm guessing you're using retry with rollback (retry without rollback is much simpler: the reader/processor/writer just gets called again). With retry with rollback you would have to start the chunk over again. Is the problem that you were trying to resume from within the chunk? Retry with rollback lets you avoid restarting the job, but you do have to redo the chunk. – Scott Kurz Jun 07 '23 at 01:50
