What's the best way to pass a huge collection to a Spring Batch Step?

Question

Use case:

A one-time read of data set X (from a database) into a Collection C. [The collection could hold, say, 5000 items.]
Use Collection C to process/enrich items in a Spring Batch Step (say, enrichStep).

If C is much greater than what can be passed via ExecutionContext, how can we make it available in the ItemProcessor of the enrichStep?

Why can't you use chunk processing? Do you need the full collection in the enrich step? — Luca Basso Ricci, Jan 07 '14 at 15:28
Chunk processing is being done for the enrichStep. The Collection C may hold only about 5000 objects. I think we wouldn't want to load Collection C from the database for each chunk execution. — ram, Jan 07 '14 at 15:31

score 1 · Accepted Answer · edited May 23 '17 at 11:56

In your enrichStep, add a StepExecutionListener and, in its beforeStep method, load your huge collection into a HugeCollectionBeanHolder bean.
This way you load the collection only once (when the step starts or restarts) and without persisting it in the execution context. In your enrich processor, wire in the HugeCollectionBeanHolder to access the huge collection.

class HugeCollectionBeanHolder {
 Collection<Item> hugeCollection;

 void setHugeCollection(Collection<Item> c) { this.hugeCollection = c;}
 Collection<Item> getHugeCollection() { return this.hugeCollection;}
}

class MyProcessor implements ItemProcessor<Input,Output> {
 HugeCollectionBeanHolder hcbh;

 void setHugeCollectionBeanHolder(HugeCollectionBeanHolder bean) { this.hcbh = bean;}

 // other methods...
}
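The listener that fills the holder is described above but not shown, so here is a minimal sketch of the whole flow. To keep it runnable without Spring on the classpath, `StepListenerSketch` and `ItemProcessorSketch` are simplified stand-ins for Spring Batch's `StepExecutionListener` (whose real `beforeStep` takes a `StepExecution` argument) and `ItemProcessor`. The `Item` fields, the `HugeCollectionLoader` and `EnrichProcessor` classes, and the `Supplier` that plays the role of the database query are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.function.Supplier;

// Stand-in (assumption) for Spring Batch's StepExecutionListener.
interface StepListenerSketch {
    void beforeStep();
}

// Stand-in (assumption) for Spring Batch's ItemProcessor<I, O>.
interface ItemProcessorSketch<I, O> {
    O process(I item) throws Exception;
}

// Hypothetical reference-data row; the original answer leaves Item undefined.
final class Item {
    final int id;
    final String label;

    Item(int id, String label) {
        this.id = id;
        this.label = label;
    }
}

// Same holder as in the answer: a plain bean shared by listener and processor.
class HugeCollectionBeanHolder {
    private Collection<Item> hugeCollection;

    void setHugeCollection(Collection<Item> c) { this.hugeCollection = c; }
    Collection<Item> getHugeCollection() { return this.hugeCollection; }
}

// Loads the collection once, when the step starts (or restarts).
class HugeCollectionLoader implements StepListenerSketch {
    private final HugeCollectionBeanHolder holder;
    private final Supplier<Collection<Item>> query; // stand-in for a DAO/JDBC call

    HugeCollectionLoader(HugeCollectionBeanHolder holder, Supplier<Collection<Item>> query) {
        this.holder = holder;
        this.query = query;
    }

    @Override
    public void beforeStep() {
        // Store a read-only copy so the processor cannot modify shared data.
        holder.setHugeCollection(
            Collections.unmodifiableList(new ArrayList<>(query.get())));
    }
}

// Enriches each input id with the matching label from the preloaded collection.
class EnrichProcessor implements ItemProcessorSketch<Integer, String> {
    private final HugeCollectionBeanHolder holder;

    EnrichProcessor(HugeCollectionBeanHolder holder) {
        this.holder = holder;
    }

    @Override
    public String process(Integer id) {
        for (Item item : holder.getHugeCollection()) {
            if (item.id == id) {
                return id + ":" + item.label;
            }
        }
        return null; // in Spring Batch, returning null filters the item out
    }
}

class EnrichStepDemo {
    public static void main(String[] args) {
        HugeCollectionBeanHolder holder = new HugeCollectionBeanHolder();
        HugeCollectionLoader loader = new HugeCollectionLoader(
            holder, () -> Arrays.asList(new Item(1, "alpha"), new Item(2, "beta")));
        EnrichProcessor processor = new EnrichProcessor(holder);

        loader.beforeStep(); // the framework would call this once per step execution
        for (int id : new int[] {1, 2, 3}) {
            System.out.println(id + " -> " + processor.process(id));
        }
    }
}
```

In a real job the holder, the listener, and the processor would be Spring beans, with the listener registered on enrichStep. The linear scan is only for brevity; for frequent lookups, indexing the collection in a Map keyed by id would likely be a better fit.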

You can also look at the related question "Spring Batch: what is the best way to use, the data retrieved in one TaskletStep, in the processing of another step".

I don't think so, but I suppose that, when I answered this question, the OP was using the collection as a read-only collection. — Luca Basso Ricci, Apr 10 '19 at 09:48
