Use case:

  1. A one-time read of data set X (from the database) into a Collection C. [The collection could hold, say, 5000 items.]
  2. Use Collection C to process/enrich items in a Spring Batch step (say, enrichStep).

If C is much larger than what can be passed via the ExecutionContext, how can we make it available in the ItemProcessor of enrichStep?

ram
  • Why can't you use chunk processing? Do you need the full collection in the enrich step? – Luca Basso Ricci Jan 07 '14 at 15:28
  • Chunk processing is being done for the enrichStep. The Collection C may just hold about 5000 objects. I think we wouldn't want to load the Collection C from database for each chunk execution. – ram Jan 07 '14 at 15:31

1 Answer

In your enrichStep, register a StepExecutionListener and, in its beforeStep method, load the huge collection into a HugeCollectionBeanHolder bean.
This way the collection is loaded only once (when the step starts or restarts) and is never persisted into the execution context. Then wire the HugeCollectionBeanHolder into your enrich processor to access the collection.

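import java.util.Collection;

// Keeps the collection in memory so the listener can fill it and the processor can read it.
class HugeCollectionBeanHolder {
 private Collection<Item> hugeCollection;

 public void setHugeCollection(Collection<Item> c) { this.hugeCollection = c; }
 public Collection<Item> getHugeCollection() { return this.hugeCollection; }
}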

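import org.springframework.batch.item.ItemProcessor;

class MyProcessor implements ItemProcessor<Input,Output> {
 private HugeCollectionBeanHolder hcbh;

 // Setter used to inject the holder bean into the processor.
 public void setHugeCollectionBeanHolder(HugeCollectionBeanHolder bean) { this.hcbh = bean; }

 // other methods...
}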
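
The listener half of this approach can be sketched roughly as follows. To keep the example compilable without the framework on the classpath, it uses two simplified stand-in interfaces (SimpleStepListener, SimpleItemProcessor) in place of Spring Batch's StepExecutionListener and ItemProcessor, whose real method signatures differ. The names ReferenceDataHolder, ReferenceDataLoader, EnrichProcessor and EnrichDemo, the Map-based lookup, and the generated placeholder data are all assumptions made for illustration; a real loader would run the database query inside beforeStep.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins (NOT the Spring Batch interfaces) so the sketch is self-contained.
interface SimpleStepListener { void beforeStep(); }
interface SimpleItemProcessor<I, O> { O process(I item); }

// Shared holder; in a Spring context this would be a singleton bean.
class ReferenceDataHolder {
    private Map<Integer, String> byId = new HashMap<>();
    void setData(Map<Integer, String> data) { this.byId = data; }
    String lookup(int id) { return byId.get(id); }
}

// Fills the holder once, before the step starts processing chunks.
class ReferenceDataLoader implements SimpleStepListener {
    private final ReferenceDataHolder holder;
    int loadCount = 0; // counts loads, only to show the data is read once

    ReferenceDataLoader(ReferenceDataHolder holder) { this.holder = holder; }

    @Override
    public void beforeStep() {
        Map<Integer, String> data = new HashMap<>();
        // Placeholder for the one-time database read of data set X.
        for (int id = 1; id <= 5000; id++) {
            data.put(id, "name-" + id);
        }
        holder.setData(data);
        loadCount++;
    }
}

// Enriches each item from the in-memory holder; no database access per chunk.
class EnrichProcessor implements SimpleItemProcessor<Integer, String> {
    private final ReferenceDataHolder holder;

    EnrichProcessor(ReferenceDataHolder holder) { this.holder = holder; }

    @Override
    public String process(Integer id) { return id + ":" + holder.lookup(id); }
}

class EnrichDemo {
    // Imitates a chunk-oriented step: the listener runs once, then items are processed in chunks.
    static List<String> runStep(ReferenceDataLoader loader, EnrichProcessor processor,
                                List<Integer> items, int chunkSize) {
        loader.beforeStep();
        List<String> out = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            for (Integer item : items.subList(start, end)) {
                out.add(processor.process(item));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ReferenceDataHolder holder = new ReferenceDataHolder();
        ReferenceDataLoader loader = new ReferenceDataLoader(holder);
        List<String> out = runStep(loader, new EnrichProcessor(holder), List.of(1, 2, 3, 4, 5), 2);
        System.out.println(out + " loads=" + loader.loadCount);
    }
}
```

In an actual job, the loader would implement StepExecutionListener and be registered on enrichStep, with the holder injected into both the listener and the processor.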

You can also look at the related question "Spring Batch: what is the best way to use, the data retrieved in one TaskletStep, in the processing of another step".

Luca Basso Ricci