9

I'm trying to build a Spring Batch job and I have no experience with the framework.

Is it possible to pass information from one batch step to the next, or must steps be completely independent?

For example if I have

    <batch:step id="getSQLs" next="runSQLs">
        <batch:tasklet transaction-manager="TransactionManager"
            ref="runGetSQLs" />
    </batch:step>

    <batch:step id="runSQLs">
        <batch:tasklet transaction-manager="TransactionManager"
            ref="runRunSQLs" />
    </batch:step>

getSQLs triggers a bean whose class generates a List of type String. Is it possible to reference that list from the bean triggered by runSQLs? ("triggered" may not be the right term but I think you know what I mean)

UPDATE: So getSQLs step triggers this bean:

<bean id="runGetSQLs" class="myTask"
    scope="step">
    <property name="filePath" value="C:\Users\username\Desktop\sample.txt" />
</bean>

which triggers myTask class which executes this method:

@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {

    ExecutionContext stepContext = this.stepExecution.getExecutionContext();
    stepContext.put("theListKey", sourceQueries);

    return RepeatStatus.FINISHED;
}

Do I need to somehow pass stepExecution to the execute method?

user2665166

3 Answers

14

Spring Batch supports passing data to later job steps through the ExecutionContext, more precisely the job-level ExecutionContext. The following is based on the example in the official documentation, which states:

To make the data available to future Steps, it will have to be "promoted" to the Job ExecutionContext after the step has finished. Spring Batch provides the ExecutionContextPromotionListener for this purpose.
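As a rough illustration of what "promotion" means, here is a framework-free sketch that models the two contexts as plain maps. The class `PromotionSketch` and its `promote` method are hypothetical stand-ins invented for this example, not Spring Batch types; the real listener also takes the step's exit status into account, which this sketch ignores.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of key promotion. Hypothetical class, not part of Spring Batch.
class PromotionSketch {

    // Copies only the listed keys from the step-scoped map into the job-scoped map.
    // Keys that the step never set are skipped.
    static void promote(Map<String, Object> stepContext,
                        Map<String, Object> jobContext,
                        String... keys) {
        for (String key : keys) {
            if (stepContext.containsKey(key)) {
                jobContext.put(key, stepContext.get(key));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> jobContext = new HashMap<>();

        // First step: writes into its own step-scoped context.
        Map<String, Object> getSqlsContext = new HashMap<>();
        getSqlsContext.put("theListKey", Arrays.asList("SELECT 1", "SELECT 2"));
        getSqlsContext.put("scratch", "stays in the step context");

        // End of the first step: only the configured key is promoted.
        promote(getSqlsContext, jobContext, "theListKey");

        // Second step: reads from the job-scoped context.
        @SuppressWarnings("unchecked")
        List<String> sqls = (List<String>) jobContext.get("theListKey");
        System.out.println(sqls);
        System.out.println(jobContext.containsKey("scratch"));
    }
}
```

The point of the model is that anything a step stores under a non-promoted key stays private to that step, while promoted keys become visible to every later step in the same job.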

Configure the listener on the step that shares data with later steps:

<batch:step id="getSQLs" next="runSQLs">
    <batch:tasklet transaction-manager="TransactionManager"
        ref="runGetSQLs" />
    <batch:listeners>
        <batch:listener>
            <beans:bean id="promotionListener" class="org.springframework.batch.core.listener.ExecutionContextPromotionListener">
                <beans:property name="keys" value="theListKey"/>
            </beans:bean>
        </batch:listener>
    </batch:listeners>
</batch:step>

<batch:step id="runSQLs">
    <batch:tasklet transaction-manager="TransactionManager"
        ref="runRunSQLs" />
</batch:step>

The data should be populated from your execution code block as follows:

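// Inside Tasklet.execute(contribution, chunkContext): the StepExecution
// is reachable through the ChunkContext, so no extra field is needed.
StepExecution stepExecution = chunkContext.getStepContext().getStepExecution();
ExecutionContext stepContext = stepExecution.getExecutionContext();
stepContext.put("theListKey", yourList);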

Then, in subsequent steps, the List can be retrieved in a method annotated with @BeforeStep, which runs before the step starts:

@BeforeStep
public void retrieveSharedData(StepExecution stepExecution) {
    JobExecution jobExecution = stepExecution.getJobExecution();
    ExecutionContext jobContext = jobExecution.getExecutionContext();
    // get() returns Object, so a cast to the stored type is required
    this.myList = (List<String>) jobContext.get("theListKey");
}
tmarwen
  • My code doesn't seem to recognize stepExecution. I imported org.springframework.batch.core.StepExecution. What am I missing here? – user2665166 Sep 21 '15 at 13:30
  • You should rather update the post with your code blocks and mention what you have done so far. – tmarwen Sep 21 '15 at 14:39
  • Updated. I also changed the step to match your example. – user2665166 Sep 21 '15 at 14:54
  • 4
    It's strange that the most common use case of passing data from one step to the next is not catered for and you have this weird promotion to job context solution. This solution doesn't work if you have a partitioner which then runs steps in parallel. – thedoctor Apr 19 '16 at 14:33
  • 1
    **@thedoctor** Also polluting DB `BATCH_JOB_EXECUTION_CONTEXT.SERIALIZED_CONTEXT` with intermediate data is not the best solution. Requirement to pass data between job steps means that you violate Batch ideology )) – gavenkoa Feb 20 '17 at 16:33
  • @gavenkoa Not passing data between jobs is obviously completely reasonable. But not passing data between steps? That doesn't make any sense, that's why they are steps, they all bring a job to completion, and there are tons of examples where you would want to continue where the last step left off. That's pretty rough.... – tfrascaroli Oct 24 '18 at 16:21
  • **@thedoctor** I mean passing big chunks of data. Imagine you collected 1 MiB from DB and printed to PDF (in memory) and need to upload to FTP. That all should be done in one step, otherwise you store 1 MiB of data in BATCH_JOB_EXECUTION_CONTEXT.SERIALIZED_CONTEXT or your job won't be restartable from the middle... People tend to decompose a job into steps for no reason. No need to do that just because the framework has an abstraction for step flow... – gavenkoa Oct 24 '18 at 18:15
  • Yes objects or parameters can be passed to other steps by putting into job execution context. Job execution context is one which is available for all steps. – Akash5288 Dec 06 '18 at 09:06
6

The Java config way:

Step 1 : Configure ExecutionContextPromotionListener

    @Bean
    public ExecutionContextPromotionListener executionContextPromotionListener() {
        ExecutionContextPromotionListener executionContextPromotionListener = new ExecutionContextPromotionListener();
        executionContextPromotionListener.setKeys(new String[] {"MY_KEY"});
        return executionContextPromotionListener;
    }

Step 2 : Configure Step with ExecutionContextPromotionListener

    @Bean
    public Step myStep() {
        return stepBuilderFactory.get("myStep")
                .<POJO, POJO> chunk(1000)
                .reader(reader())
                .processor(Processor())
                .writer(Writer())
                .listener(executionContextPromotionListener())
                .build();
    }

Step 3 : Accessing data in processor

    private String myValue; // illustrative field name

    @BeforeStep
    public void beforeStep(StepExecution stepExecution) {
        ExecutionContext jobExecutionContext = stepExecution.getJobExecution().getExecutionContext();
        this.myValue = jobExecutionContext.getString("MY_KEY");
    }

Step 4 : setting data in processor

    @BeforeStep
    public void beforeStep(StepExecution stepExecution) {
        stepExecution.getJobExecution().getExecutionContext().put("MY_KEY", My_value);
    }
Niraj Sonawane
4

I recommend thinking twice before using ExecutionContext to pass information between steps. Usually it means the job is not designed well. The main idea of Spring Batch is to process huge amounts of data. ExecutionContext is used for storing information about the progress of a Job/Step, to reduce unnecessary work in case of failure. By design, you can't put big data into ExecutionContext. After a step completes, you should have your information in a reliably readable form: a file, a DB, etc. That data can then be used as input by the next steps. For simple jobs I would recommend using only Job Parameters as the information source.
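A minimal, framework-free sketch of that hand-off, assuming the generated SQL statements each fit on one line and that a temporary file is an acceptable intermediate store. The class `FileHandOffSketch` and its method names are hypothetical; in a real job the two methods would live in separate steps and the file location would come from configuration or job parameters.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: one step persists its result, the next reads it back.
class FileHandOffSketch {

    // "Step 1": write the generated statements to a file, one per line.
    static void writeStatements(Path file, List<String> statements) throws IOException {
        Files.write(file, statements, StandardCharsets.UTF_8);
    }

    // "Step 2": read the statements back as its input.
    static List<String> readStatements(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path handOff = Files.createTempFile("sqls", ".txt");
        try {
            writeStatements(handOff, Arrays.asList("SELECT 1", "SELECT 2"));
            System.out.println(readStatements(handOff));
        } finally {
            Files.deleteIfExists(handOff);
        }
    }
}
```

Because the data lives outside the ExecutionContext, its size does not affect the job repository, and the second step can be rerun from the file if it fails.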

In your case, "runGetSQLs" doesn't look like a good candidate for a Step, but if you want, you can implement it as a Spring bean and autowire it into the "runRunSQLs" step (which, again, is a debatable candidate for a Step). Based on your naming, runGetSQLs looks like an ItemReader and runRunSQLs looks like an ItemWriter, so they are parts of one step, not different steps. In that case you don't need to transfer information between steps.

Pavel