
We have a requirement to move data from one database to another and are exploring Spring Batch for this. A user of our application selects the source and target datasources along with the list of tables for which the data needs to be moved.

We need help with the following:

  1. The information necessary to build a job arrives at runtime from our web application, including the datasource details and the list of table names. We would like to create a new job by sending these details to a job builder module and launch it using JobLauncher. How do we write this job builder module?
  2. We may have multiple users raising data movement requests in parallel, so we need a way to create multiple jobs and run them in a suitable order.

We have used Java-based configuration to create a job and launch it from a web container. The configuration is as follows:

@Bean
public Job loadDataJob(JobCompletionNotificationListener listener) {
    RunIdIncrementer inc = new RunIdIncrementer();
    inc.setKey(new Date().toString());
    JobBuilder builder = jobBuilderFactory.get("loadDataJob")
            .incrementer(inc)
            .listener(listener);
    SimpleJobBuilder simpleBuilder = builder.start(preExecute());
    for(String s : getTables()){
        simpleBuilder.next(etlTable(s));
    }
    simpleBuilder.next(postExecute());
    return simpleBuilder.build();
}

@Bean
@Scope("prototype")
public Step etlTable(String tableName) {
    return stepBuilderFactory.get(tableName)
            .<Map<String,Object>, Map<String,Object>> chunk(1000)
            .reader(dbDataReader(tableName))
            .processor(processor())
            .writer(dbDataWriter(tableName))
            .build();
}

Currently we have hardcoded the source and target datasource details into the respective beans. getTables() returns a hardcoded list of tables for which the data needs to be moved.

The RestController that launches the job:

@RestController
public class MyController {
    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    Job job;

    @RequestMapping("/launchjob")
    public String handle() throws Exception {
        try {
            JobParameters jobParameters = new JobParametersBuilder().addLong("time", new Date().getTime()).toJobParameters();
            jobLauncher.run(job, jobParameters);
        } catch (Exception e) {
            // at minimum, log the failure instead of swallowing it
        }

        return "Done";
    }
}
ksh

2 Answers


Concerning your first question: you definitely have to use Java configuration. Moreover, you shouldn't define your steps as Spring beans if you want to create a job with a dynamic number of steps (for instance, a step per table you have to copy).
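To make this concrete, here is a minimal sketch of building one step per table as plain objects instead of `@Bean` methods. It assumes the `jobBuilderFactory` and `stepBuilderFactory` fields are injected as in your existing configuration; `createReader` and `createWriter` are hypothetical factory methods standing in for your reader/writer construction:

```java
// Sketch: build one Step per table without declaring the steps as Spring beans.
// The Step and Job instances are plain objects created per request.
public Job buildLoadDataJob(List<String> tables) {
    SimpleJobBuilder builder = jobBuilderFactory
            .get("loadDataJob-" + System.currentTimeMillis()) // unique job name per request
            .start(buildTableStep(tables.get(0)));
    for (String table : tables.subList(1, tables.size())) {
        builder = builder.next(buildTableStep(table));
    }
    return builder.build();
}

private Step buildTableStep(String tableName) {
    return stepBuilderFactory.get("etl-" + tableName)
            .<Map<String, Object>, Map<String, Object>>chunk(1000)
            .reader(createReader(tableName))  // plain factory methods, not @Bean methods
            .writer(createWriter(tableName))
            .build();
}
```

Because nothing here is a bean, each request can produce a job with a different number of steps without fighting the Spring container.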

I've written a couple of answers to questions about how to create jobs dynamically. Have a look at them; they might be helpful.

Edit: Some remarks concerning your second question:

First, you are using a normal JobLauncher, and I assume you instantiate a SimpleJobLauncher. This means you can provide a job together with jobparameters, as you have shown in your code above. However, the provided job does not have to be a Spring bean instance, so you don't have to autowire it; instead, you can use create methods as I suggested in the answers to the questions mentioned above.
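On the "multiple users in parallel" part of the question: a common approach (a sketch, not something your current code already does) is to back the SimpleJobLauncher with a thread pool, so the REST call returns immediately and several jobs can run concurrently. The pool size of 4 below is an arbitrary placeholder:

```java
// Sketch: a JobLauncher backed by a thread pool so several user requests
// can run concurrently and the REST endpoint does not block until completion.
@Bean
public JobLauncher asyncJobLauncher(JobRepository jobRepository) throws Exception {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(4); // how many migration jobs may run in parallel
    executor.initialize();

    SimpleJobLauncher launcher = new SimpleJobLauncher();
    launcher.setJobRepository(jobRepository);
    launcher.setTaskExecutor(executor); // launch asynchronously instead of in the caller's thread
    launcher.afterPropertiesSet();
    return launcher;
}
```

With an asynchronous launcher, `jobLauncher.run(...)` returns a JobExecution whose status can be polled later rather than a completed result.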

Second, if you create your Job instance dynamically for every request, there is no need to pass the whole configuration as jobparameters, since you can pass the configuration properties, such as the datasource and the tables to be copied, directly as parameters to your createJob method. You could even create your DataSource instances on the fly if you don't know all possible datasources in advance.
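Creating a DataSource on the fly could look like the following sketch, assuming the connection details arrive with the user's request. DriverManagerDataSource opens a new connection per call; for real workloads a pooled implementation would be preferable:

```java
// Sketch: build a DataSource from user-supplied connection details at request
// time, then hand it straight to the job-creation method.
public DataSource createDataSource(String url, String username, String password) {
    DriverManagerDataSource ds = new DriverManagerDataSource();
    ds.setUrl(url);
    ds.setUsername(username);
    ds.setPassword(password);
    return ds;
}

// hypothetical call site: the request carries everything the job needs
// Job job = jobCreator.createJob(createDataSource(srcUrl, u, p),
//                                createDataSource(tgtUrl, u, p),
//                                request.getTables());
```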

Third, I would consider every request a single run, which cannot be restarted. Hence, I'd just put some meta information into the jobparameters, like the user, date/time, datasource names (URLs), and the list of tables to be copied. I would use this information only as a kind of logging/auditing of which requests were issued, but I wouldn't use the jobparameter instances as control parameters inside the job itself. (Again, you can pass the values of these parameters at construction time of the job and steps to your create methods, so the structure of your job is created according to your parameters; hence, at runtime, when you could access your jobparameters, there is nothing left to do based on them.)
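A sketch of such audit-only jobparameters (the variable names `userName`, `sourceUrl`, etc. are placeholders for values taken from the incoming request):

```java
// Sketch: jobparameters used purely as audit metadata. The job's structure was
// already fixed when createJob(...) assembled the steps; nothing inside the job
// reads these values.
JobParameters params = new JobParametersBuilder()
        .addString("user", userName)
        .addString("sourceDb", sourceUrl)
        .addString("targetDb", targetUrl)
        .addString("tables", String.join(",", tables))
        .addLong("requestTime", System.currentTimeMillis()) // makes every launch unique
        .toJobParameters();
```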

Finally, if a request fails (meaning the job exits with an error), a new request simply has to be submitted in order to retry. This would be a completely new request and not a restart of an already executed job launch (since I would add the request time to my jobparameters, every launch would be a unique launch).

Edit 2: Not creating the Job as a bean doesn't mean not using autowiring at all. Here is an example of how I would structure my beans.

@Component
@EnableBatchProcessing
@Import() // list of imports as needed
public class JobCreatorComponent {

  @Autowired
  private StepBuilderFactory stepBuilder;

  @Autowired
  private JobBuilderFactory jobBuilder;

  public Job createJob(all the parameters you need) {
     return jobBuilder.get(). ....
  }
}

@RestController
@Import(JobCreatorComponent.class)
public class MyController {
    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    JobCreatorComponent jobCreator;

    @RequestMapping("/launchjob")
    public String handle() throws Exception {
        try {
            Job job = jobCreator.createJob(... params ...);
            JobParameters jobParameters = new JobParametersBuilder().addLong("time", new Date().getTime()).toJobParameters();
            jobLauncher.run(job, jobParameters);
        } catch (Exception e) {
            // at minimum, log the failure instead of swallowing it
        }

        return "Done";
    }
}
Hansjoerg Wingeier
  • Some of the posts in the links you shared have helped us resolve other problems. However, the main concern still remains: how do we use Spring Batch as a service that can create and run jobs as and when a user requests data movement? Sending jobparameters while launching the job seems to have constraints on sending complex objects (datasource details + data movement request details in our case). – ksh Jun 27 '17 at 04:15
  • As per your suggestion, we are no longer creating jobs as Spring beans, and it works well for our use case. Note that with no autowiring and no annotations (like @EnableBatchProcessing) used, a few manual configurations were needed for the JobRepository and the Job/Step builder factories. The only concern now is whether we are losing out on anything because of this change. Will update here once done analyzing this. – ksh Jun 29 '17 at 05:45
  • I'm not sure if I understood your remark correctly, but I did not mean not to use autowiring at all. You can still use @EnableBatchProcessing and the StepBuilderFactory and JobBuilderFactory. Please have a look at the example I added to my answer. – Hansjoerg Wingeier Jun 29 '17 at 09:45

By using @JobScope on the item reader, there is no need to do things manually at run time: just annotate your reader with @JobScope, and on each interaction with the controller you will get fresh record processing.

This is the kind of on-demand job you can execute for goals like a DB migration or producing a specific report.
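A minimal sketch of what this answer describes: a job-scoped reader bean, so Spring creates a fresh instance per job execution and late-binds values from that execution's jobparameters. The bean name, the `tableName` parameter, and the `sourceDataSource` dependency are illustrative placeholders:

```java
// Sketch: a @JobScope reader. A new instance is created for every job
// execution, so each request processes fresh records; the table name is
// late-bound from the jobparameters of that particular execution.
@Bean
@JobScope
public JdbcCursorItemReader<Map<String, Object>> tableReader(
        @Value("#{jobParameters['tableName']}") String tableName,
        DataSource sourceDataSource) {
    JdbcCursorItemReader<Map<String, Object>> reader = new JdbcCursorItemReader<>();
    reader.setDataSource(sourceDataSource);
    reader.setSql("SELECT * FROM " + tableName);
    reader.setRowMapper(new ColumnMapRowMapper()); // each row as a column-name -> value map
    return reader;
}
```

Note the trade-off versus the accepted answer: @JobScope keeps everything bean-based but fixes the job structure at configuration time, whereas the create-method approach lets each request build a different number of steps.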