
We would like to migrate the scheduling and sequence control of some Kettle import jobs from a proprietary implementation to a good-practice implementation based on Spring Batch.

I intend to use Spring Cloud Data Flow (SCDF) server to implement and run a configurable sequence of the existing external import jobs.

The SCDF console's Task editor UI seems promising for assembling a flow. So one Task would wrap one Spring Batch job, which in a single step executes only a Tasklet that starts a Kettle job via the Carte REST API and polls it for completion. Does this make sense so far?
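The Tasklet I have in mind would look roughly like the sketch below. This is illustration only: the Carte endpoint and parameter names depend on the PDI version and on whether the job comes from a repository or a file, the values behind carteUrl / jobName / entityParams are placeholders, and the status handling is simplified (real code would parse the WebResult/jobStatus XML and track the Carte object id).

```java
import java.util.Map;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.web.client.RestTemplate;
import org.springframework.web.util.UriComponentsBuilder;

public class CarteJobTasklet implements Tasklet {

    private final RestTemplate rest = new RestTemplate();
    private final String carteUrl;                  // e.g. "http://carte-host:8081" (placeholder)
    private final String jobName;                   // the single Kettle job implementation
    private final Map<String, String> entityParams; // per-entity parameters, e.g. ENTITY=Department

    public CarteJobTasklet(String carteUrl, String jobName, Map<String, String> entityParams) {
        this.carteUrl = carteUrl;
        this.jobName = jobName;
        this.entityParams = entityParams;
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // 1. Trigger the Kettle job on the Carte server (executeJob/ vs. runJob/
        //    depends on whether the job lives in a repository or a file).
        UriComponentsBuilder start = UriComponentsBuilder
                .fromHttpUrl(carteUrl + "/kettle/executeJob/")
                .queryParam("job", jobName);
        entityParams.forEach((name, value) -> start.queryParam(name, value));
        rest.getForObject(start.toUriString(), String.class);

        // 2. Poll the job status until it finishes. A real implementation would
        //    parse the jobStatus XML instead of matching on strings.
        String statusUrl = carteUrl + "/kettle/jobStatus/?name=" + jobName + "&xml=Y";
        while (true) {
            String statusXml = rest.getForObject(statusUrl, String.class);
            if (statusXml.contains("Finished")) {
                return RepeatStatus.FINISHED; // step succeeds, flow continues
            }
            if (statusXml.contains("Stopped") || statusXml.contains("error")) {
                // failing the step keeps the surrounding batch job restartable from here
                throw new IllegalStateException("Kettle job did not finish: " + jobName);
            }
            Thread.sleep(5_000);
        }
    }
}
```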

Would you suggest a better implementation?

Constraints and Requirements:

  • The external Kettle jobs are triggered and polled via the Carte REST API. Actually, it's one single Kettle job implementation, called with individual parameters for each entity to be imported.
  • There is a configurable, directed graph of import jobs for several entities, some of them being dependent on a correct import of the previous entity type. (e.g. Department, then Employee, then Role assignments...)
  • With the upcoming implementation, we would like to get
    • monitoring and controlling (start, abort, pause, resume)
    • restartability
    • easy reconfigurability of the sequence in production (possibly by GUI, or external editor)
    • possibly some reporting and statistics.

As far as I currently understand, this could be achieved by using a Spring Cloud Data Flow (SCDF) server and some combination of Task / Batch implementations.

Correct me if I'm wrong, but a single Spring Batch job with its hardwired flow does not seem very suitable to me. Or is there an easy way to edit and redeploy a Spring Batch job with a changed flow in production? I couldn't find anything, not even an easy-to-use editor for the XML representation of a batch job.

leo
  • Related to @MichaelMinella's post https://stackoverflow.com/questions/44386991/spring-batch-flow-job-vs-spring-composed-task/44392691#44392691, though I would like to ask for more concrete implementation advice with this more specific question. Thanks. – leo Dec 11 '19 at 14:20

1 Answer


Yes, I believe you can achieve your design goals using Spring Cloud Data Flow along with Spring Cloud Task / Spring Batch.

The flow of multiple Spring Batch jobs can be managed as a Composed Task using Spring Cloud Data Flow, as you pointed out from the other SO thread.

The external Kettle jobs are triggered and polled via the Carte REST API. Actually, it's one single Kettle job implementation, called with individual parameters for each entity to be imported.

There is a configurable, directed graph of import jobs for several entities, some of them being dependent on a correct import of the previous entity type. (e.g. Department, then Employee, then Role assignments...)

Again, both of the above can be managed as a Composed Task (with the composed task consisting of regular tasks as well as Spring Batch-based applications).
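For illustration, assuming your wrapping task/batch applications are registered under the hypothetical names import-departments, import-employees and import-roles, the dependency chain from your question could be expressed as a composed task roughly like this from the SCDF shell (&& means the next task is only launched when the previous one completed successfully; the same graph can also be assembled visually in the SCDF UI flow editor):

```
dataflow:> task create entity-import-flow --definition "import-departments && import-employees && import-roles"
dataflow:> task launch entity-import-flow
```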

You can manage the parameters passed to each task/batch upon invocation via batch job parameters, task/batch application properties, or simply command-line arguments.
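For example, sticking with the hypothetical task names above, a per-entity parameter could be passed as a command-line argument when launching one of the wrapping tasks directly; it then arrives in the task application as a program argument / batch job parameter. (For addressing a child task inside a composed task, the property prefix format is described in the SCDF reference documentation.)

```
dataflow:> task launch import-departments --arguments "entity=Department"
```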

With the upcoming implementation, we would like to get
  • monitoring and controlling (start, abort, pause, resume)
  • restartability
  • easy reconfigurability of the sequence in production (possibly by GUI, or external editor)
  • possibly some reporting and statistics.

Spring Cloud Data Flow helps you achieve these goals. You can visit the Task Developer Guide and the Task Monitoring Guide for more information.

You can also check the Batch Developer Guide on the same site.

Ilayaperumal Gopinathan