Hi

I am a novice in the Spring Batch world. Over the last few days I have watched Michael Minella's YouTube videos, read some documentation, and successfully run some demo projects I found on the internet. I think Spring Batch is a strong candidate for our needs. But here is our story.

I work at a company that developed its own scheduling and batch framework for its business department more than a decade ago. The framework can run DB stored procedures, DB functions and dynamic SQL. Needless to say, it is very challenging to maintain, since many people with varying development skills worked on the code and they no longer work here. Our framework can run jobs and steps sequentially as well as asynchronously (like Spring Batch). We also have a job repository where we store complete job definitions (users create new jobs via a GUI) and job instances with their context (if the server goes down, a job resumes running once the server is back up). My questions are the following:

  1. Can we create new Spring Batch jobs dynamically (either via XML or code) and store them in the JobRepository DB via standard SB interfaces?

  2. Today, during certain time periods, we have up to a hundred job executions running simultaneously. They also reuse a connection pool to the DB. Older Spring Batch reference documentation states that JobFactory will create a fresh ApplicationContext for each job execution. If that is the case in Spring Batch, how can we reuse connection pools?

  3. I know there is support for restarting failed steps, but what if the server/app goes down? Will I be able to restart my app and retrieve the job instance with its context from the JobRepository in order to continue from the failed step?

  4. Can "step1.1" in "job1" depend on "step 2.1" from "job2" having finished within the last hour? Could I use a step listener on "step1.1" to accomplish this?


Kind regards

Toto

BTalker

2 Answers

You have a lot of material here to cover, so let me respond one point at a time:

Can we create new Spring Batch jobs dynamically (either via XML or code) and via standard SB interfaces store them to the JobRepository DB?

Can you generate a job definition dynamically? Yes. We do it in Spring XD for the job orchestration piece (for example, the composed job DSL is used to generate an XML file).

Does Spring Batch provide facilities to do this? No. You'd have to code it yourself.

Also note that you'd have to store the definition in your own table (the schema defined by Spring Batch doesn't have a table for this).
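
What such a self-managed definition looks like is up to you. A minimal sketch, assuming a hypothetical model that a GUI could populate and a custom table could persist: `JobDefinition`, `StepDefinition` and `StepKind` are illustrative names, not Spring Batch types, and the step kinds simply mirror the stored procedures, functions and dynamic SQL mentioned in the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of a user-defined job; none of these are Spring Batch types.
enum StepKind { STORED_PROCEDURE, DB_FUNCTION, DYNAMIC_SQL }

// One step of a job: what to run and how to run it.
final class StepDefinition {
    final String name;
    final StepKind kind;
    final String command; // procedure name, function name, or SQL text

    StepDefinition(String name, StepKind kind, String command) {
        this.name = name;
        this.kind = kind;
        this.command = command;
    }
}

// An ordered list of uniquely named steps under a job name.
final class JobDefinition {
    final String name;
    private final List<StepDefinition> steps = new ArrayList<>();

    JobDefinition(String name) {
        this.name = name;
    }

    // Appends a step; rejects a second step with the same name.
    JobDefinition addStep(String stepName, StepKind kind, String command) {
        for (StepDefinition existing : steps) {
            if (existing.name.equals(stepName)) {
                throw new IllegalArgumentException("Duplicate step name: " + stepName);
            }
        }
        steps.add(new StepDefinition(stepName, kind, command));
        return this;
    }

    // Read-only view of the steps in execution order.
    List<StepDefinition> steps() {
        return Collections.unmodifiableList(steps);
    }
}
```

Under these assumptions, each `StepDefinition` would map to one row in a custom steps table keyed by job name and position, and the code that launches a job would read those rows back and translate them into Spring Batch steps.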

Today, at certain time period, we have up to hundred of job executions simultaneously. They are also reusing a connection pool to the DB. Older Spring Batch ref documentation states JobFactory will create fresh ApplicationContext for each job execution. How can we achieve reusing connection pools if this is the case in Spring Batch.

You can use parent/child context configurations to reuse beans including a DataSource. Define the DataSource in the parent and then the jobs that depend on it in child contexts.

I know there is a support for continuing failed steps but what if the server/app goes down, will I be able to restart my app and retrieve job instance with its context from JobRepository in order to continue from failed step?

This is really an orchestration concern. Spring Batch, by design, does not take the orchestration of jobs into consideration. This allows you to orchestrate them however you want.

The way I'd recommend handling this is via Spring XD or (depending on your timelines) Spring Cloud Data Flow. These tools provide orchestration capabilities, including the redeployment of a job if it goes down. That being said, they won't restart a job that was running when it failed, because that typically requires some form of human decision based on the use case. However, Spring XD currently has (and Spring Cloud Data Flow will have) the capabilities to implement something like this in a pretty straightforward way.

Can a "step1.1" in "job1" be dependent on "step 2.1" from "job2" finishing within last hour? In such scenarios I may be using a step listener on "step1.1" to accomplish this?

In cases like this, I'd start to question how your job is configured. If the dependency still makes sense after that, you can use a JobExecutionDecider to decide whether a step should be executed.
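
The freshness rule from the question ("finished within the last hour") is plain time arithmetic, so it can live outside Spring Batch and be called from a decider or listener. A minimal sketch, assuming the end time of the most recent successful "step 2.1" execution has already been looked up elsewhere; the class and method names are hypothetical and not part of Spring Batch.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of Spring Batch: checks whether an upstream
// step finished recently enough for a dependent step to run.
final class UpstreamFreshness {

    private UpstreamFreshness() {
    }

    // Returns true when upstreamEnd is known, not in the future relative to now,
    // and no older than maxAge (the boundary itself counts as fresh).
    static boolean finishedWithin(Instant upstreamEnd, Instant now, Duration maxAge) {
        if (upstreamEnd == null) {
            return false; // the upstream step has never completed
        }
        if (upstreamEnd.isAfter(now)) {
            return false; // clock skew or bad data: treat the dependency as unmet
        }
        return !upstreamEnd.isBefore(now.minus(maxAge));
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2016-02-18T21:00:00Z");
        Duration oneHour = Duration.ofHours(1);
        // Finished 40 minutes ago: inside the window.
        System.out.println(finishedWithin(now.minus(Duration.ofMinutes(40)), now, oneHour));
        // Finished two hours ago: outside the window.
        System.out.println(finishedWithin(now.minus(Duration.ofHours(2)), now, oneHour));
    }
}
```

Passing `now` as a parameter rather than reading the clock inside the method keeps the rule easy to unit test.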

All things considered, while you can accomplish most of what you're looking for with Spring Batch, using something like Spring XD or Spring Cloud Data Flow will make your life a lot easier.

Michael Minella
  • Thx for the answers and tips. Just a quick follow-up question: is it OK, performance-wise, to have a hundred job executions running simultaneously, i.e. that many ApplicationContexts (I have never worked with more than one at a time)? – BTalker Feb 18 '16 at 20:55
  • In a single JVM or multiple? – Michael Minella Feb 18 '16 at 21:08
  • Single JVM. One more thing I wanted to ask. Does Spring Batch execute jobs async by default? I see it does when I use org.springframework.core.task.SimpleAsyncTaskExecutor, ex: ` ` – BTalker Feb 18 '16 at 21:19
  • Single JVM...it would depend on many things. I'd say try and tune accordingly. As for async job execution, by default job execution is synchronous but as you point out, you can configure it to be async if you wanted. – Michael Minella Feb 18 '16 at 23:20
  • Thx a lot Michael. Btw, good job on your youtube videos and your book "Pro Spring Batch". – BTalker Feb 19 '16 at 08:18

Can we create new Spring Batch jobs dynamically (either via XML or code) and via standard SB interfaces store them to the JobRepository DB?

It is easy to use StepBuilderFactory, FlowBuilder etc. to programmatically build the Spring Batch artifacts. You'll probably want to back those artifacts with Spring beans (to get nice facilities like the step/job Spring scopes, injection and so on), and for that you can use prototype, execution scoped and job scoped beans, or even use facilities such as BeanDefinitionBuilder to dynamically create beans.

Older Spring Batch ref documentation states JobFactory will create fresh ApplicationContext for each job execution. How can we achieve reusing connection pools if this is the case in Spring Batch.

The GenericApplicationContextFactory creates a child application context. You can have the "global" beans in the parent application context.

I know there is a support for continuing failed steps but what if the server/app goes down, will I be able to restart my app and retrieve job instance with its context from JobRepository in order to continue from failed step?

Yes, but not that easily.

Can a "step1.1" in "job1" be dependent on "step 2.1" from "job2" finishing within last hour? In such scenarios I may be using a step listener on "step1.1" to accomplish this?

A JobExecutionDecider will likely be the best option there.

Artefacto
  • You mention it is easy to create jobs dynamically by using FlowBuilder and StepBuilderFactory. Can you maybe provide a code sample? What about the second part, storing job definitions in the JobRepository (I cannot see any table containing those)? – BTalker Feb 18 '16 at 12:09
  • Job definitions are not stored in the database. As for the usage of `StepBuilderFactory`, `FlowBuilder` look at the documentation. For a very basic idea, look [here](https://gist.github.com/cataphract/fe64ddfde81f58fe9784). – Artefacto Feb 18 '16 at 12:24
  • When using StepBuilderFactory/FlowBuilder I need to generate a job definition that can be serialized to a DB. Is that achievable? Btw, in the table `batch_job_execution` there is a column with NULL values called `JOB_CONFIGURATION_LOCATION`. By the sound of the name it seems to be perhaps a path for the job definition (XML)? – BTalker Feb 18 '16 at 12:28
  • @TotoKotunjo Looking at the source code, it seems to be used only for job restarts in JSR-352 jobs. Yes, it's a path to an XML definition. The column will be blank in your case. – Artefacto Feb 18 '16 at 12:36
  • Hmm, so JSR-352 supports storing paths to job definitions but SB implementation doesn't? Just to repeat my other question regarding StepBuilderFactory/FlowBuilder, will they be able to generate say job definition as XML (as specified by spring-batch.xsd) so that it can be stored in a custom table and later on executed by SB? – BTalker Feb 18 '16 at 12:50
  • @TotoKotunjo If you just generate a definition using the Java builders, you won't have an XML file anyway. And no, you can't generate an XML job definition from the in-memory representation of the job. But why would you want to do that? From what I gather, you want to use your current job repository and generate Spring Batch `Job`s programmatically from that. There is no point going through an intermediate representation (Spring XML files that also use the definitions in `spring-batch-X.xsd`). – Artefacto Feb 18 '16 at 14:46
  • I understand your point, but I would like to store the SB job definition in a custom table after a user generates it in a GUI. The job definitions we currently create using our own framework will be deprecated, i.e. we plan to go all the way with SB. – BTalker Feb 18 '16 at 15:56
  • thx a lot on your replies, seems like you have deep knowledge regarding Spring Batch. When I gather enough points to vote up on comments, I will be back on this page and do that. – BTalker Feb 19 '16 at 08:21