My project has 20+ batch jobs that are built with Spring Batch and have been in Production for a couple of years. We are currently migrating them to individual Spring Boot applications built with Spring Batch and Spring Cloud Task, which will then be registered as Tasks in Spring Cloud Data Flow and deployed to PCF.
Given that these jobs (which used only Spring Batch) were already in Production, the Batch repository tables contain years of data from their past executions. When we deploy the newly migrated jobs (which introduce the Task tables), the data in the Batch and Task tables will not match, since the Task tables will be newly created and therefore empty. Although this doesn't prevent us from running new job executions, it does prevent us from using the "Jobs" tab in Spring Cloud Data Flow: in order to load the page, it queries the TASK_TASK_BATCH table, trying to match every job_execution_id with a task_execution_id. When no such record exists for a given job_execution_id, this throws the infamous NullPointerException mentioned in other posts (Dataflow Tasks are not working with Spring Batch).
So my question is: what is the proper way to address this discrepancy for any team that has already been using Spring Batch and is migrating the same jobs to also use Spring Cloud Task? Does Spring provide any process for this? Ideally we want to keep the historical batch job execution data in the Batch repository tables; we don't want to delete it. Would we then have to make up 'matching' dummy data in the Task tables to get rid of the discrepancy?
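In case it clarifies what we mean by "dummy data": below is the kind of one-off backfill we imagine, which inserts a placeholder TASK_EXECUTION row per orphaned job execution and links it in TASK_TASK_BATCH. This is only a sketch, not anything official from Spring; it assumes the default table names, the TASK_SEQ sequence for new execution ids, and Oracle-style SQL (NEXTVAL FROM DUAL, || concatenation), all of which would need adjusting for the actual database. The "backfilled-" task name prefix is our own invention to mark synthetic rows:

    import java.util.List;
    import javax.sql.DataSource;
    import org.springframework.jdbc.core.JdbcTemplate;

    public class TaskTableBackfill {

        private final JdbcTemplate jdbc;

        public TaskTableBackfill(DataSource dataSource) {
            this.jdbc = new JdbcTemplate(dataSource);
        }

        public void backfill() {
            // Historical job executions that have no task link yet.
            List<Long> orphanedIds = jdbc.queryForList(
                "SELECT e.JOB_EXECUTION_ID FROM BATCH_JOB_EXECUTION e " +
                "WHERE NOT EXISTS (SELECT 1 FROM TASK_TASK_BATCH b " +
                "  WHERE b.JOB_EXECUTION_ID = e.JOB_EXECUTION_ID)",
                Long.class);

            for (Long jobExecutionId : orphanedIds) {
                // Next task execution id (Oracle-style sequence call;
                // on MySQL, TASK_SEQ is a table and this differs).
                Long taskExecutionId = jdbc.queryForObject(
                    "SELECT TASK_SEQ.NEXTVAL FROM DUAL", Long.class);

                // Placeholder task execution mirroring the batch run's
                // timestamps and job name.
                jdbc.update(
                    "INSERT INTO TASK_EXECUTION " +
                    "(TASK_EXECUTION_ID, START_TIME, END_TIME, TASK_NAME, " +
                    " EXIT_CODE, LAST_UPDATED) " +
                    "SELECT ?, e.START_TIME, e.END_TIME, " +
                    "  'backfilled-' || i.JOB_NAME, 0, CURRENT_TIMESTAMP " +
                    "FROM BATCH_JOB_EXECUTION e " +
                    "JOIN BATCH_JOB_INSTANCE i " +
                    "  ON i.JOB_INSTANCE_ID = e.JOB_INSTANCE_ID " +
                    "WHERE e.JOB_EXECUTION_ID = ?",
                    taskExecutionId, jobExecutionId);

                // Link it so the SCDF "Jobs" tab can resolve the execution.
                jdbc.update(
                    "INSERT INTO TASK_TASK_BATCH " +
                    "(TASK_EXECUTION_ID, JOB_EXECUTION_ID) VALUES (?, ?)",
                    taskExecutionId, jobExecutionId);
            }
        }
    }

Is something along these lines reasonable, or is there a supported approach we're missing?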
Thank you.