We are currently using Spring Batch remote chunking to scale our batch processing. We are considering Spring Cloud Data Flow, but would like to know whether workers can be provisioned dynamically based on load. We are deployed on Google Cloud, so if Cloud Data Flow fits our needs, we would also like to consider Spring Cloud Data Flow's support for Kubernetes.
1 Answer
When using the batch extensions of Spring Cloud Task (specifically the `DeployerPartitionHandler`), workers are launched dynamically as needed. That `PartitionHandler` allows you to configure a maximum number of workers; it then processes each partition on an independent worker, up to that maximum (processing the remaining partitions as others finish up). The "dynamic" aspect is really controlled by the number of partitions returned by the `Partitioner`: the more partitions returned, the more workers launched.
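A minimal sketch of that wiring, assuming Spring Cloud Task's partition support, a worker artifact resolvable as a Maven resource, and a worker step named `workerStep`. The Maven coordinates, grid size, and step name here are purely illustrative, and the `DeployerPartitionHandler` constructor signature has varied across Spring Cloud Task versions, so check the one on your classpath:

```java
import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.batch.core.partition.PartitionHandler;
import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.deployer.spi.task.TaskLauncher;
import org.springframework.cloud.task.batch.partition.DeployerPartitionHandler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;
import org.springframework.core.io.ResourceLoader;

@Configuration
public class PartitionConfiguration {

    @Autowired
    private ResourceLoader resourceLoader;

    // Each entry in the returned map becomes one partition, and therefore
    // one worker launch request (up to maxWorkers running at a time).
    @Bean
    public Partitioner partitioner() {
        return gridSize -> {
            Map<String, ExecutionContext> partitions = new HashMap<>(gridSize);
            for (int i = 0; i < gridSize; i++) {
                ExecutionContext context = new ExecutionContext();
                context.putInt("partitionNumber", i);
                partitions.put("partition" + i, context);
            }
            return partitions;
        };
    }

    @Bean
    public PartitionHandler partitionHandler(TaskLauncher taskLauncher,
                                             JobExplorer jobExplorer) {
        // The worker is the same Boot über-jar, launched with properties that
        // tell it which step and which partition's ExecutionContext to run.
        // These Maven coordinates are hypothetical.
        Resource workerResource = resourceLoader.getResource(
                "maven://io.spring.cloud:partitioned-batch-job:1.0.0");

        DeployerPartitionHandler handler = new DeployerPartitionHandler(
                taskLauncher, jobExplorer, workerResource, "workerStep");
        handler.setMaxWorkers(4);  // at most 4 workers running concurrently
        handler.setGridSize(10);   // ask the Partitioner for 10 partitions
        return handler;
    }
}
```

With this wiring, raising the grid size is what scales the job out: ten partitions with four max workers means four workers run at once and the remaining six partitions queue until a worker frees up.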
You can see a simple example configured for Cloud Foundry in this repo: https://github.com/mminella/S3JDBC. The main difference from what you'd need is that you'd swap out the `CloudFoundryTaskLauncher` for a `KubernetesTaskLauncher` and its appropriate configuration.
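A rough sketch of what that swap could look like, assuming the `spring-cloud-deployer-kubernetes` dependency and the fabric8 Kubernetes client on the classpath (the `KubernetesTaskLauncher` constructor has also changed across deployer versions, so treat this as a starting point):

```java
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;
import org.springframework.cloud.deployer.spi.kubernetes.KubernetesDeployerProperties;
import org.springframework.cloud.deployer.spi.kubernetes.KubernetesTaskLauncher;
import org.springframework.cloud.deployer.spi.task.TaskLauncher;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class KubernetesLauncherConfiguration {

    // Deployer properties (namespace, image pull policy, resource limits, etc.)
    // can be bound from spring.cloud.deployer.kubernetes.* configuration.
    @Bean
    public KubernetesDeployerProperties deployerProperties() {
        return new KubernetesDeployerProperties();
    }

    // The fabric8 client picks up cluster credentials from the pod's service
    // account, or from the local kubeconfig when running outside the cluster.
    @Bean
    public KubernetesClient kubernetesClient() {
        return new DefaultKubernetesClient();
    }

    @Bean
    public TaskLauncher taskLauncher(KubernetesDeployerProperties properties,
                                     KubernetesClient client) {
        return new KubernetesTaskLauncher(properties, client);
    }
}
```

With this `TaskLauncher` bean in place, the `DeployerPartitionHandler` sketched above would launch each partition as a pod on the cluster instead of as a Cloud Foundry task.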

- Thanks Michael, that helps. Currently we use `MessageChannelPartitionHandler` since we send data via the messaging middleware ActiveMQ. If that's the case, should we use multiple partition handlers, i.e. one for sending messages and another for provisioning worker nodes? – Raghavan Narasimhan Feb 08 '17 at 09:23
- No. The `DeployerPartitionHandler` is responsible for launching the workers as well as providing the metadata. It uses Boot properties to pass the values that would otherwise be sent over ActiveMQ, so there is no need for messaging middleware when using this approach. – Michael Minella Feb 08 '17 at 15:57