
I'm wondering about a distributed batch job I need to run. Is there a way in Kubernetes, if I use a Job/StatefulSet or whatever, for the pod itself (via an env var or whatever) to know it's 1 of X pods run for this job?

I need to chunk up some data and have each process fetch the stuff it needs.

--

I guess the StatefulSet hostname is one way of doing it. Is there a better option?
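To make it concrete, here's roughly what I have in mind with the StatefulSet approach (just a sketch; the names, image, and replica count are placeholders, and the trailing ordinal in each pod's name would be the chunk index):

    # Sketch only: each pod in a StatefulSet gets a stable name ending in an
    # ordinal (worker-0, worker-1, ...), so the trailing number can act as
    # the chunk index.
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: worker
    spec:
      serviceName: worker          # assumes a matching headless Service exists
      replicas: 4
      selector:
        matchLabels:
          app: worker
      template:
        metadata:
          labels:
            app: worker
        spec:
          containers:
          - name: worker
            image: my-batch-image  # placeholder image
            env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            # The entrypoint strips the ordinal off the pod name and uses it
            # as the chunk index, e.g. "2" for worker-2.
            command: ["/bin/sh", "-c", "exec /app/process --chunk-index ${POD_NAME##*-} --total-chunks 4"]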

Tom Barber

2 Answers


This is planned but not yet implemented, as far as I know. For now, you probably want to look into higher-level layers like Argo Workflows or Airflow instead.
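For example, Argo Workflows can fan a step out over a sequence and hand each pod its index as a parameter. A rough sketch (the image, names, and the count of four chunks are placeholders, not anything from the question):

    # Rough sketch only: withSequence fans the worker step out over indices
    # 0..3 and passes each pod its index as a parameter.
    apiVersion: argoproj.io/v1alpha1
    kind: Workflow
    metadata:
      generateName: chunked-batch-
    spec:
      entrypoint: fan-out
      templates:
      - name: fan-out
        steps:
        - - name: process-chunk
            template: worker
            arguments:
              parameters:
              - name: chunk
                value: "{{item}}"
            withSequence:
              count: "4"
      - name: worker
        inputs:
          parameters:
          - name: chunk
        container:
          image: my-batch-image   # placeholder image
          command: [/app/process]
          args: ["--chunk-index", "{{inputs.parameters.chunk}}"]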

coderanger
  • Got pointed to Argo earlier today for something completely different. Thanks, I'll check it out. – Tom Barber Apr 01 '20 at 19:11
  • Thanks for sharing, this is good to know. Do you have a link to a feature request or other documents about this planned feature? – TJ Zimmerman Apr 01 '20 at 19:17
  • It's mentioned in https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/#parallel-jobs and still chattered about every now and then by sig-api-machinery. – coderanger Apr 01 '20 at 20:02

You could write some infrastructure as code using Ansible that performs the following tasks in order (a playbook sketch follows below):

  1. kubectl create -f jobs.yml
  2. kubectl wait --for=condition=complete job/job1
  3. kubectl wait --for=condition=complete job/job2
  4. kubectl wait --for=condition=complete job/job3
  5. kubectl create -f pod.yml

kubectl wait can be used in situations like this to halt progress until a condition is met; in this case, until each job has completed its run.
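A minimal sketch of such a playbook, assuming kubectl is already configured on the control host and reusing the placeholder manifest and job names from the list above (the 600s timeout is an arbitrary choice):

    # Sketch only: runs the kubectl steps above in order, blocking on each
    # job's completion before creating the follow-up pod.
    - hosts: localhost
      gather_facts: false
      tasks:
        - name: Create the chunked jobs
          ansible.builtin.command: kubectl create -f jobs.yml

        - name: Wait for each job to complete
          ansible.builtin.command: >
            kubectl wait --for=condition=complete --timeout=600s job/{{ item }}
          loop:
            - job1
            - job2
            - job3
          changed_when: false

        - name: Create the follow-up pod
          ansible.builtin.command: kubectl create -f pod.yml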

Here is a similar question that someone asked on Stack Overflow before.

TJ Zimmerman