I am trying to write a data-processing unit in Kubernetes.
Every processing unit has a quite similar workflow:
- Puller pulls data from object storage into an `/input` volume mounted in the container
- Processor runs the code that processes the data in the volume and writes the results to an `/output` volume
- Pusher pushes the data in the `/output` volume back to object storage
So every pod or job must have a data-puller container and a data-pusher container that communicate through a shared volume, as mentioned here. But how can I make the containers run in a pull -> process -> push sequence?
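For reference, the shared-volume layout I have in mind looks roughly like this (a sketch only; the image names and the use of `emptyDir` volumes are assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: data-process-unit
spec:
  restartPolicy: Never
  volumes:
    - name: input          # shared between puller and processor
      emptyDir: {}
    - name: output         # shared between processor and pusher
      emptyDir: {}
  containers:
    - name: puller
      image: my-puller:latest        # assumed image name
      volumeMounts:
        - name: input
          mountPath: /input
    - name: processor
      image: my-processor:latest     # assumed image name
      volumeMounts:
        - name: input
          mountPath: /input
        - name: output
          mountPath: /output
    - name: pusher
      image: my-pusher:latest        # assumed image name
      volumeMounts:
        - name: output
          mountPath: /output
```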
Right now I can make it work by communicating through the shared volume: the puller starts first, the data processor waits until it finds a `pull-finished.txt`, and the pusher starts once it finds a `process-finished.txt`. But this forces the data-processing container to be built FROM a specific image, or to use a specific entrypoint, which is not what I want. Is there a more elegant way to make this work?
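The sentinel-file handshake I am using looks roughly like this (a sketch; the `pull-finished.txt` / `process-finished.txt` names are from my setup, and the demo directories below stand in for the `/input` and `/output` volumes):

```shell
#!/bin/sh
# wait_for blocks until the given file appears, polling once per second.
wait_for() {
  until [ -f "$1" ]; do sleep 1; done
}

IN=$(mktemp -d)    # stands in for the /input volume
OUT=$(mktemp -d)   # stands in for the /output volume

# Simulated puller: signals completion after a short delay.
( sleep 1; touch "$IN/pull-finished.txt" ) &

wait_for "$IN/pull-finished.txt"    # processor waits for the puller
echo "processing data"              # the real processing command goes here
touch "$OUT/process-finished.txt"   # signal the pusher to start
```

The problem is that this loop has to be baked into each container's entrypoint, which is exactly the coupling I want to avoid.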