TL;DR
When a GCP Cloud Build build submits multiple other builds, is there a command or best practice for handling the failure of one sub-build, so that the other builds are cancelled if still in progress or rolled back if already complete?
Full Question
Consider the following directory layout for a monorepo of multiple Google App Engine services:
repo-dir
├── cloudbuild-main.yaml
├── deploy-services.sh
├── service-diff.txt
├── service1
│   ├── cloudbuild.yaml
│   ├── app.yaml
│   └── ...
├── service2
│   ├── cloudbuild.yaml
│   ├── app.yaml
│   └── ...
└── ...
cloudbuild-main.yaml:
steps:
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: /bin/bash
  args: [ "./deploy-services.sh" ]
deploy-services.sh:
while read -r service
do
  config="${service}/cloudbuild.yaml"
  if [[ ! -f "${config}" ]]; then
    echo "no such file: ${config}"
    continue
  fi
  # Submit each service's build as a background job
  gcloud builds submit "${service}" --config="${config}" &
done < ./service-diff.txt
# Block until every background build has exited
wait
# Code to split traffic to all services deployed in above loop
service-diff.txt
This file contains a list of all service directory names which should be built and deployed. It is created by a previous step.
service1
service2
serviceN/cloudbuild.yaml
steps:
- name: gcr.io/cloud-builders/gcloud
  args: [ "app", "deploy", "app.yaml", "--no-promote" ]
Submitting cloudbuild-main.yaml will perform the following:
- Each service listed in service-diff.txt will have its individual cloudbuild.yaml submitted as an asynchronous background task.
- The builder will wait for each of these to finish.
- When all builds are complete, additional code (omitted above) will set traffic splits to the new versions.
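For context on the waiting step: a bare `wait` in bash returns 0 no matter how the background jobs exited, so the script above cannot tell whether a build failed. A minimal sketch of waiting on each PID individually follows. `run_all` is a hypothetical helper name, and plain shell commands stand in for the `gcloud builds submit` calls.

```shell
#!/usr/bin/env bash
# Sketch only: run_all is a hypothetical helper, not part of the original script.
# In deploy-services.sh each command would be a `gcloud builds submit ...` call.

# Runs every argument as a background command, waits for each one by PID,
# and prints the commands that exited non-zero. Returns 1 if any failed.
run_all() {
  local -a pids=() names=()
  local cmd i
  for cmd in "$@"; do
    bash -c "${cmd}" &
    pids+=("$!")        # remember the PID of the job just started
    names+=("${cmd}")
  done

  local failed=0
  for i in "${!pids[@]}"; do
    # `wait PID` returns the exit status of that specific job
    if ! wait "${pids[$i]}"; then
      echo "failed: ${names[$i]}"
      failed=1
    fi
  done
  return "${failed}"
}
```

For example, `run_all "true" "exit 3"` prints `failed: exit 3` and returns 1, which the calling script could use to decide whether cleanup is needed.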
The setup as a whole works as intended when all builds are successful. If one fails, however, I want some way to cancel any background builds that are still in progress and delete any versions that have already been deployed. I have attempted a messy solution using signal traps, but it does not work consistently, and I have a feeling there is a better way altogether. Is there a better way to achieve this?
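The cleanup I have in mind could look roughly like the sketch below. It rests on assumptions that are not part of the current setup: the build IDs are known (for instance by submitting with `gcloud builds submit --async --format='value(id)'`), and each service is deployed under a known version name (for instance by passing `--version` to `gcloud app deploy`). `cancel_builds` and `delete_versions` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch of the desired cleanup. Assumptions (not in the original setup):
#   - build IDs were captured when the builds were submitted;
#   - every service was deployed under one known version name.

# Cancels every build ID passed as an argument. Errors (for example,
# a build that has already finished) are ignored.
cancel_builds() {
  local id
  for id in "$@"; do
    gcloud builds cancel "${id}" || true
  done
}

# Deletes version $1 from each service named in the remaining arguments.
# --quiet suppresses the interactive confirmation prompt.
delete_versions() {
  local version="$1"
  shift
  local service
  for service in "$@"; do
    gcloud app versions delete "${version}" --service="${service}" --quiet || true
  done
}
```

Something like `cancel_builds "${build_ids[@]}"` followed by `delete_versions "${version}" service1 service2` would then run once any build is known to have failed, where `build_ids` and `version` are placeholders for values the script would have to track.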