TL;DR

When a GCP Cloud Build run submits multiple other builds, is there a command or best practice to use if one of the sub-builds fails, so that the other builds are cancelled if incomplete or rolled back if complete?

Full Question

Consider the following directory layout for a monorepo of multiple Google App Engine services:

repo-dir
├── cloudbuild-main.yaml
├── deploy-services.sh
├── service-diff.txt
├── service1
│   ├── cloudbuild.yaml
│   ├── app.yaml
│   └── ...
├── service2
│   ├── cloudbuild.yaml
│   ├── app.yaml
│   └── ...
└── ...

cloudbuild-main.yaml:

steps:
  - name: 'gcr.io/cloud-builders/gcloud'
    entrypoint: /bin/bash
    args: [ "./deploy-services.sh" ]

deploy-services.sh:

while read service
do 

    config="${service}/cloudbuild.yaml"
    if [[ ! -f "${config}" ]]; then
        echo "no such file"
        continue
    fi

    gcloud builds submit "${service}" --config="${config}" &

done < ./service-diff.txt

wait

# Code to split traffic to all services deployed in above loop

service-diff.txt:

This file contains a list of all service directory names which should be built and deployed. It is created by a previous step.

service1
service2

serviceN/cloudbuild.yaml:

steps:
  - name: gcr.io/cloud-builders/gcloud
    args: [ "app", "deploy", "app.yaml", "--no-promote" ]

Submitting cloudbuild-main.yaml will perform the following:

  • Each service listed in service-diff.txt will have its individual cloudbuild.yaml submitted as an asynchronous background task.
  • The builder will wait for each of these to finish.
  • When all builds are complete, additional code (omitted above) will set traffic splits to the new versions.

This works as intended when all builds are successful, but if one fails, I want some way to cancel any background builds that are still in progress and delete any versions that have already been deployed. I have attempted a messy solution using signal traps, but it does not work consistently, and I have a feeling there is a better way altogether. Is there a better way to achieve this?

1 Answer

You can use the timeout field of a build to specify how long the build is allowed to run, with one-second granularity. If this time elapses, work on the build stops and the build status becomes TIMEOUT. See the build configuration documentation for full details.

You can also use the gcloud builds cancel command to cancel an ongoing build, and the gcloud app versions delete command to delete a specific version that has already been deployed. Note that you cannot delete a version of a service that is currently receiving traffic. See document1 and document2.
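As a rough illustration of how those two commands could be wrapped for cleanup, here is a minimal sketch. The function names cancel_builds and delete_version and the GCLOUD override variable are hypothetical and not part of the original scripts. It also assumes that the build IDs and version names were recorded somewhere earlier, for example by submitting with --async and by passing an explicit --version to gcloud app deploy.

```shell
#!/usr/bin/env bash
# Hypothetical override hook so the helpers can be dry-run with a stub.
GCLOUD="${GCLOUD:-gcloud}"

# Cancel every build whose ID is passed as an argument.
# Does nothing when called without IDs.
cancel_builds() {
    [ "$#" -gt 0 ] || return 0
    "${GCLOUD}" builds cancel "$@"
}

# Delete one deployed version of one App Engine service.
# Arguments: <service> <version>
# --quiet skips the interactive confirmation prompt.
delete_version() {
    "${GCLOUD}" app versions delete "$2" --service="$1" --quiet
}
```

A cleanup path might then call cancel_builds with the IDs of the builds still running, followed by delete_version for each service that had already finished deploying. Because the question deploys with --no-promote, the new versions receive no traffic yet, so the restriction on deleting serving versions should not apply at that point.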

As mentioned in this Stack Overflow answer by Emmanuel:

The set -e flag should make the script exit if any command fails. However, you can also check the exit status of a command through the $? variable; for example, you can include the following lines:

echo "Building $d ... "
(
    # Run the build; any non-zero exit status counts as a failure
    if ! gcloud builds submit . --config="${config}" "$@"; then
        echo "There was an error while building $d, exiting"
        exit 1
    fi
) &

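So if there is an error, the subshell exits with a status of 1 (error). Because the subshell runs in the background, the parent script has to collect that status with wait before it can react to the failure.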
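One possible way to collect those background exit statuses is sketched below. The names run_parallel and build_service are hypothetical and not taken from the original scripts, and the sketch only detects failures; what to cancel or delete afterwards is left to the caller.

```shell
#!/usr/bin/env bash
# Hypothetical per-service job: submit that service's own build config.
build_service() {
    gcloud builds submit "$1" --config="$1/cloudbuild.yaml"
}

# Run "$1" (a function or command name) once per remaining argument,
# all in the background, then wait for each job individually.
# Returns 0 only if every job succeeded, 1 otherwise.
run_parallel() {
    local cmd="$1" pids="" pid service failed=0
    shift
    for service in "$@"; do
        "${cmd}" "${service}" &
        pids="${pids} $!"        # remember the PID of each background job
    done
    for pid in ${pids}; do
        # "wait <pid>" returns the exit status of that specific job
        if ! wait "${pid}"; then
            failed=1
        fi
    done
    return "${failed}"
}
```

With this in place, something like run_parallel build_service service1 service2 || echo "at least one build failed" gives the parent script a single status to branch on before it splits traffic.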

You can also check this Stack Overflow link for more information.

Sathi Aiswarya
  • Please take a look at [*How do I format my posts using Markdown or HTML?*](https://stackoverflow.com/help/formatting). How did you end up with that HTML? Did you somehow generate this answer? – 0stone0 May 01 '23 at 11:17