
I have a pwsh script that I trigger from a GitHub Actions workflow. It is supposed to delete the resource groups found in a subscription (the previous step outputs the resource groups to delete). The problem is that Azure is sometimes too fast for itself, so the script tries to delete a resource group that has already been deleted. I need a way to prevent the step from failing (essentially, I don't want the az cli to set the error code to 1) for this specific occurrence.

This is the code I tried (since try/catch doesn't work with az cli, I had to use `!$?`). I basically write the error to a txt file (I am aware it's not the best solution) and then check for the specific error that occurs. It all works fine, but setting `$ErrorActionPreference = "Continue"` doesn't prevent the az cli from failing the step.

```powershell
param(
    [Parameter()]
    [string]$rgName
)
$ErrorActionPreference = "Stop"
$retries = 5
$sleepInSec = 30
for ($i = 1; $i -le $retries; $i++) {
    az group delete --yes --name $rgName 2>fail.txt
    if (!$?) {
        $errorMsg = Get-Content "fail.txt"
        if ($errorMsg -like "*ERROR: (ResourceGroupNotFound)*") {
            Write-Host "RG already deleted or never present. Skipping..."
            $ErrorActionPreference = "Continue"
            break
        }
        elseif ($i -eq $retries) {
            Write-Error "Maximum retries reached ($retries). Need manual inspection. Error: $errorMsg"
            $ErrorActionPreference = "Stop"
            exit 1
        } else {
            Write-Host "Failed $i. run. Waiting for $sleepInSec seconds before retrying."
            Start-Sleep -Seconds $sleepInSec
            continue
        }
    }
    break
}
```

Any ideas on how to not fail the step on certain errors from az cli?
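
One possible reason the step still fails, assuming it runs under GitHub's `pwsh` shell wrapper: GitHub's workflow syntax documentation states that it appends `if ((Test-Path -LiteralPath variable:\LASTEXITCODE)) { exit $LASTEXITCODE }` to `pwsh` scripts, so the non-zero `$LASTEXITCODE` left behind by `az` may be what fails the step after the `break`, regardless of `$ErrorActionPreference`. A minimal sketch that reports an explicit exit code instead (the retry loop is left out for brevity; `Test-ResourceGroupNotFound` and `Invoke-ResourceGroupDelete` are hypothetical helper names, not part of az or PowerShell):

```powershell
# Hypothetical helper (not part of az or PowerShell): returns $true when the
# captured stderr text contains the ResourceGroupNotFound error code.
function Test-ResourceGroupNotFound {
    param([string[]]$ErrorLines)
    # Join first: Get-Content returns an array of lines, and -like applied to
    # an array filters it instead of returning a single boolean.
    return (($ErrorLines -join "`n") -like '*(ResourceGroupNotFound)*')
}

# Hypothetical wrapper: runs the delete once and returns the exit code the
# script should report (0 when the group is already gone).
function Invoke-ResourceGroupDelete {
    param([string]$Name)
    # Out-Host keeps any az stdout out of the function's return value.
    az group delete --yes --name $Name 2>fail.txt | Out-Host
    if ($LASTEXITCODE -eq 0) { return 0 }
    if (Test-ResourceGroupNotFound (Get-Content fail.txt)) {
        Write-Host "RG already deleted or never present. Skipping..."
        return 0
    }
    # Any other az failure keeps its original exit code.
    return $LASTEXITCODE
}
```

In `delete-rg.ps1` this could be called as `exit (Invoke-ResourceGroupDelete -Name $rgName)`, so the wrapper sees 0 for the already-deleted case and the original az exit code for everything else.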

kirkec
  • See [`jobs.<job_id>.continue-on-error`](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idcontinue-on-error). – Azeem Mar 24 '23 at 10:17
  • I've used that previously but it doesn't work in this context. I need the step to fail in general if az cli fails (which happens correctly without any additional code). I only want to prevent it for certain az cli errors (when it reports "*ERROR: (ResourceGroupNotFound)*"). – kirkec Mar 24 '23 at 11:29
  • Right. You're using the above code directly under `run`? – Azeem Mar 24 '23 at 11:39
  • You can again test for the group's existence before deleting it. Isn't that feasible? Example: https://stackoverflow.com/a/56544391/7670262. Otherwise, IMO, you seem to be doing the right thing i.e. dumping the output to a file and then making decisions based on its contents. However, you could further refine this by using [`az group delete`](https://learn.microsoft.com/en-us/cli/azure/group?view=azure-cli-latest#az-group-delete)'s `--no-wait` option and [`az group wait`](https://learn.microsoft.com/en-us/cli/azure/group?view=azure-cli-latest#az-group-wait). – Azeem Mar 24 '23 at 11:48
  • Implicitly yes. I just moved the code out of the actual yaml to avoid messy pipelines. ` - name: Delete selected RG shell: pwsh run: ./.github/scripts/delete-rg.ps1 -rgName ${{ matrix.target }} ` – kirkec Mar 24 '23 at 11:49
  • I guess I wanted to see if there is a solution that doesn't require pinging Azure another time. If there are no better options, I will resort to that. – kirkec Mar 24 '23 at 11:52
  • Right. Another thing that you can use with [`az group delete`](https://learn.microsoft.com/en-us/cli/azure/group?view=azure-cli-latest#az-group-delete) is the `--query` parameter with JMESPath. See its [Global Parameters](https://learn.microsoft.com/en-us/cli/azure/group?view=azure-cli-latest#az-group-delete). – Azeem Mar 24 '23 at 12:00
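
The existence check suggested in the comments could look roughly like the sketch below. It assumes `az group exists --name` prints the text `true` or `false`; `ConvertTo-GroupExists` and `Remove-ResourceGroupIfExists` are hypothetical helper names, not part of az or PowerShell.

```powershell
# Hypothetical helper: turns the text printed by `az group exists`
# ("true" or "false", assumed) into a PowerShell boolean.
function ConvertTo-GroupExists {
    param([string]$AzOutput)
    return ($AzOutput.Trim() -eq 'true')
}

# Hypothetical wrapper: deletes the group only when the existence check
# reports that it is still there.
function Remove-ResourceGroupIfExists {
    param([string]$Name)
    $raw = az group exists --name $Name
    # Fail loudly if the check itself failed (e.g. not logged in), so a broken
    # check is not mistaken for "group does not exist".
    if ($LASTEXITCODE -ne 0) {
        throw "az group exists failed with exit code $LASTEXITCODE"
    }
    if (-not (ConvertTo-GroupExists $raw)) {
        Write-Host "RG '$Name' not found. Skipping..."
        return
    }
    az group delete --yes --name $Name
}
```

This costs one extra call to Azure per group, and the group can still disappear between the check and the delete, so it narrows the window rather than closing it.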

0 Answers