Ansible should only print 'FAILED' after completion of all retries in until loop, not for every iteration

Question

I have an Ansible task almost identical to the top answer here: Ansible playbook wait until all pods running

- name: Wait for all control-plane pods become created
  shell: "kubectl get po --namespace=kube-system --selector tier=control-plane --output=jsonpath='{.items[*].metadata.name}'"
  register: control_plane_pods_created
  until: item in control_plane_pods_created.stdout
  retries: 10
  delay: 30
  with_items:
    - etcd
    - kube-apiserver
    - kube-controller-manager
    - kube-scheduler

- name: Wait for control-plane pods become ready
  shell: "kubectl wait --namespace=kube-system --for=condition=Ready pods --selector tier=control-plane --timeout=600s"
  register: control_plane_pods_ready

- debug: var=control_plane_pods_ready.stdout_lines

As shown in his example it prints 'FAILED' 3 times:

TASK [Wait for all control-plane pods become created] ******************************
FAILED - RETRYING: Wait all control-plane pods become created (10 retries left).
FAILED - RETRYING: Wait all control-plane pods become created (9 retries left).
FAILED - RETRYING: Wait all control-plane pods become created (8 retries left).
changed: [localhost -> localhost] => (item=etcd)
changed: [localhost -> localhost] => (item=kube-apiserver)
changed: [localhost -> localhost] => (item=kube-controller-manager)
changed: [localhost -> localhost] => (item=kube-scheduler)

TASK [Wait for control-plane pods become ready] ********************************
changed: [localhost -> localhost]

TASK [debug] *******************************************************************
ok: [localhost] => {
    "control_plane_pods_ready.stdout_lines": [
        "pod/etcd-localhost.localdomain condition met", 
        "pod/kube-apiserver-localhost.localdomain condition met", 
        "pod/kube-controller-manager-localhost.localdomain condition met", 
        "pod/kube-scheduler-localhost.localdomain condition met"
    ]    
}

In my implementation the loop fails far more than 3 times, more like 20, so it clogs up my logs, even though these interim failures are expected behaviour.

So how can I only print 'FAILED' once all the retries have been used up?

I hope my question makes sense, Thanks

U880D · Answer 1 · 2022-10-11T20:04:00.397

How can I only print 'FAILED - RETRYING' once all the retries have been used up?

I understand that you are referring to until/retries, that is, Retrying a task until a condition is met, and that the FAILED message belongs to each loop iteration and not to the final task result.

Running a short fail test

---
- hosts: localhost
  become: false
  gather_facts: false

  tasks:

  - name: Show fail test
    shell:
      cmd: exit 1 # to fail
    register: result
    # inner loop
    until: result.rc == 0 # which will never happen
    retries: 3 # times therefore
    delay: 1 # second
    # outer
    loop: [1, 2, 3] # times over 
    failed_when: item == 3 and result.rc != 0 # on last outer loop run only
    no_log: true # for outer loop content

resulting in an output of

PLAY [localhost] **********************************
FAILED - RETRYING: Show fail test (3 retries left).
FAILED - RETRYING: Show fail test (2 retries left).
FAILED - RETRYING: Show fail test (1 retries left).

TASK [Show fail test] *****************************
changed: [localhost] => (item=None)
FAILED - RETRYING: Show fail test (3 retries left).
FAILED - RETRYING: Show fail test (2 retries left).
FAILED - RETRYING: Show fail test (1 retries left).
changed: [localhost] => (item=None)
FAILED - RETRYING: Show fail test (3 retries left).
FAILED - RETRYING: Show fail test (2 retries left).
FAILED - RETRYING: Show fail test (1 retries left).
failed: [localhost] (item=None) => changed=true
  attempts: 3
  censored: 'the output has been hidden due to the fact that ''no_log: true'' was specified for this result'
fatal: [localhost]: FAILED! => changed=true
  censored: 'the output has been hidden due to the fact that ''no_log: true'' was specified for this result'

It seems that it is not possible to suppress the interim message of Ansible's until/retries loop at the playbook level, neither by Defining failure, nor by Protecting sensitive data with no_log, nor by Limiting loop output.

... but this is expected behaviour.

Right. Unless you address Callback plugins, either by Setting a (different) callback plugin for ansible-playbook or by Developing (your own) Callback plugin, the message will remain.

Similar Q&A

How to change the interim message of Ansible's until retries loop?

Further Information

ansible/plugins/callback/default.py

Possible Solution

Ansible Issue #32584
diy callback – Customize the output

Ok thats what I expected :/ seems like an oversight. Thanks for the fantastic answer though! Any idea how to get around this? — Josh, Oct 11 '22 at 13:13
@Josh, unless you address Callback plugins, either by Setting a (different) callback plugin for ansible-playbook or by Developing (your own) Callback plugin, the message will remain. — U880D, Oct 11 '22 at 13:14
I ended up writing my own logic in a custom shell script, it was just easier for me to do it that way — Josh, Nov 15 '22 at 23:21
@Josh, right, it is also possible to [execute a shell script](https://stackoverflow.com/a/73594837/6771046) and later turn it into your own Ansible module. That way the loop logic lives inside the script or module, which changes the behavior and gives the expected output. — U880D, Nov 16 '22 at 05:49
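The shell-script approach from the comments could look roughly like the sketch below. This is not Josh's actual script, which was never posted. The function name `retry_quiet` and its argument order are made up for illustration. The idea is to keep the retry loop inside the script, stay silent on interim failures, and print a single FAILED line only once all attempts are used up.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ansible or kubectl):
#   retry_quiet MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to MAX_ATTEMPTS times, discarding its output.
# Prints nothing on interim failures; reports once if every attempt failed.
retry_quiet() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Success on any attempt ends the loop silently.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep only between attempts, not after the last one.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    # Single failure message, written to stderr.
    echo "FAILED after $max_attempts attempts: $*" >&2
    return 1
}
```

A task would then call the script once through the shell or script module, without until and retries, so Ansible has no retry event to report and only the final result shows up in the log. The trade-off is that the per-attempt progress is no longer visible at all.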
