
I have an Ansible Tower playbook that includes task1, task1 includes task2, task2 includes task3, and finally task3 includes task4, all of them with loops. This works, but the environment is very large, so it takes hours to go through all items (in this case more than 10k VMs, which are looped over at the end of task2).

Is there a way to speed this up? I know I can create more than one job template, i.e. one job template per vCenter, and allow concurrency, but I would like to have just one template, and do some of the tasks in parallel if possible.

I saw that there is an async option that can be set on a task so that the task runs in parallel, but when I tried that I got a message like:

 ERROR! 'async' is not a valid attribute for a TaskInclude

So it looks like I cannot combine async with include_tasks. Is there a workaround for this?

The code is as follows:

Main playbook:


---

    - name: Scheduled job for automatic deletion of old VM snapshots
      hosts: localhost
      gather_facts: true
      connection: local
      collections:
        - community.vmware
      vars_files:
        - vars/vars-vmware-snapshot.yml
      tasks:
      
      - debug: 
          msg: List of vCenters for which old VM snapshots will be deleted "{{ vcenters_list }}"
    
      - name: Loop through all vCenters
        include_tasks: vmware-snapshots-scheduled-delete-task1.yml
        loop: "{{ vcenters_list }}"
        loop_control:
          loop_var: vc_item


task1:

    
      - name: Current item (vCenter) from loop
        debug:
          msg: "{{ vc_item }}"
      
      - name: Gather information about datacenters for the vCenter "{{ vc_item }}"
        community.vmware.vmware_datacenter_info:
          hostname: "{{ vc_item }}"
          username: "{{ vcenter_username }}"
          password: "{{ vcenter_password }}"
          validate_certs: False
        delegate_to: localhost
        register: datacenters_information
    
      - name: List of datacenters for the vCenter "{{ vc_item }}"
        set_fact:
          datacenters_list: "{{datacenters_information.datacenter_info | map(attribute='name')}}"
    
      - name: Loop through "{{ vc_item }}" datacenters
        include_tasks: vmware-snapshots-scheduled-delete-task2.yml
        loop: "{{ datacenters_list }}"
        loop_control:
          loop_var: dc_item


task2:


   
      - name: Current item (datacenter) from loop
        debug:
          msg: "{{ dc_item }}"
      
      - name: Gather only registered virtual machines from the current datacenter "{{ dc_item }}" in "{{ vc_item }}"
        community.vmware.vmware_vm_info:
          hostname: "{{ vc_item }}"
          username: "{{ vcenter_username }}"
          password: "{{ vcenter_password }}"
          vm_type: vm
          folder: "/{{ dc_item }}/vm/"
          validate_certs: False
        delegate_to: localhost
        register: vm_information
        ignore_errors: true
          
      - name: Filter just VM names
        set_fact:
          vm_list: "{{ vm_information.virtual_machines | map(attribute='guest_name') }}"
        ignore_errors: true
    
      - name: Loop through VM's for the current datacenter "{{ dc_item }}" in "{{ vc_item }}"
        include_tasks: vmware-snapshots-scheduled-delete-task3.yml
        loop: "{{ vm_list }}"
        loop_control:
          loop_var: vm_item
        ignore_errors: true


task3:


      - name: Current item (virtual machine) from loop
        debug:
          msg: "{{ vm_item }}"
    
      - name: Remove brackets from vm_item and also convert "%2f" to original "/" if there is "/" in the VM name
        set_fact:
          vm_item2: "{{ vm_item | regex_replace('\\[|\\]', '') | regex_replace('%2f', '/')  }}"
    
      - name: Gather snapshot information for the virtual machine "{{ vm_item2 }}" in "{{ vc_item }}"
        community.vmware.vmware_guest_snapshot_info:
          hostname: "{{ vc_item }}"
          username: "{{ vcenter_username }}"
          password: "{{ vcenter_password }}"
          datacenter: "{{ dc_item }}"
          folder: "/{{ dc_item }}/vm/"
          name: "{{ vm_item2 }}"
          validate_certs: False
        delegate_to: localhost
        register: snapshot_info
            
      - name: Loop through all snapshots for the virtual machine "{{ vm_item2 }}" in "{{ vc_item }}"
        include_tasks: vmware-snapshots-scheduled-delete-task4.yml
        loop: "{{ snapshot_info.guest_snapshots.snapshots }}"
        loop_control:
          loop_var: snapshot_item
        when: 
          - snapshot_info.guest_snapshots | length > 0

Any idea? Thanks.

LJS
  • Does [How do I optimize performance of Ansible playbook](https://stackoverflow.com/a/73181784/6771046) or [Parallel execution of localhost tasks in Ansible](https://stackoverflow.com/a/73782713/6771046) answer your question? They explain where the bottleneck comes from and list possible solutions. – U880D Jun 18 '23 at 09:11
  • Maybe I missed it, but I haven't seen a working solution that combines include_tasks with loops and async. However, one thing that might partly help is forks. I haven't used forks so far, but if I'm right, I can put the list of vCenters into the inventory and, with forks set to, let's say, 10, all vCenters should be processed in parallel instead of one by one. If that works, it is certainly better than doing one vCenter at a time, but the main time problem here is looping over the VMs, and that list is very large and dynamic. – LJS Jun 18 '23 at 09:51
  • Indeed, include with loop and async is technically not possible. Right: divide and conquer, i.e. splitting the work into simple parts and parallelizing them, will improve performance. Gathering facts and caching them will also help. – U880D Jun 18 '23 at 09:55
  • Gather_facts: I have to leave it set to true because in task4 I get the current time (from localhost, the Ansible server). As for caching, hm, given what I'm doing, I'm not sure which task would be a good candidate for caching. – LJS Jun 18 '23 at 14:16
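
The limitation discussed in these comments applies to include_tasks itself. async is still accepted on ordinary module tasks, including looped ones, so one possible workaround is to start the per-VM module calls in the background and collect them afterwards with async_status. The following is only a minimal, untested sketch. It assumes the play runs on localhost as in the question and reuses the question's variables (vm_list, vc_item, dc_item, vcenter_username, vcenter_password). The names snapshot_jobs and snapshot_results and the timing values are hypothetical.

```yaml
# Sketch only: async is set on the module task itself, not on include_tasks.
- name: Start gathering snapshot information for every VM without waiting
  community.vmware.vmware_guest_snapshot_info:
    hostname: "{{ vc_item }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    datacenter: "{{ dc_item }}"
    folder: "/{{ dc_item }}/vm/"
    # Same name clean-up as the set_fact in task3 of the question
    name: "{{ item | regex_replace('\\[|\\]', '') | regex_replace('%2f', '/') }}"
    validate_certs: false
  loop: "{{ vm_list }}"
  async: 600   # hypothetical per-job time limit in seconds
  poll: 0      # do not wait; move on to the next loop item immediately
  register: snapshot_jobs

- name: Wait for the background jobs and collect their results
  ansible.builtin.async_status:
    jid: "{{ item.ansible_job_id }}"
  loop: "{{ snapshot_jobs.results }}"
  loop_control:
    label: "{{ item.item }}"
  register: snapshot_results
  until: snapshot_results.finished
  retries: 60  # hypothetical: up to 60 status checks per job ...
  delay: 10    # ... 10 seconds apart
```

Starting thousands of background jobs at once may overload the Ansible server, so with a list of this size it would probably be necessary to work through vm_list in smaller chunks, for example by splitting it with the Jinja2 batch filter.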

1 Answer


Ansible executes one task at a time for each host, so the best way to parallelize tasks that touch multiple things is to actually add them as hosts, rather than loop over them on localhost. Instead of splitting up a single play targeting localhost into multiple task files that are looped over, use multiple plays:

- name: Scheduled job for automatic deletion of old VM snapshots
  hosts: localhost
  gather_facts: true
  vars_files:
    - vars/vars-vmware-snapshot.yml
  tasks:
    - debug:
        msg: List of vCenters for which old VM snapshots will be deleted "{{ vcenters_list }}"

    - name: Add vCenters to inventory
      ansible.builtin.add_host:
        name: "{{ item }}"
        groups: vcenter_targets
        vcenter_username: "{{ vcenter_username }}"
        vcenter_password: "{{ vcenter_password }}"
      loop: "{{ vcenters_list }}"

- hosts: vcenter_targets
  gather_facts: false
  tasks:
    - name: Gather information about datacenters for the vCenter
      community.vmware.vmware_datacenter_info:
        hostname: "{{ inventory_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
      delegate_to: localhost
      register: dc_results

    - name: Gather registered virtual machines from datacenters
      community.vmware.vmware_vm_info:
        vm_type: vm
        folder: "/{{ item }}/vm/"
        hostname: "{{ inventory_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
      loop: "{{ dc_results.datacenter_info | map(attribute='name') }}"
      delegate_to: localhost
      register: vm_results

    - name: Add virtual machines to inventory
      ansible.builtin.add_host:
        name: "{{ item.1.guest_name | replace('[', '') | replace(']', '') | replace('%2f', '/') }} ({{ item.1.uuid }})"
        uuid: "{{ item.1.uuid }}"
        groups: vm_targets
        vcenter_username: "{{ vcenter_username }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_datacenter: "{{ item.0.item }}"
      loop: "{{ vm_results.results | subelements('virtual_machines') }}"
      loop_control:
        label: "{{ item.0.item }}: {{ item.1.guest_name }}"

- hosts: vm_targets
  gather_facts: false
  tasks:
    - name: Gather snapshot information
      community.vmware.vmware_guest_snapshot_info:
        uuid: "{{ uuid }}"
        datacenter: "{{ vcenter_datacenter }}"
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
      delegate_to: localhost
      register: snapshot_results

    - name: Loop through snapshots
      ansible.builtin.include_tasks: vmware-snapshots-scheduled-delete-task4.yml
      loop: "{{ snapshot_results.guest_snapshots.snapshots }}"
      loop_control:
        loop_var: snapshot_item

Note that I do not have a vCenter environment to test against, so this code may contain minor errors. It should suffice to illustrate the principle, though.
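
How many hosts are actually processed at the same time depends on the forks setting, which defaults to 5. In Tower it can be raised in the Forks field of the job template. In addition, the default linear strategy makes every host finish a task before any host starts the next one. As a hedged variation on the last play above, again untested and using only the names already introduced in it, the free strategy lets each VM proceed through its tasks independently:

```yaml
# Sketch only: the vm_targets play from above with a non-default strategy.
- hosts: vm_targets
  gather_facts: false
  strategy: free  # each host runs through its tasks without waiting for the others
  tasks:
    - name: Gather snapshot information
      community.vmware.vmware_guest_snapshot_info:
        uuid: "{{ uuid }}"
        datacenter: "{{ vcenter_datacenter }}"
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
      delegate_to: localhost
      register: snapshot_results

    - name: Loop through snapshots
      ansible.builtin.include_tasks: vmware-snapshots-scheduled-delete-task4.yml
      loop: "{{ snapshot_results.guest_snapshots.snapshots }}"
      loop_control:
        loop_var: snapshot_item
```

Whether this helps in practice depends on how many parallel API sessions the vCenters and the Ansible server can handle, so the forks value is best raised gradually.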

flowerysong
  • I tested the code, and with one small modification (dc_results.datacenter_info instead of dc_results.datacenter_information) it works up to the task "Add virtual machines to inventory", where it fails with the error "ERROR! A worker was found in a dead state". Also, the task before that one ("Gather registered virtual machines from datacenters") took almost 2 hours. – LJS Jun 19 '23 at 09:27