
I use Ansible's Python API to run a script on thousands of remote machines. The code is:

import ansible.runner

runner = ansible.runner.Runner(
  module_name='script',
  module_args='/tmp/get_config.py',
  pattern='*',
  forks=30
)

Then I call:

datastructure = runner.run()

This takes too long. I want to insert each machine's stdout from the returned data structure into MySQL. Ideally, as soon as one machine has returned data, I would insert it into MySQL, then handle the next one, and so on until all the machines have returned.

Is this a good idea, or is there a better way?

blong
page

1 Answer


The runner call will not complete until every machine has either returned data, proved unreachable, or had its SSH session time out. Since you are targeting thousands of machines but running only 30 in parallel (forks=30), it is going to take roughly Time_to_run_script * Num_Machines / 30 to complete. Does this align with your expectation?
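To put numbers on that estimate, here is a back-of-the-envelope helper. The function name and the figures (3000 hosts, 10 seconds per script run) are made up for illustration, and it assumes every host takes about the same time and none of them hang until the SSH timeout:

```python
def estimate_runtime(seconds_per_host, num_hosts, forks):
    """Rough wall-clock estimate: hosts are handled in waves of `forks`."""
    # Integer ceiling division: a partially filled last wave still costs a full wave.
    waves = (num_hosts + forks - 1) // forks
    return waves * seconds_per_host


# Hypothetical workload: 3000 hosts, 10 seconds per script run.
for forks in (30, 100, 300):
    print(forks, "forks ->", estimate_runtime(10, 3000, forks), "seconds")
```

A single slow or unreachable host can stretch its wave well beyond this figure, so treat the result as a lower bound.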

You could raise the number of forks substantially to have the runner complete sooner. I've pushed this into the hundreds without much issue.

If you want maximum visibility into what's going on and aren't sure whether a single machine is holding you up, you could run through each host serially in your Python code.
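A rough sketch of the serial approach, not tied to any particular Ansible version. `run_on_host` and `handle_result` are hypothetical callables that you would supply: the first would wrap a runner call restricted to one host, the second would do your MySQL insert. The demo uses stand-ins so it runs on its own:

```python
import time


def run_serially(hosts, run_on_host, handle_result):
    """Run one host at a time, handing each result over as soon as it arrives.

    Returns a dict of host -> elapsed seconds so slow hosts are easy to spot.
    """
    timings = {}
    for host in hosts:
        start = time.time()
        try:
            result = run_on_host(host)
        except Exception as exc:
            # Record the failure and carry on with the remaining hosts.
            result = {"failed": True, "msg": str(exc)}
        timings[host] = time.time() - start
        handle_result(host, result)
    return timings


# Stand-ins for illustration only (assumptions, not Ansible or MySQL calls).
def fake_run_on_host(host):
    return {"stdout": "config for " + host}


stored = {}


def fake_insert(host, result):
    stored[host] = result


timings = run_serially(["web1", "web2", "db1"], fake_run_on_host, fake_insert)
slowest = max(timings, key=timings.get)
print("slowest host:", slowest)
```

This is slow for thousands of machines, so it is mostly useful for finding the host that is stalling a run.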

FYI: this module and class are completely gone in Ansible 2.0, so you might want to make the jump now to avoid having to rewrite code later.

Petro026
  • I have another way: put all the Ansible hosts into a Queue, then use threads that each loop, taking an IP from the Queue until it is empty. – page Mar 02 '16 at 09:20
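
The queue-and-threads idea from the comment could look roughly like the sketch below (Python 3 standard library). As before, `run_on_host` and `handle_result` are hypothetical callables standing in for a per-host runner call and a MySQL insert, and the worker count of 3 is arbitrary:

```python
import queue
import threading


def run_with_workers(hosts, run_on_host, handle_result, num_workers=30):
    """Worker threads pull hosts from a queue and store each result immediately."""
    pending = queue.Queue()
    for host in hosts:
        pending.put(host)

    store_lock = threading.Lock()  # serialise calls to handle_result

    def worker():
        while True:
            try:
                host = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            try:
                result = run_on_host(host)
            except Exception as exc:
                result = {"failed": True, "msg": str(exc)}
            with store_lock:
                handle_result(host, result)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


# Stand-ins for illustration only (assumptions, not Ansible or MySQL calls).
def demo_run_on_host(host):
    return {"stdout": "config for " + host}


collected = {}


def demo_insert(host, result):
    collected[host] = result


demo_hosts = ["10.0.0.%d" % i for i in range(1, 6)]
run_with_workers(demo_hosts, demo_run_on_host, demo_insert, num_workers=3)
print(len(collected), "results stored")
```

The lock keeps inserts from interleaving. With a real database you would more likely give each worker its own connection than share one.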