I have a problem with Ansible. I wrote some playbooks that do basic things, like getting the RouterOS version from a MikroTik, and the playbooks themselves seem to work fine. What I mean is that about half of the MikroTiks (all in one network, all reachable via SSH, all with the same firewall settings, etc.) return the information I requested just fine. With the other half, I get this strange error (see below).

I tested it intensively with two MikroTiks (RB2011UiAS): one works, and the other causes the exception shown below. I compared the configs side by side; other than the IPs (in the same network), everything is 100% the same, even the software version. Both are reachable via SSH.

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ansible.module_utils.connection.ConnectionError: timeout value 10 seconds reached while trying to send command: /system resource print
fatal: [XXX:XXX:XXX:X::XX]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n File \"/root/.ansible/tmp/ansible-local-22921xL1Zh9/ansible-tmp-1598512873.3-22929-127716503250274/AnsiballZ_routeros_command.py\", line 102, in <module>\n _ansiballz_main()\n File \"/root/.ansible/tmp/ansible-local-22921xL1Zh9/ansible-tmp-1598512873.3-22929-127716503250274/AnsiballZ_routeros_command.py\", line 94, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File \"/root/.ansible/tmp/ansible-local-22921xL1Zh9/ansible-tmp-1598512873.3-22929-127716503250274/AnsiballZ_routeros_command.py\", line 40, in invoke_module\n runpy.run_module(mod_name='ansible.modules.network.routeros.routeros_command', init_globals=None, run_name='__main__', alter_sys=True)\n File \"/usr/lib/python2.7/runpy.py\", line 188, in run_module\n fname, loader, pkg_name)\n File \"/usr/lib/python2.7/runpy.py\", line 82, in _run_module_code\n mod_name, mod_fname, mod_loader, pkg_name)\n File \"/usr/lib/python2.7/runpy.py\", line 72, in _run_code\n exec code in run_globals\n File \"/tmp/ansible_routeros_command_payload_VN97ME/ansible_routeros_command_payload.zip/ansible/modules/network/routeros/routeros_command.py\", line 187, in <module>\n File \"/tmp/ansible_routeros_command_payload_VN97ME/ansible_routeros_command_payload.zip/ansible/modules/network/routeros/routeros_command.py\", line 157, in main\n File \"/tmp/ansible_routeros_command_payload_VN97ME/ansible_routeros_command_payload.zip/ansible/module_utils/network/routeros/routeros.py\", line 125, in run_commands\n File \"/tmp/ansible_routeros_command_payload_VN97ME/ansible_routeros_command_payload.zip/ansible/module_utils/network/routeros/routeros.py\", line 55, in get_connection\n File \"/tmp/ansible_routeros_command_payload_VN97ME/ansible_routeros_command_payload.zip/ansible/module_utils/network/routeros/routeros.py\", line 69, in get_capabilities\n File 
\"/tmp/ansible_routeros_command_payload_VN97ME/ansible_routeros_command_payload.zip/ansible/module_utils/connection.py\", line 185, in __rpc__\nansible.module_utils.connection.ConnectionError: timeout value 10 seconds reached while trying to send command: /system resource print\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

(I Xed out the IPv6 address.)

Here is the playbook:

    ---
    - name: Check mikrotik os version
      hosts: mikrotik
      gather_facts: no

      tasks:
        - name: Check OS
          routeros_command:
            commands: /system package print
          register: version_output

        - name: Display version
          debug:
            var: version_output

Executing the command manually on the device works.

I tried another playbook, tried MikroTiks with older and brand-new firmware, and set the timeout to 120 seconds; nothing has worked so far. I am also aware of the problems that some symbols in the username cause, and that is not the case here.

If you need any more information, I would be happy to provide it. If anyone has an idea what could cause this problem, I would be more than happy.

ThePhenex
  • Show the complete playbook xxx'ing your sensitive info alone – Patrick Aug 27 '20 at 14:54
  • According to the error, your connection to the router is timing out: `ansible.module_utils.connection.ConnectionError: timeout value 10 seconds reached while trying to send command: /system resource print` – larsks Aug 27 '20 at 15:19
  • @larsks the connection to the devices that fail is stable. I was connected to the device I tested with while the error occurred, and the connection did not get interrupted. The connection is also constantly monitored in real time, so a timeout would have been visible. I can also see Ansible's successful login on the device itself. – ThePhenex Aug 27 '20 at 15:36
  • @Patrick the entire playbook is in my question, or what is it that you are asking for? – ThePhenex Aug 27 '20 at 15:38
  • Did you run the playbook with `-vvv`, as advised by your error message, to see if you could get any other valuable info? – Zeitounator Aug 27 '20 at 19:31
  • @Zeitounator Yes, and it displayed all the scripts that were run in order to execute the playbook. All ran without failure, and the exception it printed was the same one I pasted in my question. I just don't get how the connection error came to be when I am 100% sure that in that case, and in all the other failed cases, the connection did not get interrupted. – ThePhenex Aug 28 '20 at 06:32
  • I've closed this as needing debugging details, because there's no way anyone besides you could be certain what the issue is from the information in your question without guessing. You would need to add the hostnames (or, preferably, fake hostnames with the same lengths). – Makyen Nov 10 '20 at 19:05
  • I already added my answer on how to fix the problem below a few months ago :) – ThePhenex Jan 14 '21 at 15:58

2 Answers

I fixed it. The problem was that the hostnames of some of the MikroTik CPEs were longer than 31 characters, which, due to Ansible's control path length limit, causes this error. Renaming them made it work.
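
To find the affected hosts before renaming anything, a quick length check over the inventory names can help. This is a minimal sketch, not part of Ansible: the function name and the sample hostnames are made up for illustration, and the threshold of 31 is simply the value observed in this answer, not a documented Ansible constant.

```python
# Threshold observed in this answer; an assumption, not a documented Ansible constant.
MAX_HOSTNAME_LENGTH = 31


def find_long_hostnames(hostnames, limit=MAX_HOSTNAME_LENGTH):
    """Return the hostnames whose length exceeds `limit`, in input order."""
    return [name for name in hostnames if len(name) > limit]


if __name__ == "__main__":
    # Hypothetical inventory names, for illustration only.
    inventory = [
        "cpe-short",
        "cpe-customer-site-with-a-very-long-name-01",
    ]
    for name in find_long_hostnames(inventory):
        print(f"{name} ({len(name)} chars) exceeds {MAX_HOSTNAME_LENGTH}")
```

Any name the check reports is a candidate for renaming or for a shorter inventory alias.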

ThePhenex

Check whether the identity on your router contains a special character such as `/`. `routeros_command` calls `/system resource print` and gets tripped up by the identity. Well, at least that worked for me...
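
If you have many routers, a small script can flag identities worth a second look. This is only a sketch under an assumption: it treats letters, digits, `.`, `_` and `-` as safe and reports everything else. That safe set is my own guess, not a list of characters that RouterOS or Ansible documents as problematic, and the function name is made up.

```python
import re

# Assumption: only these characters are treated as "safe" for this check.
SAFE_CHAR = re.compile(r"[A-Za-z0-9._-]")


def suspicious_chars(identity):
    """Return the sorted, de-duplicated characters in `identity` outside the safe set."""
    return sorted({ch for ch in identity if not SAFE_CHAR.fullmatch(ch)})


if __name__ == "__main__":
    # Hypothetical identities, for illustration only.
    for identity in ["cpe-01.site_a", "office/router"]:
        print(identity, suspicious_chars(identity))
```

An empty list means the identity uses only characters from the assumed safe set.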