Ansible 2.3.1 WinRM issue(WinRM service stopping during playbook execution) #30765

Closed
sanjeevrg473 opened this issue Sep 22, 2017 · 8 comments
Labels
affects_2.3 This issue/PR affects Ansible v2.3 bug This issue/PR relates to a bug. c:plugins/connection/winrm needs_info This issue requires further information. Please answer any outstanding questions. support:core This issue/PR relates to code supported by the Ansible Engineering Team. windows Windows community

Comments

@sanjeevrg473

sanjeevrg473 commented Sep 22, 2017

ISSUE TYPE
  • Bug Report
COMPONENT NAME

WINRM issue with Windows Server 2012 R2

WINRM service is stopping during playbook execution

ANSIBLE VERSION

ansible 2.3.1.0
config file = /etc/ansible/ansible.cfg
configured module search path = Default w/o overrides
python version = 2.7.13 (default, Jan 11 2017, 10:56:06) [GCC]

CONFIGURATION
OS / ENVIRONMENT
SUMMARY

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ConnectionError: HTTPSConnectionPool(host='xxxx', port=5986): Max retries exceeded with url: /wsman (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f968d070990>: Failed to establish a new connection: [Errno 111] Connection refused',))
fatal: [xxx]: FAILED! => {"failed": true, "msg": "Unexpected failure during module execution.", "stdout": ""}
to retry, use: --limit @/etc/ansible/xxx.retry

STEPS TO REPRODUCE

Running ansible-playbook xx.yaml to install packages on Win 2012 R2

- name: Core Package Install
  win_command: C:\temp\files\xxx_install -options C:\temp\files\xxx.opts -silent
  args:
    chdir: C:\temp\
EXPECTED RESULTS

The installation should succeed, and the playbook should then move on to the next task, which cleans up TEMP.

ACTUAL RESULTS
EXEC (via pipeline wrapper)
The full traceback is:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 125, in run
    res = self._execute()
  File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 522, in _execute
    result = self._handler.run(task_vars=variables)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/action/normal.py", line 45, in run
    results = merge_hash(results, self._execute_module(tmp=tmp, task_vars=task_vars, wrap_async=wrap_async))
  File "/usr/lib/python2.7/site-packages/ansible/plugins/action/__init__.py", line 738, in _execute_module
    res = self._low_level_execute_command(cmd, sudoable=sudoable, in_data=in_data)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/action/__init__.py", line 887, in _low_level_execute_command
    rc, stdout, stderr = self._connection.exec_command(cmd, in_data=in_data, sudoable=sudoable)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/connection/winrm.py", line 339, in exec_command
    result = self._winrm_exec(cmd_parts[0], cmd_parts[1:], from_exec=True, stdin_iterator=self._wrapper_payload_stream(payload))
  File "/usr/lib/python2.7/site-packages/ansible/plugins/connection/winrm.py", line 296, in _winrm_exec
    self.protocol.cleanup_command(self.shell_id, command_id)
  File "/usr/lib/python2.7/site-packages/winrm/protocol.py", line 307, in cleanup_command
    res = self.send_message(xmltodict.unparse(req))
  File "/usr/lib/python2.7/site-packages/winrm/protocol.py", line 207, in send_message
    return self.transport.send_message(message)
  File "/usr/lib/python2.7/site-packages/winrm/transport.py", line 184, in send_message
    response = self.session.send(prepared_request, timeout=self.read_timeout_sec)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 612, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 504, in send
    raise ConnectionError(e, request=request)
ConnectionError: HTTPSConnectionPool(host='xxx', port=5986): Max retries exceeded with url: /wsman (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fc4d65e3c50>: Failed to establish a new connection: [Errno 111] Connection refused',))

fatal: [xxx]: FAILED! => {
    "failed": true,
    "msg": "Unexpected failure during module execution.",
    "stdout": ""
}
        to retry, use: --limit @/etc/ansible/xxx.retry

@ansibot ansibot added affects_2.3 This issue/PR affects Ansible v2.3 bug_report needs_triage Needs a first human triage before being processed. support:core This issue/PR relates to code supported by the Ansible Engineering Team. labels Sep 22, 2017
@s-hertel s-hertel added c:plugins/connection/winrm windows Windows community and removed needs_triage Needs a first human triage before being processed. labels Sep 22, 2017
@jhawkesworth
Contributor

jhawkesworth commented Sep 23, 2017

Does the playbook fail consistently at the same point?
If so, is it possible your installer is interrupting the network connection?
If this is a random failure that is not reproducible on demand, you might find that increasing the timeouts helps.

From this page: http://docs.ansible.com/ansible/latest/intro_windows.html

ansible_winrm_operation_timeout_sec: Increase the default timeout for WinRM operations (default: 20).
ansible_winrm_read_timeout_sec: Increase the WinRM read timeout if you experience read timeout errors (default: 30), e.g. intermittent network issues.
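
One place to set the two timeouts quoted above is the inventory variables for the Windows hosts. The sketch below assumes a hypothetical `group_vars/windows.yml` file and purely illustrative values; the port and the NTLM transport are the ones mentioned in this thread. pywinrm expects the read timeout to be larger than the operation timeout, so the two are raised together.

```yaml
# group_vars/windows.yml (hypothetical file name and group)
ansible_connection: winrm
ansible_port: 5986                       # HTTPS listener, as in the reported error
ansible_winrm_transport: ntlm            # the reporter authenticates with NTLM
# Illustrative values only; tune them for your environment.
ansible_winrm_operation_timeout_sec: 60  # default: 20
ansible_winrm_read_timeout_sec: 70       # default: 30; keep it above the operation timeout
```

Longer timeouts are only likely to help with slow or intermittent responses. They should not be expected to help when the WinRM listener itself goes away and the connection is refused.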

needs_info

@ansibot ansibot added the needs_info This issue requires further information. Please answer any outstanding questions. label Sep 23, 2017
@sanjeevrg473
Author

@jhawkesworth Yes, it's failing at the last step. The first task in the playbook copies a ZIP file to the Windows server, then it is unzipped, then the core installer runs, and finally TEMP is cleaned up.
During the installation I can see folders being created, but it fails at the installation step.

I had already increased the values you mentioned, but changing them makes no difference.

ansible_winrm_operation_timeout_sec: 200
ansible_winrm_read_timeout_sec: 600

I tried the async option too. That fails as well.

I am also assuming a network issue, and I am using NTLM for authentication.

@ansibot ansibot removed the needs_info This issue requires further information. Please answer any outstanding questions. label Sep 23, 2017
@jhawkesworth
Contributor

@ganjihalsanjeev - if the installer interrupts the network connection, then I think the approach most likely to work is to run the installer from a scheduled task.
Depending on your playbook, you might need to schedule the task and then pause for a while before continuing. You might also want to use the wait_for_connection module to check that a WinRM connection is available before continuing.

Also I suggest trying to get your installer to log progress to a file so you can see what it is doing and at what point it fails.

Please can you try using a scheduled task to run your installer?
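
A minimal sketch of that approach, assuming the installer can run unattended under the SYSTEM account. The task name `CoreInstall`, the 60-second pause, and the timeout values are hypothetical; the installer command line is the one from the report. It calls `schtasks.exe` through `win_command` so that the installer runs outside the WinRM session.

```yaml
- name: Register the installer as a one-off scheduled task
  # /ST is required with /SC ONCE. 00:00 today is normally already in the
  # past, so the task only runs when it is started explicitly below.
  win_command: schtasks.exe /Create /F /TN CoreInstall /RU SYSTEM /SC ONCE /ST 00:00 /TR "C:\temp\files\xxx_install -options C:\temp\files\xxx.opts -silent"

- name: Start the installer outside the WinRM session
  win_command: schtasks.exe /Run /TN CoreInstall

- name: Give the installer time to work (hypothetical duration)
  pause:
    seconds: 60

- name: Wait until WinRM accepts connections again
  wait_for_connection:
    delay: 10      # seconds to wait before the first check
    timeout: 600   # give up after this many seconds

- name: Remove the scheduled task so it cannot run again
  win_command: schtasks.exe /Delete /F /TN CoreInstall
```

Note that `wait_for_connection` only confirms that WinRM is reachable, not that the installer has finished. If the installer stops WinRM without starting it again, this task will time out.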

needs_info

@ansibot ansibot added the needs_info This issue requires further information. Please answer any outstanding questions. label Sep 24, 2017
@dagwieers
Contributor

@ganjihalsanjeev If you feel adventurous and want to help fix this for good, please test my patch at diyan/pywinrm#174 and provide feedback.

@sanjeevrg473
Author

@jhawkesworth Thanks for guiding me. The core installer is stopping the WinRM service, which is why the service goes down. We will update our installer package.

@dagwieers Thank you! This issue is totally related to the installer that I am using. The installer is the culprit.

@dagwieers
Contributor

dagwieers commented Sep 27, 2017

@sanjeevrg473 Is it stopping the WinRM service, or restarting it? Both currently have exactly the same effect on Ansible. The fix at diyan/pywinrm#174, however, recovers by retrying 5 times with a 5-second delay, so that Ansible can continue to work when the WinRM service is restarted.

@sanjeevrg473
Author

@dagwieers So you want me to include transport.py in the playbook's location? Do I have to make any changes in ansible.cfg?

@dagwieers
Contributor

If you would like to test this, edit your pywinrm's transport.py by hand and apply the patch from the PR. Replacing transport.py wholesale is not recommended, because the file depends on the pywinrm version you have in use.

No changes in ansible.cfg are necessary. (Reading that Python code should make it pretty obvious what it does.)

@ansibot ansibot added bug This issue/PR relates to a bug. and removed bug_report labels Mar 7, 2018
@ansible ansible locked and limited conversation to collaborators Apr 26, 2019