-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Catched SIGSEGV от scm_send_msg (/root/src/event-driven-servers/mavis/spawnd_accepted.c:422), pid: 378104: Connection refused #142
Comments
Hi, thanks for reporting. This apparently happens somewhere in the error path after spawnd_accepted.c:422. I really can't read much more from the syslog output, compiling with --debug or --ggdb and capturing the SEGV with a debugger would really be helpful. For starters, could you please share some details about you setup? E.g. is tracking configured, or might that be the result of a high-usage situation? Thanks, Marc |
Hi, I've just commited 443023f, no idea whether that fixes the issue, this is just based on guessing. Thanks, Marc |
Hi, thanks for the update, I'll keep an eye on it, but a possible reason could be a heavy load, I have more than 1500 devices on my network, and possibly due to memory leaks, since after two days of work the consumption increased from 4 MB to 300+ MB (Memory: 332.5M (peak: 340.2M)) |
Hi, that memory leak is clearly unexpected. Could you please share some details about your configuration (mavis modules used, e.g.)? Thanks, Marc |
Hi, id = tac_plus-ng {
|
Hi, thanks, that should make it easier to find what's causing this. Might take some time, nevertheless, Thanks, Marc |
Hi, do logs actually get written to /var/log/tac_plus/...? That location being non-writable would be a fairly good excuse for memory filling up. Just guessing. Thanks, Marc |
Hello, |
Hi, thanks, so I'll look further into finding that memory leak :-( The "single-connection" warning should go away after defining "single-connection = yes" at host or realm level. Thanks, Marc |
Hi, But I noticed that after restart the memory is not freed: `root@aaa1:~/src/event-driven-servers# service tac_plus-ng status root@aaa1:~/src/event-driven-servers# service tac_plus-ng status |
Hi, please forcefully "kill -9 3176288 3176289" (or whatever the process ids are now), 1474064 and 3176288/3176289 don't look related to me. After that, please restart tac_plus-ng (and make certain to be at current GIT level). Does the issue still occur after that? If yes, how long does it take until non-sane memory usage? Do you see any relation to the number of requests served? Thanks, Marc |
Hi, |
Hi, |
Hi, I simply can't test that, due to lack of suitable hardware. May or may not be an issue with IOS-XE, or with tac_plus-ng. In general, if restarting tac_plus-ng causes an issue with devices it looks more like a problem with the device implementation. Cheers, Marc |
Hi, |
Hi, I see no issues with single connection on a C8000v platform running 17.12.04. I'd guess that that's the same TACACS+ client implementation. You could perhaps isolate the issue by enabling tacaca debugging on your switch. I'm still looking into the memory leak issue. For now, mitigating it with "retire limit = <max. number of connections to be served>" and "connection timeout = " should work. Cheers, Marc |
Hello,
sometimes I get errors like :
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: scm_send_msg: sendmsg: Connection refused
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: scm_send_msg (/root/src/event-driven-servers/mavis/spawnd_accepted.c:422), pid: 378104: Connection refused
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: Catched SIGSEGV
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: Apologies ... this shouldn't have happened. Please verify that
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: you are running the most current version from
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: https://github.com/MarcJHuber/event-driven-servers/
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: The version you're currently running is 144c6de
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: If this issue persists even with the most recent version:
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: Reconfigure with --debug, recompile and reinstall. Then send
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: a bug report via GitHub issues at
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: https://github.com/MarcJHuber/event-driven-servers/issues
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: and include the backtraces.
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: Please do NOT mail bug reports to the private mail address of
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: the author, unless you have a prior permit for that.
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: Thank you.
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: backtrace start
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 10 [0x62198f040335]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 9 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b) [0x72234de2a28b]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 8 /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca) [0x72234de2a1ca]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 7 [0x62198f057e64]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 6 /usr/local/lib/libmavis.so(scm_main+0x103) [0x72234e883bd3]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 5 /usr/local/lib/libmavis.so(spawnd_main+0x976) [0x72234e883806]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 4 /usr/local/lib/libmavis.so(io_main+0x2c) [0x72234e875fdc]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 3 /usr/local/lib/libmavis.so(io_poll+0xf1) [0x72234e8752e1]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 2 /usr/local/lib/libmavis.so(spawnd_accepted+0x967) [0x72234e880c97]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 1 /lib/x86_64-linux-gnu/libc.so.6(+0x45320) [0x72234de45320]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: 0 /usr/local/lib/libmavis.so(catchsegv+0x105) [0x72234e884825]
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: EXECINFO: backtrace end
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: GDB: running: "(printf 'bt#012q#012';sleep 3)|gdb -n -q -p $CRASHPID 2>/dev/null"
Feb 5 06:15:53 ams-aaa1 tac_plus-ng[378100]: GDB: backtrace start
Feb 5 06:15:56 ams-aaa1 tac_plus-ng[378100]: GDB: backtrace end
Feb 5 06:15:56 ams-aaa1 systemd[1]: tac_plus-ng.service: Main process exited, code=exited, status=69/UNAVAILABLE
Feb 5 06:15:56 ams-aaa1 systemd[1]: tac_plus-ng.service: Failed with result 'exit-code'.
Feb 5 06:15:56 ams-aaa1 systemd[1]: tac_plus-ng.service: Unit process 345581 (tac_plus-ng) remains running after unit stopped.
Feb 5 06:15:56 ams-aaa1 systemd[1]: tac_plus-ng.service: Unit process 378102 (tac_plus-ng) remains running after unit stopped.
Feb 5 06:15:56 ams-aaa1 systemd[1]: tac_plus-ng.service: Consumed 6.869s CPU time, 236.2M memory peak, 0B memory swap peak.
Feb 5 06:15:56 ams-aaa1 systemd[1]: tac_plus-ng.service: Scheduled restart job, restart counter is at 7.
The text was updated successfully, but these errors were encountered: