check_by_ssh unknown command returns 255 #1022

Open
Testlab1231 opened this issue Feb 12, 2025 · 11 comments

Comments

@Testlab1231

I migrated Nagios Core 4.x from a retired CentOS server to Rocky 9, and I mirrored all the configuration files after installing and configuring Nagios on the new Rocky 9 server. However, on the web monitoring portal, all the client server services show "UNKNOWN" status:

UNKNOWN - check_by_ssh: Remote command '/usr/local/nagios/libexec/check_disk -w 10% -c 5% -u MB -p /boot -p /boot/efi' returned status 255

I ran this check_disk command manually as the nagios user on each client server, and it returned the expected output.
I also ran the whole check_by_ssh command from /usr/local/nagios/libexec against the remote servers, and it works as expected: it shows the disk status of the client server.

./check_by_ssh -o User=nagios -i /usr/local/nagios/.ssh/id_ecdsa -H $HOSTADDRESS$ -C "/usr/local/nagios/libexec/check_disk -w 10% -c 5% -u GB -p /" -E -t 60

Does anyone know why it shows UNKNOWN status on the web interface?
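
For context on the status in the error above: Nagios plugins normally exit with 0 to 3, while ssh(1) documents that ssh exits with 255 when an error occurs on its own side. A 255 therefore may point at the SSH layer rather than at check_disk. The helper below is only a sketch for reading exit codes while testing by hand; the function name `classify_status` is hypothetical and not part of Nagios or its plugins.

```shell
#!/bin/sh
# classify_status CODE
# Hypothetical helper: map an exit status to a human-readable label.
# 0-3 are the standard Nagios plugin return codes; 255 is what ssh
# itself returns when an error occurred on the ssh side.
classify_status() {
    case "$1" in
        0)   echo "OK" ;;
        1)   echo "WARNING" ;;
        2)   echo "CRITICAL" ;;
        3)   echo "UNKNOWN" ;;
        255) echo "SSH error" ;;
        *)   echo "unexpected status" ;;
    esac
}
```

After running a check by hand, `classify_status $?` shows which case applies.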

@everwatch

Are you sure your Nagios user has proper ssh host keys for the remote servers?

@everwatch

More specifically, you have to run the command on your Nagios box as the nagios user. I'm guessing you're running it as root.

@Testlab1231
Author

The SSH key is working, because I can SSH into the remote client servers as the nagios user without a password.

@Testlab1231
Author

I am running the command as the nagios user, not as root.

@everwatch

Dumb it down a bit and see what happens when you do this on your Nagios server, as root:

su - nagios
ssh /usr/local/nagios/.ssh/id_ecdsa <remote host>

This will tell you if there are any problems with the SSH connection itself. If there are not, then log out of the remote host and do this (this assumes you're still the nagios user on your Nagios box):

ssh /usr/local/nagios/.ssh/id_ecdsa <remote host> /usr/local/nagios/libexec/check_dummy 0

This will tell you if you're able to execute remote commands properly on the remote host.

I'll wait for further response, because I still believe you have a basic SSH connection issue of some sort.

@Testlab1231
Author

ssh -i /usr/local/nagios/.ssh/id_ecdsa
I added -i to this command, along with my remote host name. I can SSH into the remote machine without being asked for a password.

ssh -i /usr/local/nagios/.ssh/id_ecdsa /usr/local/nagios/libexec/check_dummy 0
The second command (again with my remote host name added) returns "OK".
Please let me know if there is anything else I can try.

@everwatch

Whoops, yes, I forgot the -i part. :-)

So I'm wrong - you do have a working SSH connection, which is great! So then (as the nagios user):

ssh -i /usr/local/nagios/.ssh/id_ecdsa <remote host> "/usr/local/nagios/libexec/check_disk -w 10% -c 5% -u MB -p /boot -p /boot/efi"

If that works, then we have something really weird going on.

@Testlab1231
Author

No worries, it happens.
Yes, it returns:
DISK OK - free space: /boot 559 MB (61.70% inode=100%); /boot/efi 990 MB (99.29% inode=-);| /boot=347MB;875;924;0;973 /boot/efi=7MB;898;948;0;998

@everwatch

So everything works when you pretend to be nagios, but when Nagios itself does the check, you get a timeout? Are any of your check_by_ssh checks working or are they all returning 255?

@Testlab1231
Author

None of them are working. All check_by_ssh checks are returning a status of 255.

Checks that do not use check_by_ssh, such as the port 22 and port 53 checks and the DNS service check, are working correctly.

@dougnazar
Contributor

You might want to try without -E which hides stderr messages, and although -o User=nagios should work, the actual parameter for username is -l nagios.
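
As a rough sketch of that suggestion, the helper below prints the check from the original report with -E dropped and -l nagios in place of -o User=nagios, so it can be reviewed and then pasted into a shell as the nagios user. The function name `build_debug_check` and the host argument are hypothetical; the paths, key file, thresholds, and timeout are copied from earlier in this thread.

```shell
#!/bin/sh
# build_debug_check HOST
# Hypothetical helper: print a check_by_ssh command line for HOST that
# keeps stderr visible (no -E) and sets the login name with -l.
build_debug_check() {
    host=$1
    plugin_dir=/usr/local/nagios/libexec
    remote_cmd="$plugin_dir/check_disk -w 10% -c 5% -u MB -p /boot -p /boot/efi"
    # '%s\n' keeps the % signs in the thresholds from being read as a format.
    printf '%s\n' "$plugin_dir/check_by_ssh -H $host -l nagios -i /usr/local/nagios/.ssh/id_ecdsa -t 60 -C \"$remote_cmd\""
}
```

Running `build_debug_check <remote host>` prints the command to try; any SSH error text should then appear in the plugin output instead of being skipped.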

Have you verified that the connections are being made to the sshd server? Also check ~nagios/.ssh/authorized_keys for any restrictions, and use command= or ~nagios/.ssh/rc to test or log a working connection.

For example, put this as /tmp/nagios.sh, make executable and add command="/tmp/nagios.sh" to the correct key in ~nagios/.ssh/authorized_keys:

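#!/bin/sh
# Forced-command wrapper: log each connection, then run the requested command.

# Timestamp of this connection
echo >> /tmp/nagios.log
date >> /tmp/nagios.log

# Environment the SSH session runs with
echo >> /tmp/nagios.log
env >> /tmp/nagios.log

# Run the command the client asked for; log both stdout and stderr
echo >> /tmp/nagios.log
$SSH_ORIGINAL_COMMAND >> /tmp/nagios.log 2>&1

# Always report OK so the connection itself can be confirmed
echo OK: Test
exit 0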

Hopefully something here will give some more info to go on.
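
To illustrate the authorized_keys step, here is a minimal sketch that prints a copy of the file with the command="..." option prefixed to one key line. The function name `prefix_forced_command` and its arguments are hypothetical. It writes only to stdout, so the result can be reviewed before it replaces the real file.

```shell
#!/bin/sh
# prefix_forced_command FILE MARKER SCRIPT
# Hypothetical helper: print FILE with command="SCRIPT" added in front of
# every key line that contains MARKER (for example the key comment) and
# does not already start with a command= option. FILE is not modified.
prefix_forced_command() {
    file=$1
    marker=$2
    script=$3
    awk -v m="$marker" -v s="$script" '
        index($0, m) && $0 !~ /^command=/ { print "command=\"" s "\" " $0; next }
        { print }
    ' "$file"
}
```

For example, `prefix_forced_command ~nagios/.ssh/authorized_keys 'nagios@monitor' /tmp/nagios.sh` would print the modified file, assuming the key comment is `nagios@monitor`. After a test check, /tmp/nagios.log on the remote host should show whether the connection arrived and what environment it ran with.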
