Connection timeout after 10 seconds
Solved
Hello!
I am reaching out for your expertise on a problem with the monitoring of one of my machines. My server is responding with: CHECK_Socket timeout after 10 seconds. To say the least, it's not great. Note that this started when I restarted the server, which is a VM under Hyper-V.
Below are my logs and my Nagios configuration on my client:
The famous NRPE.conf:
root# cat /etc/nagios/nrpe.cfg
log_facility=@log_facility@
pid_file=/var/run/nrpe.pid
server_port=5666
nrpe_user=@nrpe_user@
nrpe_group=@nrpe_group@
allowed_hosts=127.0.0.1,IP-Nagios
dont_blame_nrpe=1
debug=0
command_timeout=60
connection_timeout=300
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w 80 -c 90 /dev/
command[check_procs]=/usr/lib/nagios/plugins/check_procs
command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10
command[check_load]=/usr/lib/nagios/plugins/check_load -w 80 -c 90
command[check_swap]=/usr/lib/nagios/plugins/check_swap -w 90% -c 80%
command[check_mem]=/usr/lib/nagios/plugins/check_memuse.sh -w 80 -c 95
To provide you with my Linux version:
root# uname -a
Linux g*.*.* 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
To show you the permissions applied on the Nagios config files:
root# ls -lr /etc/nagios/
total 100
-rw-r--r-- 1 nagios nagios 420 Aug 3 09:00 nrpe_local.cfg
drwxr-xr-x 2 nagios nagios 4096 Jan 13 2014 nrpe.d
-rw-r--r-- 1 nagios nagios 87268 Aug 3 10:39 nrpe.cfg.save
-rw-r--r-- 1 nagios nagios 660 Aug 8 13:54 nrpe.cfg
And now my biggest question. It is clearly network-related, but honestly I have never seen a syslog entry like this:
root# tail -f /var/log/syslog
Aug 8 14:10:54 g***-** kernel: [3914.834891] [UFW BLOCK] IN=eth0 OUT= MAC=******************************** SRC=IP_Nagios DST=IP_ServeurLocal LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=29974 DF PROTO=TCP SPT=43704 DPT=5666 WINDOW=5840 RED=0x00 SYN URGP=0
The daemon:
root# ps -aux | grep nrpe
****** 34511 0.0 0.0 23332 1104 ? Ss Aug 3 0:10 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
Then the port:
root# netstat -an | grep 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN
tcp6 0 0 :::5666 :::* LISTEN
And there you go :) I’m stuck on this syslog without really understanding it..
I hope I haven’t discouraged you.
Thank you in advance!
3 replies
Hello,
Explanation of your problem
/etc/nagios/nrpe.cfg :
allowed_hosts=127.0.0.1,IP-Nagios
I don't know where you are connecting from, but the 127.0.0.1 entry suggests that the client is expected to be on the same machine as the server; otherwise only IP-Nagios is allowed to connect.
/var/log/syslog
Aug 8 14:10:54 g***-** kernel: [3914.834891] [UFW BLOCK] IN=eth0 OUT= MAC=******************************** SRC=IP_Nagios DST=IP_ServeurLocal LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=29974 DF PROTO=TCP SPT=43704 DPT=5666 WINDOW=5840 RED=0x00 SYN URGP=0
This line is written by your firewall (ufw, see https://doc.ubuntu-fr.org/ufw) and indicates that a packet arriving via eth0, sent by IP_Nagios and destined for IP_ServeurLocal, was blocked (source port 43704, destination port 5666, etc.).
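To read such a line more easily, here is a minimal sketch that pulls individual KEY=value fields out of a UFW log entry. The helper name ufw_field and the sample addresses (taken from the documentation range 192.0.2.0/24) are made up for the illustration; the real line above has its addresses masked.

```shell
# Print the value of a KEY=value field from a UFW kernel log line
# (prints nothing if the field is absent).
ufw_field() {
    # $1 = field name (e.g. SRC), $2 = full log line
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# Sample line with placeholder addresses (an assumption, not the real log).
line='Aug 8 14:10:54 host kernel: [3914.834891] [UFW BLOCK] IN=eth0 OUT= MAC= SRC=192.0.2.10 DST=192.0.2.20 LEN=60 PROTO=TCP SPT=43704 DPT=5666 SYN URGP=0'

echo "blocked: $(ufw_field SRC "$line") -> $(ufw_field DST "$line") port $(ufw_field DPT "$line")/$(ufw_field PROTO "$line")"
# prints: blocked: 192.0.2.10 -> 192.0.2.20 port 5666/TCP
```

The fields that matter here are SRC (who sent the packet), DPT (the port it was aimed at) and the SYN flag, which marks it as the very first packet of a new TCP connection.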
If this line corresponds to a connection made by your client, then:
1) the problem does not come from the origin of the client (see what I said about nrpe.cfg)
2) the firewall is blocking the packet sent by your client. Since the packet was blocked by the firewall before it reached the server, the server does not respond because it thinks it has nothing to do. As a result, the client patiently waits 10 seconds for a response from the server, and in despair, gives up with the error message CHECK_Socket timeout after 10 seconds.
In conclusion, the problem comes from your firewall, for which you need to add a rule that allows this kind of packet to pass through.
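The difference between a silently dropped packet and a closed port can be seen from the client side. The sketch below is only an illustration, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available; probe_port is a made-up helper name, and the address in the example call should be replaced by the monitored host.

```shell
# Classify a TCP port as seen from the client:
#   open     - the connection was accepted
#   filtered - no answer before the timeout (typical of a firewall dropping the SYN)
#   refused  - an immediate failure (nothing listening, or another connection error)
probe_port() {
    # $1 = host, $2 = port, $3 = timeout in seconds (default 10)
    timeout "${3:-10}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
    case $? in
        0)   echo open ;;
        124) echo filtered ;;   # 124 is the exit status timeout uses when the time limit is hit
        *)   echo refused ;;
    esac
}

# Example call; 127.0.0.1 is a placeholder for the monitored host.
probe_port 127.0.0.1 5666 2
```

Run from the Nagios server against the monitored machine, a "filtered" result would be consistent with the situation described above: the daemon is listening, but the firewall never lets the SYN reach it.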
To resolve your problem
If the client is on the same machine as the server, it would be wiser for it to contact it via 127.0.0.1, as it will then not be subject to the same rules in your firewall. It's better than opening the firewall and potentially allowing anyone to access the Nagios server.
If the client and the server are not on the same machine, it would be good to only allow the client's IP to make such an access (assuming this IP is fixed or within a range of fixed IPs) to limit the risks of intrusions.
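As a sketch of that second case, a ufw rule can be limited to one source address. The value of NAGIOS_IP below is a placeholder from the documentation range, not the real address, and nrpe_allow_rule is a made-up helper that only prints the command so it can be reviewed before being run as root.

```shell
# Placeholder address (an assumption); substitute the real IP of the Nagios server.
NAGIOS_IP="192.0.2.10"

# Build the ufw command as a string instead of running it directly.
nrpe_allow_rule() {
    # $1 = source IP allowed to reach NRPE on TCP port 5666
    printf 'ufw allow from %s to any port 5666 proto tcp\n' "$1"
}

nrpe_allow_rule "$NAGIOS_IP"
# prints: ufw allow from 192.0.2.10 to any port 5666 proto tcp
```

Once the printed command has been run as root, ufw status should list the new rule. If a broader rule for the same port already exists, it would have to be removed for the restriction to have any effect.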
Good luck.
Thank you for your feedback :-)
Even though I don't understand the presence of ufw on this machine...
Oh, nothing too complicated, it's just that under Ubuntu, a number of packages/tools are pre-installed because they are considered useful for the average user by the Ubuntu maintainers, and ufw is one of them. Under Debian, only iptables would be installed, for example.
Best wishes!
Hello,
First of all, thank you for this response! It definitely deserves some UPs for its accuracy.
The Nagios server is remote; it is indeed a separate server. It is true that listing the localhost IP in allowed_hosts may be unnecessary; it just lets me test commands locally. I have therefore removed it, leaving only the IP of the Nagios server.
The resolution of the problem:
root# ufw allow 5666/tcp
This allows incoming TCP connections on port 5666 (the port used by NRPE).
And everything works!
Thanks again!
We fixed the problem, and I learned something.
Even though I don't understand the presence of ufw on this machine...