Connection timeout after 10 seconds

Solved
Ben' (Member, 1081 posts)
Hello!

I'm asking for your help with a monitoring problem on one of my machines. My server is responding with: CHECK_Socket timeout after 10 seconds. To say the least, it's not great. Note that this started when I restarted the server, which is a VM under Hyper-V.

Below are my logs and my Nagios configuration on my client:

The famous nrpe.cfg:
root# cat /etc/nagios/nrpe.cfg
log_facility=@log_facility@
pid_file=/var/run/nrpe.pid
server_port=5666
nrpe_user=@nrpe_user@
nrpe_group=@nrpe_group@
allowed_hosts=127.0.0.1,IP-Nagios
dont_blame_nrpe=1
debug=0
command_timeout=60
connection_timeout=300
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w 80 -c 90 /dev/
command[check_procs]=/usr/lib/nagios/plugins/check_procs
command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10
command[check_load]=/usr/lib/nagios/plugins/check_load -w 80 -c 90
command[check_swap]=/usr/lib/nagios/plugins/check_swap -w 90% -c 80%
command[check_mem]=/usr/lib/nagios/plugins/check_memuse.sh -w 80 -c 95


To provide you with my Linux version:
root# uname -a
Linux g*.*.* 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux 


To show you the permissions applied on the Nagios config files:
root# ls -lr /etc/nagios/
total 100
-rw-r--r-- 1 nagios nagios 420 Aug 3 09:00 nrpe_local.cfg
drwxr-xr-x 2 nagios nagios 4096 Jan 13 2014 nrpe.d
-rw-r--r-- 1 nagios nagios 87268 Aug 3 10:39 nrpe.cfg.save
-rw-r--r-- 1 nagios nagios 660 Aug 8 13:54 nrpe.cfg


And now my biggest question. This is clearly network-related, but honestly I have never seen a syslog entry like this one:
root# tail -f /var/log/syslog
Aug 8 14:10:54 g***-** kernel: [3914.834891] [UFW BLOCK] IN=eth0 OUT= MAC=******************************** SRC=IP_Nagios DST=IP_ServeurLocal LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=29974 DF PROTO=TCP SPT=43704 DPT=5666 WINDOW=5840 RES=0x00 SYN URGP=0


The daemon:
root# ps -aux | grep nrpe
****** 34511 0.0 0.0 23332 1104 ? Ss Aug 3 0:10 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d


Then the port:
root# netstat -an | grep 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN
tcp6 0 0 :::5666 :::* LISTEN


And there you go :) I'm stuck on this syslog entry and don't really understand it.
I hope I haven’t discouraged you.

Thank you in advance!

3 replies

mamiemando (Moderator, 33541 posts)
 
Hello,

Explanation of your problem

/etc/nagios/nrpe.cfg:

allowed_hosts=127.0.0.1,IP-Nagios


I don't know where you are connecting from, but this directive means the NRPE daemon only accepts connections from the machine itself (127.0.0.1) or from IP-Nagios. Any other source address would be refused by NRPE.
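To double-check which source addresses the NRPE daemon will accept, the allowed_hosts line can be read straight from the file. This is only a sketch: allowed_hosts_of and is_allowed are hypothetical helper names, and it assumes a single allowed_hosts line with plain comma-separated addresses and no spaces (it does no hostname or network-range matching).

```shell
# allowed_hosts_of FILE: print each allowed_hosts entry on its own line.
allowed_hosts_of() {
    sed -n 's/^allowed_hosts=//p' "$1" | tr ',' '\n'
}

# is_allowed IP FILE: succeed only if IP is listed verbatim in allowed_hosts.
is_allowed() {
    allowed_hosts_of "$2" | grep -Fxq "$1"
}
```

For example, `is_allowed 127.0.0.1 /etc/nagios/nrpe.cfg && echo accepted` would print "accepted" with the configuration shown above.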

/var/log/syslog

Aug 8 14:10:54 g***-** kernel: [3914.834891] [UFW BLOCK] IN=eth0 OUT= MAC=******************************** SRC=IP_Nagios DST=IP_ServeurLocal LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=29974 DF PROTO=TCP SPT=43704 DPT=5666 WINDOW=5840 RES=0x00 SYN URGP=0


This line is written by your firewall (ufw). It indicates that a packet arriving on eth0, sent by IP_Nagios and destined for IP_ServeurLocal, was blocked (source port 43704, destination port 5666, etc.).
https://doc.ubuntu-fr.org/ufw
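If it helps to read such entries, the NAME=value fields can be pulled out with a few lines of shell. This is a rough sketch, not a full parser: ufw_field is a hypothetical helper name, and the sample line is an abridged copy of the entry above with a placeholder hostname.

```shell
# Abridged copy of the blocked-packet entry (MAC and a few fields omitted).
line='Aug 8 14:10:54 host kernel: [3914.834891] [UFW BLOCK] IN=eth0 OUT= SRC=IP_Nagios DST=IP_ServeurLocal LEN=60 PROTO=TCP SPT=43704 DPT=5666 WINDOW=5840 SYN URGP=0'

# ufw_field NAME LINE: print the value of the " NAME=value" token in LINE.
# Prints nothing if the field is absent.
ufw_field() {
    printf '%s\n' "$2" | sed -n "s/.* $1=\([^ ]*\).*/\1/p"
}

# Show the fields that matter here: interface, addresses, protocol, ports.
for name in IN SRC DST PROTO SPT DPT; do
    echo "$name=$(ufw_field "$name" "$line")"
done
```

The DPT field is the interesting one: 5666 is the server_port set in nrpe.cfg, so the blocked packet is the connection attempt to NRPE.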

If this line corresponds to a connection made by your client, then:

1) the problem is not the client's source address (see what I said about nrpe.cfg)

2) the firewall is blocking the packet sent by your client. Since the packet is blocked by the firewall before it reaches the NRPE daemon, the daemon never sees the request and therefore never replies. As a result, the client patiently waits 10 seconds for a response from the server and then gives up with the error message CHECK_Socket timeout after 10 seconds.
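The difference between a silently blocked packet and a port with no listener can be observed from the client side. Here is a minimal sketch, assuming bash and the coreutils timeout command are available on the machine running the test; probe_port is a hypothetical helper name, not part of Nagios or NRPE.

```shell
# probe_port HOST PORT [SECONDS]: classify a TCP connection attempt.
probe_port() {
    host=$1 port=$2 secs=${3:-10}
    # bash's /dev/tcp pseudo-device opens a TCP connection; timeout aborts
    # the attempt after $secs seconds and then exits with status 124.
    timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
    case $? in
        0)   echo "open" ;;      # something accepted the connection
        124) echo "timeout" ;;   # no answer at all, e.g. packet dropped
        *)   echo "failed" ;;    # immediate error, e.g. connection refused
    esac
}
```

Run from the Nagios server as `probe_port IP_ServeurLocal 5666 10`, a "timeout" result would match the symptom described here, whereas "failed" would rather suggest that nothing is listening on the port.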

In conclusion, the problem comes from your firewall, for which you need to add a rule that allows this kind of packet to pass through.

To resolve your problem

If the client is on the same machine as the server, it would be wiser for it to contact the server via 127.0.0.1, as it will then not be subject to the same rules in your firewall. That is better than opening the firewall and potentially allowing anyone to access the server.

If the client and the server are not on the same machine, it would be good to allow only the client's IP to connect (assuming this IP is fixed or within a range of fixed IPs) to limit the risk of intrusion.
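As an illustration of such a restricted rule, ufw accepts a source address in its "allow from ... to any port ... proto ..." form. The sketch below only prints the command so it can be reviewed first; allow_nrpe_from is a hypothetical helper name and 192.0.2.10 is a placeholder documentation address standing in for the real IP of the Nagios server.

```shell
# Placeholder address: replace with the real IP of the Nagios server.
NAGIOS_IP="192.0.2.10"

# allow_nrpe_from IP: print the ufw command that opens TCP port 5666
# (the NRPE port from nrpe.cfg) to that single source address only.
allow_nrpe_from() {
    printf 'ufw allow from %s to any port 5666 proto tcp\n' "$1"
}

allow_nrpe_from "$NAGIOS_IP"
```

The printed command has to be run as root. Compared with opening the port to every source, only the monitoring server can then reach NRPE.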

Good luck.

mamiemando (Moderator, 33541 posts)
 
Thank you for your feedback :-)

"Even though I don't understand the presence of ufw on this machine..."

Oh, nothing too complicated: on Ubuntu, a number of packages/tools are pre-installed because the Ubuntu maintainers consider them useful for the average user, and ufw is one of them. On Debian, for example, only iptables would be installed.

Best wishes!

Ben' (Member, 1081 posts)
 
"deemed useful for the common folk by the maintainers of ubuntu"
I love it!

mamiemando (Moderator, 33541 posts)
 
Well :-) One has to admit that the syntax of ufw is a bit more intuitive than that of iptables ;-)

Ben' (Member, 1081 posts)
 
Hello,

First of all, thank you for this response! It definitely deserves some UPs for its accuracy.

The Nagios server is remote; it is indeed a separate server. It is true that listing the localhost IP in allowed_hosts may be unnecessary; it just lets me test commands locally (generally). I have therefore removed it, leaving only the IP of the Nagios server.

The fix:

root# ufw allow 5666/tcp

This allows TCP traffic through port 5666 (the port used by NRPE).

And everything works!

Thanks again!
We fixed the problem, and I learned something.
Even though I don't understand the presence of ufw on this machine...