The error Logs Not Searchable or Not Coming In occurs in Nagios Log Server when we run a query in a dashboard and the expected logs do not show up.
As part of our Server Management Services, we assist our customers with several Nagios errors.
Today, let’s walk through the troubleshooting steps suggested by our experienced Server Admins.
Logs Not Searchable or Not Coming In Nagios Log Server
As we mentioned earlier, when we run a query in a dashboard, logs may not show up when they should. In this article, we will use a scenario of a remote server sending syslogs to help provide a clear troubleshooting path.
- Log Server
Name: nls-c7x-x64
IP: 10.25.5.86
Listening Port: TCP 5544
- Remote Server Sending Logs
Name: centos12
IP: 10.25.13.30
Sending Port: TCP 5544
OS: CentOS 6.7 x64
Furthermore, let us focus on the methods our Support Techs employ in order to fix Logs Not Searchable or Not Coming In Nagios Log Server.
Remote Server – Check Rsyslog Config
This server has already been set up to send logs to nls-c7x-x64 using the setup steps in the Log Server GUI.
To confirm this has been done, we check that the file /etc/rsyslog.d/99-nagioslogserver.conf exists and contains:
### Begin forwarding rule for Nagios Log Server NAGIOSLOGSERVER
$WorkDirectory /var/lib/rsyslog # Where spool files will live NAGIOSLOGSERVER
$ActionQueueFileName nlsFwdRule0 # Unique name prefix for spool files NAGIOSLOGSERVER
$ActionQueueMaxDiskSpace 1g # 1GB space limit (use as much as possible) NAGIOSLOGSERVER
$ActionQueueSaveOnShutdown on # Save messages to disk on shutdown NAGIOSLOGSERVER
$ActionQueueType LinkedList # Use asynchronous processing NAGIOSLOGSERVER
$ActionResumeRetryCount -1 # Infinite retries if host is down NAGIOSLOGSERVER
# Remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional NAGIOSLOGSERVER
*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER
### End of Nagios Log Server forwarding rule NAGIOSLOGSERVER
It is important to note here the following line:
*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER
It is assumed that the server centos12 can resolve the address nls-c7x-x64; otherwise, it will not be able to send its logs.
To confirm this, we execute the following command on centos12:
ping nls-c7x-x64 -c 1
We expect an output similar to this if it can successfully resolve nls-c7x-x64:
PING nls-c7x-x64.box293.local (10.25.5.86) 56(84) bytes of data.
64 bytes from nls-c7x-x64.box293.local (10.25.5.86): icmp_seq=1 ttl=64 time=0.273 ms
--- nls-c7x-x64.box293.local ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 2ms
rtt min/avg/max/mdev = 0.273/0.273/0.273/0.000 ms
On the other hand, we can expect an output similar to this if it cannot resolve nls-c7x-x64:
ping: unknown host nls-c7x-x64
Going back to that config line:
*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER
The @@ indicates that the logs are sent over TCP, and the 5544 after the colon is the destination port.
If it were UDP, there would be only one @.
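To make the @ versus @@ distinction concrete, here is a minimal sketch of a shell function that reads a forwarding rule and reports the protocol, host and port. The name describe_forward_rule is our own and not part of rsyslog, and the fallback to port 514 when no port is given is an assumption based on the standard syslog port.

```shell
#!/bin/sh
# Hypothetical helper (not part of rsyslog): describe a forwarding rule such as
#   *.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER
# Prints "<protocol> <host> <port>".
describe_forward_rule() {
    # The second whitespace-separated field is the target, e.g. @@host:port
    target=$(printf '%s\n' "$1" | awk '{print $2}')
    case "$target" in
        @@*) proto=tcp; hostport=${target#@@} ;;   # two @ signs: TCP
        @*)  proto=udp; hostport=${target#@} ;;    # one @ sign: UDP
        *)   echo "not a forwarding rule" >&2; return 1 ;;
    esac
    case "$hostport" in
        *:*) host=${hostport%%:*}; port=${hostport##*:} ;;
        *)   host=$hostport; port=514 ;;           # assumed default port
    esac
    echo "$proto $host $port"
}

describe_forward_rule '*.* @@nls-c7x-x64:5544 # NAGIOSLOGSERVER'
```

For the rule used in this article it prints tcp nls-c7x-x64 5544. The sketch does not handle bracketed IPv6 addresses.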
Remote Server – Check Rsyslog Is Running
Assuming the config is correct, we may want to make sure that rsyslogd is running:
service rsyslog status
We can expect an output like this if it is running:
rsyslogd (pid 2098) is running…
Also, we can expect an output similar to this if it is not running:
rsyslogd is stopped
Subsequently, if it is not running, we should start it:
service rsyslog start
Remote Server – Check Firewall Rules
We want to make sure that the iptables firewall allows outbound traffic. By default, there are no restrictions on outbound traffic.
To confirm this, we execute the following command:
iptables --list
We expect an output similar to this:
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED
ACCEPT icmp -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT)
target prot opt source destination
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Specifically, this last output is what we need to look at:
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
The first line has (ACCEPT) which means there is no restriction at the top level (it would say DROP if there was).
The second line is simply headings for all the outbound rules that have been defined. Because there is no third line, there are NO outbound rules defined so the default here is to ACCEPT all outbound traffic (allow it).
If we had a restricted environment where the outbound policy was DROP, we would need to add an outbound firewall rule for TCP port 5544 to nls-c7x-x64 (10.25.5.86):
iptables -I OUTPUT -p tcp --destination-port 5544 -d 10.25.5.86 -j ACCEPT
service iptables save
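Reading the policy off the chain heading can also be scripted. The following is a minimal sketch, assuming the listing format shown above; chain_policy is a hypothetical helper name of our own.

```shell
#!/bin/sh
# Hypothetical helper: print the default policy of a chain, reading
# "iptables --list" output from stdin.
chain_policy() {
    awk -v chain="$1" '$1 == "Chain" && $2 == chain {
        gsub(/[()]/, "", $4)   # strip the parentheses around the policy
        print $4
    }'
}

# Demonstration on a shortened copy of the listing from this article
chain_policy OUTPUT <<'EOF'
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
EOF
```

The demonstration prints ACCEPT. On the remote server itself, we would pipe the live listing into the function as root: iptables --list | chain_policy OUTPUT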
Remote Server – Watch Outbound Traffic
To confirm that the log traffic is leaving the remote server, we can run tcpdump to watch the traffic.
First, we must install tcpdump:
yum -y install tcpdump
Wait while tcpdump is installed.
Now we execute the following command to watch the traffic:
tcpdump src host 10.25.13.30 and tcp dst port 5544 and dst host 10.25.5.86
We will receive this message first:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
An example of traffic flow is as follows:
16:43:49.001130 IP centos12.box293.local.60907 > nls-c7x-x64.box293.local.5544: Flags [P.], seq 2751017526:2751017581, ack 431734015, win 115, options [nop,nop,TS val 93111400 ecr 92667575], length 55
If we do not see any traffic, it may just be that nothing is being logged and hence there is nothing to send. We can easily add a test entry to rsyslog, which will generate traffic:
- Open an additional SSH session to the remote server
- Execute the following command:
logger TroubleshootingTest
In our other SSH session, we should now see a line of traffic that confirms that rsyslog is sending the logs on to nls-c7x-x64.
Press Ctrl+C to stop the tcpdump.
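The capture and the test message can also be combined in a single session. This is a minimal sketch under a few assumptions: the function names build_filter and watch_one_packet are hypothetical, the 30 second limit and 2 second start-up delay are arbitrary, and it must be run as root on a host where tcpdump, timeout and logger are available.

```shell
#!/bin/sh
# Hypothetical helper: build the capture filter used in this article.
# $1 = sender IP, $2 = log server IP, $3 = TCP port
build_filter() {
    echo "src host $1 and tcp dst port $3 and dst host $2"
}

# Hypothetical helper: send a test message and wait for one matching packet.
watch_one_packet() {
    filter=$(build_filter "$1" "$2" "$3")
    # -nn skips name and port lookups; -c 1 exits after the first packet
    timeout 30 tcpdump -nn -c 1 "$filter" &
    capture_pid=$!
    sleep 2                      # give tcpdump time to start listening
    logger TroubleshootingTest   # generate a syslog entry to forward
    wait "$capture_pid"          # non-zero if nothing was captured in time
}
```

With the addresses from this scenario, the call would be: watch_one_packet 10.25.13.30 10.25.5.86 5544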
Log Server – Watch Inbound Traffic
To confirm that the log traffic is entering the log server, we can run tcpdump. This is similar to the previous steps except it confirms that the traffic has made it through any routers or firewalls between the remote server and the log server.
First, we must install tcpdump with this command:
RHEL|CentOS
yum install -y tcpdump
Debian|Ubuntu
apt-get install -y tcpdump
Now we execute the following command to watch the traffic:
tcpdump src host 10.25.13.30 and tcp dst port 5544 and dst host 10.25.5.86
We will receive this message first:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
An example of traffic flow is as follows:
16:52:42.509481 IP centos12.box293.local.60907 > nls-c7x-x64.box293.local.5544: Flags [P.], seq 2751017651:2751017706, ack 431734015, win 115, options [nop,nop,TS val 93644443 ecr 92674681], length 55
If we do not see any traffic, it may just be that nothing is being logged and hence there is nothing to send. We can easily add a test entry to rsyslog, which will generate traffic:
- Open an additional SSH session to the remote server
- Execute the following command:
logger TroubleshootingTest
In our log server SSH session, we should now see a line of traffic that confirms that the traffic is hitting the log server.
Press Ctrl+C to stop the tcpdump.
If we do not see any traffic, then there may be a firewall or router blocking the traffic.
Log Server – Check Firewall Rules
We want to make sure that the iptables firewall allows inbound traffic. By default, there are restrictions on inbound traffic; however, Nagios Log Server creates the firewall rules to allow the traffic.
- RHEL 6|CentOS 6
There are separate firewall daemons for IPv4 and IPv6 and hence our Support Techs suggest separate commands.
First, check the status of the firewall:
IPv4
service iptables status
IPv6
service ip6tables status
If the firewall is running, it should produce output like:
Table: filter
Chain INPUT (policy ACCEPT)
num target prot opt source destination
1 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
2 ACCEPT icmp -- 0.0.0.0/0 0.0.0.0/0
3 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
4 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:22
5 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:2057
6 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:2056
7 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:5544
8 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:3515
9 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpts:9300:9400
10 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:443
11 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:80
12 ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 state NEW udp dpt:5544
13 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT)
num target prot opt source destination
1 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
Specifically, these lines tell us that the firewall rules exist and are allowing inbound UDP and TCP traffic on port 5544:
7 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:5544
12 ACCEPT udp -- 0.0.0.0/0 0.0.0.0/0 state NEW udp dpt:5544
If these firewall rules do not exist, we add them by executing the following commands:
IPv4
iptables -I INPUT -p udp --dport 5544 -j ACCEPT
iptables -I INPUT -p tcp --dport 5544 -j ACCEPT
service iptables save
IPv6
ip6tables -I INPUT -p udp --dport 5544 -j ACCEPT
ip6tables -I INPUT -p tcp --dport 5544 -j ACCEPT
service ip6tables save
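Instead of reading the rule list by eye, the check for port 5544 can be scripted. The following is a minimal sketch, assuming the numbered format printed by service iptables status; input_allows is a hypothetical helper name of our own.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the INPUT chain (read from stdin, in the
# numbered "service iptables status" format) has an ACCEPT rule for
# protocol $1 and destination port $2.
input_allows() {
    awk -v proto="$1" -v want="dpt:$2" '
        $1 == "Chain" { in_input = ($2 == "INPUT") }
        in_input && $2 == "ACCEPT" && $3 == proto {
            # exact match, so dpt:5544 does not match dpt:554 or dpt:55440
            for (i = 4; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit !found }
    '
}
```

A possible use on the log server is: service iptables status | input_allows tcp 5544 && echo "TCP 5544 allowed". The sketch only confirms that a rule exists; it does not check whether an earlier REJECT rule would match first.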
If the firewall is not running, it will produce this output:
iptables: Firewall is not running.
If the firewall is not running, this means that inbound traffic is allowed.
To enable the firewall on boot and to start it, we execute the following commands:
IPv4
chkconfig iptables on
service iptables start
IPv6
chkconfig ip6tables on
service ip6tables start
- RHEL 7|CentOS 7
First, check the status of the firewall:
systemctl status firewalld.service
If the firewall is running, it should produce output like:
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2018-12-13 11:16:59 AEDT; 39min ago
Docs: man:firewalld(1)
Main PID: 670 (firewalld)
CGroup: /system.slice/firewalld.service
└─670 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid
Similarly, if the firewall is not running, it will produce this output:
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Thu 2018-12-13 11:57:15 AEDT; 1s ago
Docs: man:firewalld(1)
Process: 670 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
Main PID: 670 (code=exited, status=0/SUCCESS)
If the firewall is not running, this means that inbound traffic is allowed.
To enable the firewall on boot and to start it, we execute the following commands:
systemctl enable firewalld.service
systemctl start firewalld.service
To list the firewall rules, we execute this command:
firewall-cmd --list-all
Which should produce output like:
public (active)
target: default
icmp-block-inversion: no
interfaces: ens32
sources:
services: dhcpv6-client ssh
ports: 80/tcp 443/tcp 9300-9400/tcp 3515/tcp 5544/tcp 2056/tcp 2057/tcp 5544/udp
protocols:
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
Specifically, the ports line tells us that the firewall rule exists and is allowing inbound UDP and TCP traffic on port 5544:
ports: 80/tcp 443/tcp 9300-9400/tcp 3515/tcp 5544/tcp 2056/tcp 2057/tcp 5544/udp
If these firewall rules do not exist, they can be added by executing the following commands (the --permanent option keeps the rules across the reload and across reboots):
firewall-cmd --zone=public --add-port=5544/udp --permanent
firewall-cmd --zone=public --add-port=5544/tcp --permanent
firewall-cmd --reload
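The ports line can also be checked from a script. Below is a minimal sketch, assuming the firewall-cmd --list-all layout shown above; firewalld_port_open is a hypothetical helper name of our own.

```shell
#!/bin/sh
# Hypothetical helper: succeed if PORT/PROTO (for example 5544/tcp) appears
# on the "ports:" line of "firewall-cmd --list-all" output read from stdin.
firewalld_port_open() {
    awk -v want="$1" '
        # only the "ports:" line counts, not source-ports or forward-ports
        $1 == "ports:" { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit !found }
    '
}
```

A possible use is: firewall-cmd --list-all | firewalld_port_open 5544/tcp && echo "TCP 5544 open". The sketch compares entries literally, so a port covered only by a range such as 9300-9400/tcp would not be detected. firewall-cmd also offers a --query-port option that answers the same question directly.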
- Debian
Debian has the iptables firewall installed but not enabled by default. The firewall rules are maintained by the netfilter-persistent service.
We can determine if it is installed with the following command:
systemctl status netfilter-persistent.service
If we receive this output then there is no firewall service active on our Debian machine:
Unit netfilter-persistent.service could not be found.
This means all inbound traffic is allowed, so the log server will receive the logs.
If we receive this output then the firewall service is active on our Debian machine:
● netfilter-persistent.service - netfilter persistent configuration
Loaded: loaded (/lib/systemd/system/netfilter-persistent.service; enabled)
Active: active (exited) since Tue 2018-11-27 14:24:11 AEDT; 1min 26s ago
Main PID: 17749 (code=exited, status=0/SUCCESS)
If the netfilter-persistent service is enabled, we can now check the status of the firewall:
iptables --list
An open firewall configuration would produce output like:
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
We can see no rules exist.
If a rule did exist, it would appear under the chain headings. For example, a rule allowing inbound UDP traffic on port 162 (SNMP traps) looks like this; a rule for port 5544 would show dpt:5544 in the last column instead:
target prot opt source destination
ACCEPT udp -- anywhere anywhere udp dpt:snmp-trap
If these firewall rules do not exist, they can be added by executing the following commands:
iptables -I INPUT -p udp --destination-port 5544 -j ACCEPT
iptables -I INPUT -p tcp --destination-port 5544 -j ACCEPT
- Ubuntu
Ubuntu uses the Uncomplicated Firewall (ufw) to manage firewall rules; however, it is not enabled on a default install.
We can check it with the following command:
ufw status
If the firewall is not running, it will produce this output:
Status: inactive
Meanwhile, if the firewall is running, it should produce output like:
Status: active
If the firewall is not running, this means that inbound traffic is allowed (the log server will receive the logs).
To enable the firewall on boot and to start it, we execute the following command:
ufw enable
Be careful when executing this command, as we may not be able to access the server when it next reboots. Its default configuration is to deny all incoming connections, so we will need to add rules for all the different ports used to connect to this server.
To list the firewall rules we execute this command:
ufw status verbose
Which should produce output like:
Status: active
Logging: on (low)
Default: deny (incoming), allow (outgoing), disabled (routed)
New profiles: skip
To Action From
-- ------ ----
5544/udp ALLOW IN Anywhere
5544/tcp ALLOW IN Anywhere
5544/udp (v6) ALLOW IN Anywhere (v6)
5544/tcp (v6) ALLOW IN Anywhere (v6)
We can see from the output that firewall rules exist allowing inbound UDP and TCP traffic on port 5544.
If these firewall rules do not exist, they can be added by executing the following commands:
ufw allow proto udp from any to any port 5544
ufw allow proto tcp from any to any port 5544
ufw reload
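The same kind of scripted check works for ufw. This is a minimal sketch, assuming the ufw status layout shown above; ufw_allows is a hypothetical helper name of our own, and it only looks at the IPv4 entries because the (v6) entries carry an extra column.

```shell
#!/bin/sh
# Hypothetical helper: succeed if "ufw status" output on stdin contains an
# IPv4 ALLOW rule whose first column is exactly PORT/PROTO (e.g. 5544/tcp).
ufw_allows() {
    awk -v want="$1" '
        $1 == want && $2 == "ALLOW" { found = 1 }
        END { exit !found }
    '
}
```

A possible use is: ufw status | ufw_allows 5544/tcp && echo "TCP 5544 allowed"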
Log Server – Check Logstash Is Running
Assuming the config is correct, we may want to make sure that logstash is running:
RHEL 6|CentOS 6|Ubuntu 14
service logstash status
RHEL 7|CentOS 7|Debian|Ubuntu 16/18
systemctl status logstash.service
We can expect an output similar to this if it is running:
Logstash Daemon (pid 1171) is running…
We can expect an output similar to this if it is not running:
Logstash Daemon is stopped
If it is not running, we should start it:
RHEL 6|CentOS 6|Ubuntu 14
service logstash start
RHEL 7|CentOS 7|Debian|Ubuntu 16/18
systemctl start logstash.service
Log Server – Check Log Server Is Listening
We want to make sure that the server is listening to port 5544. To check, we execute the following command:
netstat -nal | grep 5544
We can expect an output similar to:
tcp 0 0 :::5544 :::* LISTEN
tcp 0 0 ::1:56104 ::1:5544 ESTABLISHED
tcp 0 0 ::1:5544 ::1:56104 ESTABLISHED
udp 0 0 :::5544 :::*
If it was not listening, then there would be no output to that command or the TCP ports would not appear.
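A plain grep for 5544 also matches established connections and any other number containing those digits. As a minimal sketch, assuming the netstat -nal column layout shown above, the hypothetical helper tcp_listening below succeeds only when a TCP socket is in the LISTEN state on exactly that port.

```shell
#!/bin/sh
# Hypothetical helper: succeed if "netstat -nal" output on stdin shows a TCP
# socket in LISTEN state whose local port is exactly PORT.
tcp_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            n = split($4, parts, ":")    # local address, e.g. :::5544
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}
```

A possible use on the log server is: netstat -nal | tcp_listening 5544 && echo "listening on TCP 5544"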
Log Server – Search Log Server Dashboard
To confirm the logs are being received, we can search for the logs in the dashboard.
Initially, we log into the Log Server and click the Dashboards menu.
In the default dashboard we can search for the test logs we generated.
In the Query field type:
TroubleshootingTest
Press Enter and the results should appear in the “Events Over Time” and “All Events” panels.
Log Server – Check Logstash Log
If we are still not seeing anything in the default dashboard, we can check the logstash log file. Usually, nothing is logged here unless something goes wrong.
To check, we execute the following command:
tail -f /var/log/logstash/logstash.log
Log Server – Logs Appear A Few Hours Later
Sometimes the logs do not appear in the default dashboard until a few hours after they were sent. This can happen when the date, time or timezone is not set correctly on all the Nagios Log Server nodes.
To ensure that the cluster timezone settings are correct, we follow the steps given below:
- Log into Nagios Log Server
- In the top menu bar click Admin
- Under General click Global Settings
- Here we can define the Cluster Timezone
- If it is not correct, select the timezone and click Save Settings
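Before changing the setting, it can help to compare the clocks of two nodes from the command line. The following is a minimal sketch and not a Nagios Log Server feature: compare_clocks is a hypothetical helper, it expects two strings as produced by date '+%s %z' (epoch seconds and UTC offset), and the default tolerance of 60 seconds is an arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical helper: compare two "epoch offset" strings, each produced by
#   date '+%s %z'
# Optional third argument: allowed clock skew in seconds (default 60).
compare_clocks() {
    epoch_a=${1% *}; zone_a=${1#* }
    epoch_b=${2% *}; zone_b=${2#* }
    skew=$(( epoch_a - epoch_b ))
    if [ "$skew" -lt 0 ]; then skew=$(( 0 - skew )); fi
    if [ "$zone_a" != "$zone_b" ]; then
        echo "timezone mismatch: $zone_a vs $zone_b"
    elif [ "$skew" -gt "${3:-60}" ]; then
        echo "clock skew: ${skew}s"
    else
        echo "ok"
    fi
}
```

We would run date '+%s %z' on each node at roughly the same moment and pass the two results to the function, for example: compare_clocks '1544660219 +1100' '1544660219 +0000', which reports a timezone mismatch.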
Log Server – Disable Filters
An incorrect filter can prevent logs from being processed by Log Server. A useful troubleshooting step is to disable any extra filters we have added and see if the logs start appearing.
- Log into Log Server and click Configure
- Under Global (All Instances) click Global Config
- On the right side of the screen is the Filters section
- The default filter included in Nagios Log Server is Apache
- Disable any other filters we have added by clicking the Active icon (it will turn into Inactive)
- Click the Save & Apply button at the top
Once we have disabled the filters, we go to the Dashboards and see if logs start appearing.
We then need to re-enable the filters one by one, clicking Save & Apply each time, until we identify the filter that is causing the issue. Once we know which filter it is, we can investigate why it is failing.
[Error continues to prevail? We’d be happy to assist!]
Conclusion
To conclude, the Logs Not Searchable or Not Coming In error in Nagios Log Server occurs when we run a query in a dashboard and the expected logs do not show up. Today, we saw effective methods our Support Engineers employ in order to fix this error.