Stuck with the error, NDOUtils: Message Queue Exceeded? We can help you.
NDOUtils uses the operating system's kernel message queue. As the number of messages increases, the kernel settings need to be tuned to allow more messages to queue and process.
As part of our Server Management Services, we assist our customers with several NDOUtils queries.
Today, let us see how to resolve this error.
What are the common Symptoms?
In Nagios, we may experience the following symptoms:
- Missing hosts or services or status data
- Long time to restart the Nagios process
- Unusually high CPU load
- A flood of ndo2db messages in /var/log/messages, such as:
ndo2db: Error: max retries exceeded sending message to queue. Kernel queue parameters may need to be tuned. See README.
ndo2db: Warning: queue send error, retrying…
In addition, we may see multiple queues for the Nagios user while executing:
ipcs -q
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0xee070002 1409024    nagios     600        100672512    98313
0x50070002 1441793    nagios     600        0            0
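To make the listing above easier to read at a glance, the queue count, bytes and message totals for one owner can be added up with awk. This is a minimal sketch: `summarize_queues` is a hypothetical helper name, not part of NDOUtils, and it assumes the column order shown above (owner in column 3, used-bytes in column 5, messages in column 6).

```shell
#!/bin/sh
# Hypothetical helper: summarise `ipcs -q` output for one owner.
# Reads `ipcs -q` output on stdin and prints:
#   <number of queues> <total used-bytes> <total messages>
summarize_queues() {
    owner="${1:-nagios}"
    awk -v owner="$owner" '
        $3 == owner { queues++; bytes += $5; msgs += $6 }
        END { printf "%d %d %d\n", queues, bytes, msgs }
    '
}

# Usage: ipcs -q | summarize_queues nagios
```

For the sample listing above, this prints `2 100672512 98313`: two queues for the nagios user, one of them holding a large backlog.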
How to resolve NDOUtils: Message Queue Exceeded?
First, we check the values currently set in /etc/sysctl.conf:
grep 'kernel.msgmnb' /etc/sysctl.conf
grep 'kernel.msgmax' /etc/sysctl.conf
grep 'kernel.msgmni' /etc/sysctl.conf
kernel.msgmnb = 131072000
kernel.msgmax = 131072000
kernel.msgmni = 256000
If a setting is not yet defined, its grep command returns no output. In that case, we need to add the setting to the /etc/sysctl.conf file.
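The file only shows what is configured to load at boot, so it can help to compare it with what the running kernel currently uses. The sketch below does both. `conf_value` and `show_msg_limits` are hypothetical helper names, and the script assumes a Linux system where the running values are readable under /proc/sys.

```shell
#!/bin/sh
# Hypothetical helper: print the value set for a sysctl key in a file,
# or "unset" if the key is absent (commented-out lines do not count).
conf_value() {
    key="$1"; file="${2:-/etc/sysctl.conf}"
    [ -r "$file" ] || { echo "unset"; return; }
    awk -F= -v key="$key" '
        { k = $1; gsub(/[ \t]/, "", k) }
        k == key { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); found = 1 }
        END { print (found ? v : "unset") }
    ' "$file"
}

# Show the configured and the running value for the three queue settings.
show_msg_limits() {
    for key in kernel.msgmnb kernel.msgmax kernel.msgmni; do
        # /proc/sys mirrors the running kernel (dots become slashes)
        live=$(cat "/proc/sys/$(echo "$key" | tr . /)" 2>/dev/null || echo "n/a")
        printf '%s: file=%s running=%s\n' "$key" "$(conf_value "$key")" "$live"
    done
}

# Usage: show_msg_limits
```

A key reported as `file=unset` is one that has to be added to the file rather than updated.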
msgmnb and msgmax must be set to the same value. We recommend either 131072000 or 262144000.
For msgmni, we recommend 512000.
Values higher than these are unlikely to help unless we have a high-performance server.
For msgmnb and msgmax, the following commands update the existing entries in /etc/sysctl.conf with the increased values:
sed -i 's/^kernel\.msgmnb.*/kernel\.msgmnb = 262144000/g' /etc/sysctl.conf
sed -i 's/^kernel\.msgmax.*/kernel\.msgmax = 262144000/g' /etc/sysctl.conf
For msgmni, the command to use depends on the result of the grep command we executed previously:
- If it returned no output, this command adds the setting to the /etc/sysctl.conf file:
echo 'kernel.msgmni = 512000' >> /etc/sysctl.conf
- If it did return output, this command updates the existing setting in the /etc/sysctl.conf file:
sed -i 's/^kernel\.msgmni.*/kernel\.msgmni = 512000/g' /etc/sysctl.conf
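The update-or-append decision above can be folded into one function that is safe to run repeatedly. This is a sketch only: `set_sysctl` is a hypothetical name, and it assumes GNU sed (for `sed -i`), as the commands in this article already do. Back up the file before running it against /etc/sysctl.conf.

```shell
#!/bin/sh
# Hypothetical helper: set KEY = VALUE in a sysctl file.
# Rewrites the line if the key is already present, appends it otherwise.
set_sysctl() {
    key="$1"; value="$2"; file="${3:-/etc/sysctl.conf}"
    # Escape the dots so the key is matched literally in the regex
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}[[:space:]]*=" "$file"; then
        sed -i "s/^${pattern}[[:space:]]*=.*/${key} = ${value}/" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (as root):
#   set_sysctl kernel.msgmnb 262144000
#   set_sysctl kernel.msgmax 262144000
#   set_sysctl kernel.msgmni 512000
```

Because the function checks for the key first, running it twice leaves a single entry per key rather than appending duplicates.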
Once done, we execute the following command:
sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 262144000
kernel.msgmax = 262144000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.msgmni = 512000
The output shows increased values have been applied to the kernel.
Then we need to restart services:
- RHEL 6|CentOS 6|Oracle Linux 6|Ubuntu 14
service nagios stop
service ndo2db restart
service nagios start
- RHEL 7|CentOS 7|Oracle Linux 7|Debian|Ubuntu 16/18
systemctl stop nagios.service
systemctl restart ndo2db.service
systemctl start nagios.service
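The same stop, restart, start sequence can be wrapped so that it picks whichever service manager is present. The following is a rough sketch under stated assumptions: `restart_nagios_stack` and the `DRY_RUN` variable are hypothetical names, and the presence of a `systemctl` binary is taken to mean the system is managed by systemd, which may not hold everywhere.

```shell
#!/bin/sh
# Hypothetical helper: stop Nagios, restart ndo2db, then start Nagios again.
# Set DRY_RUN=1 to print the commands instead of running them.
restart_nagios_stack() {
    if command -v systemctl >/dev/null 2>&1; then
        set -- "systemctl stop nagios.service" \
               "systemctl restart ndo2db.service" \
               "systemctl start nagios.service"
    else
        set -- "service nagios stop" \
               "service ndo2db restart" \
               "service nagios start"
    fi
    for cmd in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            # Stop at the first failing step rather than continuing
            $cmd || return 1
        fi
    done
}

# Usage: DRY_RUN=1 restart_nagios_stack   (preview)
#        restart_nagios_stack             (run as root)
```

Running it with `DRY_RUN=1` first shows which set of commands would be used on the current host.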
Finally, we should check the message queues:
ipcs -q
If we see more than one queue for the nagios user, we execute the following to clear them:
for i in $(ipcs -q | grep nagios | awk '{print $2}'); do ipcrm -q "$i"; done
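The one-liner above matches "nagios" anywhere on a line, so it would also catch a queue owned by a user whose name merely contains that string. A slightly stricter variant, sketched below, compares the owner column exactly. `queue_ids_for` and `clear_queues_for` are hypothetical names, and the column positions are assumed to match the `ipcs -q` listing shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: print the msqid of every queue whose owner
# column equals OWNER exactly. Reads `ipcs -q` output on stdin.
queue_ids_for() {
    awk -v owner="${1:-nagios}" '$3 == owner { print $2 }'
}

# Remove those queues one by one with ipcrm -q <msqid>.
clear_queues_for() {
    ipcs -q | queue_ids_for "$1" | while read -r id; do
        ipcrm -q "$id"
    done
}

# Usage: ipcs -q | queue_ids_for nagios   (list only)
#        clear_queues_for nagios          (remove, as root)
```

Listing the IDs first with `queue_ids_for` gives a chance to review what will be removed before calling `clear_queues_for`.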
We watch the queues for 10-15 minutes to confirm that the messages are being processed:
watch ipcs -q
To stop watching the queues, we hit Ctrl + C.
If we find the message queue does not process quickly, the problem may relate to MySQL/MariaDB.
Ensure that the DB server has enough CPU and memory resources.
In addition, if the database runs on the same host as Nagios, we should consider offloading it to a dedicated server.
Conclusion
In short, the NDOUtils: Message Queue Exceeded error occurs when the number of messages outgrows the kernel message queue settings. Today, we saw how to tune those settings and clear the queues.