Conditions on Azure Cache for Redis or the virtual machines hosting it can at times trigger issues such as memory pressure and high CPU usage.
As part of our Server Management Services, we regularly help our customers fix similar Redis-related errors.
Let us today discuss the possible methods to troubleshoot Azure Cache for Redis server-side issues.
How to troubleshoot Azure Cache for Redis server-side issues?
Some conditions on Azure Cache for Redis or the virtual machines hosting it can trigger server-side issues. These include:
- Memory pressure on Redis server
- High CPU usage or server load
- Long-running commands
- Server-side bandwidth limitation
Let us now look at the steps to troubleshoot these issues in detail.
Memory pressure on Redis server
As Redis is an in-memory data structure store, memory pressure on the server side directly affects the performance of the server.
When memory pressure hits, the system may page data to disk. This causes the system to slow down significantly. Possible causes of this memory pressure include:
- The cache is filled with data near its maximum capacity.
- Redis is seeing high memory fragmentation. Storing large objects often causes this fragmentation, since Redis is optimized for small objects.
The INFO command in Redis returns information and statistics about the server. It also includes the values “used_memory” and “used_memory_rss”, which provide an overview of the memory usage.
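As a rough illustration of reading those two values, the sketch below parses text in the "key:value" format that INFO prints and compares resident memory with used memory. The function names and the sample numbers are made up for the example; a ratio well above 1 is commonly read as a sign of fragmentation.

```python
def parse_info(text):
    """Parse the text returned by the Redis INFO command into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def memory_summary(info_text):
    """Return used memory, resident memory, and the ratio between them."""
    fields = parse_info(info_text)
    used = int(fields["used_memory"])
    rss = int(fields["used_memory_rss"])
    return {
        "used_memory": used,
        "used_memory_rss": rss,
        # Resident memory divided by memory Redis believes it is using.
        "rss_to_used_ratio": rss / used if used else float("nan"),
    }


# Sample input with made-up numbers, in the format INFO prints.
sample = "# Memory\r\nused_memory:1000000\r\nused_memory_rss:1500000\r\n"
summary = memory_summary(sample)
print(summary["rss_to_used_ratio"])  # 1.5
```

In practice the text would come from running INFO (or INFO memory) against the cache rather than from a hard-coded string.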
Let us now discuss some possible changes we can make to help keep the memory usage normal:
- Configure a memory policy and set expiration times on your keys. This policy may not be sufficient if you have fragmentation. The detailed steps to configure it are available here.
- Configure a maxmemory-reserved value that is large enough to compensate for memory fragmentation.
- Break up your large cached objects into smaller related objects.
- Create alerts on metrics like used memory to be notified early about potential impacts. Azure Monitor allows you to configure an alert to send an email notification, call a webhook, or invoke an Azure Logic App. To configure alert rules for the cache, click Alert rules in the Resource menu.
- Scale to a larger cache size with more memory capacity.
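To make the suggestion about breaking up large objects more concrete, here is a minimal sketch of one possible approach: storing each top-level field of a large object under its own key, so that a reader can fetch only the part it needs. The key naming scheme, the helper names, and the sample object are all hypothetical, and the resulting entries would still need to be written to the cache by whichever client library is in use.

```python
import json


def split_by_field(base_key, obj):
    """Turn one large object into several smaller entries, one per top-level field."""
    return {f"{base_key}:{field}": json.dumps(value) for field, value in obj.items()}


def load_field(entries, base_key, field):
    """Read back a single field without touching the others."""
    return json.loads(entries[f"{base_key}:{field}"])


# Hypothetical object used only for illustration.
profile = {"name": "Ada", "orders": [101, 102, 103], "settings": {"theme": "dark"}}
entries = split_by_field("user:42", profile)

print(sorted(entries))  # ['user:42:name', 'user:42:orders', 'user:42:settings']
print(load_field(entries, "user:42", "orders"))  # [101, 102, 103]
```

Other layouts, such as a Redis hash per object, may suit some workloads better; the point is only that each stored value stays small.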
High CPU usage or server load
A high server load or CPU usage indicates that the server cannot process requests in a timely fashion. The server may be slow to respond and unable to keep up with request rates.
There are several changes we can make to mitigate high server load:
- Investigate what is causing CPU spikes, such as long-running commands or page faulting caused by high memory pressure.
- Create alerts on metrics like CPU or server load to be notified early about potential impacts.
- Scale to a larger cache size with more CPU capacity.
Long-running commands
Redis command processing is single-threaded. That is, a command that takes time to run will block all others that come after it. Some Redis commands are more resource intensive than others.
We should review the commands that we issue to the Redis server to understand their performance impacts. For instance, the KEYS command is often used without knowing that it’s an O(N) operation. We can avoid KEYS by using SCAN to reduce CPU spikes.
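A minimal sketch of the SCAN approach follows. It assumes a client object exposing a scan(cursor, match, count) method that returns a (next_cursor, keys) pair, which is how the redis-py client behaves; FakeClient is a hypothetical stand-in so the example runs without a server.

```python
import fnmatch


def scan_keys(client, pattern, batch_size=100):
    """Yield keys matching pattern using repeated SCAN calls instead of one KEYS call."""
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=batch_size)
        yield from keys
        # The server signals the end of the iteration with a cursor of 0.
        if cursor == 0:
            break


class FakeClient:
    """Hypothetical in-memory stand-in for a real Redis client."""

    def __init__(self, keys):
        self._keys = list(keys)

    def scan(self, cursor=0, match=None, count=10):
        batch = self._keys[cursor:cursor + count]
        next_cursor = cursor + count
        if next_cursor >= len(self._keys):
            next_cursor = 0
        matched = [k for k in batch if fnmatch.fnmatchcase(k, match or "*")]
        return next_cursor, matched


fake = FakeClient(["session:1", "user:1", "session:2", "user:2", "session:3"])
found = list(scan_keys(fake, "session:*", batch_size=2))
print(found)  # ['session:1', 'session:2', 'session:3']
```

SCAN still visits the whole keyspace, but each call does a bounded amount of work, so other commands can run between calls. A real server may return the same key more than once during an iteration, so callers that need unique keys should deduplicate.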
Further, using the SLOWLOG command, we can measure expensive commands being executed against the server.
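As an illustration of reading slow log output, the sketch below ranks entries by duration. It assumes each entry is a dictionary with "command" and "duration" fields, with the duration in microseconds, which is the shape the redis-py client is understood to return for SLOWLOG GET; the sample entries are invented.

```python
def slowest_commands(entries, top=3):
    """Return (command, duration_ms) pairs for the slowest entries, slowest first."""
    ranked = sorted(entries, key=lambda e: e["duration"], reverse=True)
    result = []
    for entry in ranked[:top]:
        command = entry["command"]
        if isinstance(command, bytes):
            command = command.decode("utf-8", errors="replace")
        # SLOWLOG reports durations in microseconds; convert to milliseconds.
        result.append((command, entry["duration"] / 1000))
    return result


# Invented sample entries, for illustration only.
sample_entries = [
    {"id": 1, "start_time": 1700000000, "duration": 120000, "command": b"KEYS *"},
    {"id": 2, "start_time": 1700000005, "duration": 250000, "command": b"SMEMBERS big:set"},
    {"id": 3, "start_time": 1700000009, "duration": 15000, "command": b"GET user:1"},
]
top_two = slowest_commands(sample_entries, top=2)
print(top_two)  # [('SMEMBERS big:set', 250.0), ('KEYS *', 120.0)]
```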
Server-side bandwidth limitation
Different cache sizes have different network bandwidth capacities. If the server exceeds the available bandwidth, then data won’t be sent to the client as quickly. Client requests could time out because the server cannot push data to the client fast enough.
The “Cache Read” and “Cache Write” metrics show the server-side bandwidth usage. We can view these metrics in the Azure portal.
To mitigate situations where network bandwidth usage is close to maximum capacity, use the options below:
- Change client call behavior to reduce network demand.
- Create alerts on metrics like cache read or cache write to be notified early about potential impacts.
- Scale to a larger cache size with more network bandwidth capacity.
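To judge how close a cache is to its limit, the observed metric can be compared with the capacity of the cache size in use. The sketch below assumes the metric value is available in bytes per second and the capacity is quoted in megabits per second; both units, the 80% threshold, and the sample figures are assumptions to verify against the actual metric and tier.

```python
def bandwidth_utilization(bytes_per_second, capacity_megabits_per_second):
    """Return observed bandwidth as a fraction of the capacity."""
    if capacity_megabits_per_second <= 0:
        raise ValueError("capacity must be positive")
    # Convert bytes per second to megabits per second before comparing.
    megabits_per_second = bytes_per_second * 8 / 1_000_000
    return megabits_per_second / capacity_megabits_per_second


def is_near_limit(bytes_per_second, capacity_megabits_per_second, threshold=0.8):
    """True when utilization reaches the (hypothetical) warning threshold."""
    return bandwidth_utilization(bytes_per_second, capacity_megabits_per_second) >= threshold


# Made-up figures: 50 MB/s observed against a 500 Mbit/s capacity.
utilization = bandwidth_utilization(50_000_000, 500)
print(round(utilization, 2))  # 0.8
```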
Conclusion
In short, conditions on Azure Cache for Redis or the virtual machines hosting it can at times trigger server-side issues such as memory pressure and high CPU usage. Today, we saw how our Support Engineers mitigate these errors.