“502 bad gateway” in Nginx – Top 5 reasons for it, and how to resolve it
Maintaining a server is hard.
You have to deal with all the upgrades, security patches and the occasional server errors (aka errors from hell).
One such common error in Nginx servers is “502 Bad Gateway”.
The error message is cryptic.
So, many webmasters roll up their sleeves and look at the error log:
2017/04/04 08:34:43 [error] 949#949: *7 connect() failed (111: Connection refused) while connecting to upstream, client: XXX.XXX.XXX.XXX, server: myserver.com, request: "GET /myurl-this/ HTTP/1.0", subrequest: "/redis-fetch", upstream: "redis://127.0.0.1:6379", host: "refserver.com", referrer: "http://referalsite.com/myurl-this/"
Yeah, more gibberish.
You know something is messed up, because it says “failed” and “refused”.
But WHAT? You hardly have time to get a PhD in computer science.
If you’re like most website owners, this is the point when you contact your hosting provider and wait in line for hours to get a resolution.
But, do you have hours to wait?
Bobcares Vs 502 Bad Gateway error and others
You don’t have the luxury of time when your site is down. You need it back online ASAP.
You need an experienced expert to go through the various possibilities (service down, firewall issues, DDoS, etc.), and figure out a solution quickly.
And that’s exactly what our Dedicated Server Administrators do day in and day out.
Our server admins form the tech support team of several business websites (like eCommerce stores, online publishers, etc.), and provide 24/7 monitoring & support.
If a customer site goes down, we log in to the server, zoom in on the root cause, restore the site, and implement preventive fixes – all before the server users come to know about it.
Today we’ll take a look at the various reasons we’ve seen for the 502 Bad Gateway error, and how we’ve fixed them.
1. Backend service failed
Nginx depends on backend services like PHP-FPM, database services and cache servers to run web applications.
So, if any of these services crash or freeze, Nginx won’t get any data from them, resulting in a “502 Bad Gateway” error.
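The error log line shown earlier already names the backend that refused the connection. As a rough illustration, here is a minimal Python sketch that pulls the error number, the reason and the upstream address out of such a line. The function name `parse_upstream_error` and the regular expression are assumptions made for this example; they only cover the "while connecting to upstream" message format seen in that log line.

```python
import re
from typing import Optional

# Matches "(111: Connection refused) while connecting to upstream"
# followed later on the same line by the upstream: "..." field.
UPSTREAM_ERROR = re.compile(
    r'\((?P<errno>\d+): (?P<reason>[^)]+)\) while connecting to upstream'
    r'.*?upstream: "(?P<upstream>[^"]+)"'
)


def parse_upstream_error(line: str) -> Optional[dict]:
    """Return errno, reason and upstream for a failed upstream connect, else None."""
    match = UPSTREAM_ERROR.search(line)
    if match is None:
        return None
    return {
        "errno": int(match.group("errno")),
        "reason": match.group("reason"),
        "upstream": match.group("upstream"),
    }


# Shortened copy of the log line from the top of this article.
sample = (
    '2017/04/04 08:34:43 [error] 949#949: *7 connect() failed '
    '(111: Connection refused) while connecting to upstream, '
    'client: XXX.XXX.XXX.XXX, server: myserver.com, '
    'request: "GET /myurl-this/ HTTP/1.0", subrequest: "/redis-fetch", '
    'upstream: "redis://127.0.0.1:6379", host: "refserver.com"'
)
print(parse_upstream_error(sample))
# {'errno': 111, 'reason': 'Connection refused', 'upstream': 'redis://127.0.0.1:6379'}
```

In this case the upstream field points at the Redis cache on port 6379, so that is the first service to check.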
We’ve seen all of these kinds of services fail, for reasons ranging from traffic spikes and resource limits to disk errors and DDoS attacks.
If you suspect a backend service is unresponsive or failed, you can try killing all unresponsive processes and restarting the service.
For instance, here’s one way we kill defunct PHP-FPM processes and restart services.
# kill -9 $(pgrep php-fpm)
# /etc/init.d/php-fpm restart
* Restarting PHP FastCGI Process Manager php-fpm [ OK ]
Warning: Do not use these commands if you are not sure how they work.
If the service restart didn’t work, you may need to get someone to take a closer look at the server health.
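One simple way to tell whether a restart actually brought a backend back is to check if it accepts TCP connections again. The sketch below is only an illustration: the helper name `backend_accepts_connections` and the 2 second timeout are assumptions, and it applies only to backends that listen on a TCP port (a PHP-FPM pool configured with a Unix socket would need a different check).

```python
import socket


def backend_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:6379 is the Redis upstream from the error log shown earlier.
    print(backend_accepts_connections("127.0.0.1", 6379))
```

A result of False after a restart suggests the service is still down or is listening somewhere other than where Nginx expects it.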
Our Nginx experts are online 24/7. Click here if you need help resolving your server error.
2. High server load
The second most common reason for “502 bad gateway” in Nginx is a high load average on backend servers.
Load spikes cause services to not respond.
We’ve seen these reasons for load spikes:
- Sudden spike in website traffic (can be seasonal or marketing / promotional).
- Malware infection on the server.
- Comment spamming or other vulnerability exploits.
- Brute force attacks designed to exploit web apps.
- Application bugs that cause memory leaks or resource hogging.
To troubleshoot a high load issue, first we figure out which resource is being abused (I/O, memory, CPU or network).
Then we find out which service is abusing that resource, and from that point, find out which user in that service owns the abusive script or software.
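The "which user owns the heavy processes" step can be sketched by adding up CPU and memory per user. This Python example is a minimal illustration, assuming the text comes from `ps -eo user,pcpu,pmem,comm`; the function name `usage_by_user` and the sample figures are made up for the example.

```python
from collections import defaultdict


def usage_by_user(ps_output: str) -> list:
    """Sum %CPU and %MEM per user from `ps -eo user,pcpu,pmem,comm` output.

    Returns (user, cpu, mem) tuples, heaviest CPU consumer first.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue
        try:
            cpu, mem = float(fields[1]), float(fields[2])
        except ValueError:
            continue  # header line such as "USER %CPU %MEM COMMAND"
        totals[fields[0]][0] += cpu
        totals[fields[0]][1] += mem
    ranked = sorted(totals.items(), key=lambda item: item[1][0], reverse=True)
    return [(user, round(cpu, 1), round(mem, 1)) for user, (cpu, mem) in ranked]


# Hypothetical sample output, not taken from a real server.
sample_ps = """\
USER     %CPU %MEM COMMAND
www-data 45.0  3.1 php-fpm
www-data 40.5  3.0 php-fpm
mysql    12.0 20.4 mysqld
root      0.1  0.2 nginx
"""
print(usage_by_user(sample_ps))
# [('www-data', 85.5, 6.1), ('mysql', 12.0, 20.4), ('root', 0.1, 0.2)]
```

This only covers CPU and memory. I/O and network abuse need other tools, so treat it as a starting point rather than a full triage.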
If your server is currently under high load, and you need urgent help, click here to contact our Emergency Server Support techs. We are online 24/7 and can help you in a few minutes.
3. Incorrect service configuration
Your Nginx server and the backend services rely on many sub-systems to work properly.
These include DNS resolution, Apache processes, PHP services, DB server, etc.
If even one of these services has a wrong config entry, that service will fail to respond, and Nginx will show a “502 bad gateway” error.
Some configuration issues that we’ve seen are:
- DNS resolver misconfigured in Nginx causing domain lookups to fail.
- DB login details set incorrectly after a recent migration, restore or upgrade.
- Syntax error in Apache firewall settings (mod_security) causing Apache to crash.
- Incorrect memory or file limits set for PHP applications.
- Capacity limits (like the number of connections per IP) set too restrictively causing legit visits to fail.
- ...and more
There is no easy way to find a configuration error.
You really need to scan the error log and pay attention to what the error says.
For example, the error below says the PHP application reached the maximum number of processes allowed (defined by the pm.max_children setting).
WARNING: [mysite.com] server reached max_children setting (30), consider raising it
ERROR: unable to read what child say: Bad file descriptor (9)
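To see how often a pool hits this limit, the log can be scanned for the warning. The following Python sketch is an illustration only: the function name `max_children_hits` is hypothetical, and the pattern is written to accept both the message form shown above and a variant with a "pool" prefix and "pm." in the setting name, which is an assumption about how other PHP-FPM versions word it.

```python
import re
from collections import Counter

# Captures the pool name in [brackets] and the limit in (parentheses).
MAX_CHILDREN = re.compile(
    r'\[(?:pool )?(?P<pool>[^\]]+)\] server reached (?:pm\.)?max_children '
    r'setting \((?P<limit>\d+)\)'
)


def max_children_hits(log_text: str) -> Counter:
    """Count max_children warnings per (pool, limit) pair."""
    hits = Counter()
    for line in log_text.splitlines():
        match = MAX_CHILDREN.search(line)
        if match:
            hits[(match.group("pool"), int(match.group("limit")))] += 1
    return hits


sample_log = """\
WARNING: [mysite.com] server reached max_children setting (30), consider raising it
ERROR: unable to read what child say: Bad file descriptor (9)
"""
print(max_children_hits(sample_log))
# Counter({('mysite.com', 30): 1})
```

A pool that shows up many times is a candidate for a higher limit, provided the server has the memory to run the extra processes.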
If you are not familiar with PHP or web server settings, it is best to ask a server administrator.
If you need help fixing a similar error, click here to talk to our Nginx admins. We are online 24/7 and can attend to your ticket within a few minutes.
How Bobcares prevents configuration errors
As a quick aside, here’s how we prevent server errors related to config issues.
Configuration errors are generally caused by stale server settings that have not been adjusted for new traffic or site upgrades.
That is why our Dedicated Server Admins audit customer servers at least once a month.
During this audit, we detect possible performance bottlenecks, security loopholes and hardware issues.
This helps us to proactively resolve potential issues, rather than reacting to a downtime once an error has happened.
4. Service port blocked in firewall
Firewalls are the bedrock of server security. But if not set up right, these firewalls can cause legitimate requests to be blocked or services to fail.
For instance, in Linux servers that run the Plesk automation suite, Nginx runs on port 80, and Apache runs on port 7080.
But firewalls by default block uncommon ports such as 7080, which leaves Nginx unable to connect to Apache.
Result? 502 Bad Gateway error.
Such issues often happen when a new service is enabled (e.g. caching server, Ruby, etc.) in the backend, or during a migration, or after a server upgrade.
To fix it, we look at what port each service runs on using a command like this:
# netstat -lpn
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 19785/nginx
tcp6 0 0 :::80 :::* LISTEN 19785/nginx
If we find any service running on a non-standard port, we either reconfigure that service to use a standard port, or edit the firewall config to allow the non-standard port.
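On a busy server the `netstat -lpn` listing can be long, so it helps to reduce it to a list of ports and programs. The Python sketch below is a minimal illustration. The function names, the choice of 80 and 443 as "standard" ports, and the Apache row in the sample (based on the Plesk port mentioned above) are all assumptions made for this example.

```python
def listening_ports(netstat_output: str) -> list:
    """Extract sorted, de-duplicated (port, program) pairs from `netstat -lpn` TCP rows."""
    found = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP listener rows: proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp") or fields[5] != "LISTEN":
            continue
        port = int(fields[3].rsplit(":", 1)[1])
        program = " ".join(fields[6:]).split("/", 1)[-1]
        found.add((port, program))
    return sorted(found)


def non_standard(ports: list, standard: tuple = (80, 443)) -> list:
    """Return the listeners whose port is not in the assumed standard set."""
    return [(port, program) for port, program in ports if port not in standard]


# First two rows are from the output above; the apache2 row is hypothetical.
sample_netstat = """\
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 19785/nginx
tcp6 0 0 :::80 :::* LISTEN 19785/nginx
tcp 0 0 127.0.0.1:7080 0.0.0.0:* LISTEN 20411/apache2
"""
ports = listening_ports(sample_netstat)
print(ports)                # [(80, 'nginx'), (7080, 'apache2')]
print(non_standard(ports))  # [(7080, 'apache2')]
```

Each port flagged this way is one to compare against the firewall rules.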
5. Web application bugs
A rarer cause of the “502 Bad Gateway” error is an application code error.
If your web server logs show a scary-looking error like this, it is possible that your application code is incompatible with the server version.
[notice] child pid 27831 exit signal Segmentation fault (11)
You’ll need to inspect the software requirements of your application, and re-configure the services to match the required versions.
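Before changing versions, it can help to know whether such crashes are a one-off or keep recurring. As a rough sketch, the Python snippet below counts crash signals in a log, assuming lines in the "child pid ... exit signal ..." format shown above; the function name `crash_signals` is hypothetical.

```python
import re
from collections import Counter

# Captures the signal name and number from "exit signal Segmentation fault (11)".
CHILD_EXIT = re.compile(
    r'child pid (?P<pid>\d+) exit signal (?P<signal>[^(]+?) \((?P<num>\d+)\)'
)


def crash_signals(log_text: str) -> Counter:
    """Count child-process crashes per (signal name, signal number)."""
    crashes = Counter()
    for line in log_text.splitlines():
        match = CHILD_EXIT.search(line)
        if match:
            crashes[(match.group("signal"), int(match.group("num")))] += 1
    return crashes


sample_crash_log = "[notice] child pid 27831 exit signal Segmentation fault (11)"
print(crash_signals(sample_crash_log))
# Counter({('Segmentation fault', 11): 1})
```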
If you’re facing this issue right now, our Nginx experts can help you in a few minutes. Click here to open a support request. We are online 24/7.
502 Bad Gateway in Nginx commonly occurs when Nginx runs as a reverse proxy, and is unable to connect to backend services. This can be due to service crashes, network errors, configuration issues, and more. Today we’ve seen the top 5 causes for this error, and how to fix it.