Cloud servers like EC2 make web solutions easy and cheap. However, connectivity problems can often make an EC2 instance unreachable.
And it can be really frustrating when you are making a last-minute website edit.
That’s why we often get requests from our AWS customers to fix EC2 access problems as part of our Cloud Management Services.
Today, we’ll see the various reasons that can make an EC2 instance unreachable and how our Cloud Experts fix them.
What does the EC2 unreachable error look like?
First, let’s check what the EC2 unreachable error looks like. An easy way is to ping the public IP address or hostname of the EC2 instance.
When the instance is not reachable, it shows up as:
:~$ ping ec2-xx-172-249-xxx.compute-1.amazonaws.com
PING ec2-xx-172-249-xxx.compute-1.amazonaws.com (xx-172-249-xxx) 56(84) bytes of data.
64 bytes from ec2-xx-172-249-xxx.compute-1.amazonaws.com (xx-172-249-xxx): Destination host unreachable
64 bytes from ec2-xx-172-249-xxx.compute-1.amazonaws.com (xx-172-249-xxx): Destination host unreachable
64 bytes from ec2-xx-172-249-xxx.compute-1.amazonaws.com (xx-172-249-xxx): Destination host unreachable
64 bytes from ec2-xx-172-249-xxx.compute-1.amazonaws.com (xx-172-249-xxx): Destination host unreachable
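A failed ping is not conclusive on its own, because security groups drop ICMP unless a rule allows it. As a complementary check, here is a minimal sketch that probes the service port (22 for SSH, 3389 for RDP) directly. The function name `port_open` is our own, and the sketch assumes bash (for the `/dev/tcp` pseudo-device) and the coreutils `timeout` command.

```shell
# Probe a TCP port; returns 0 if a connection opens within the timeout.
# Assumes bash (/dev/tcp pseudo-device) and the coreutils `timeout` command.
port_open() {
    local host="$1" port="$2" wait_secs="${3:-5}"
    # Inside the child shell, $0 is the host and $1 is the port.
    timeout "$wait_secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example with a placeholder hostname:
# port_open ec2-host.example.com 22 && echo "port 22 reachable" || echo "port 22 not reachable"
```

If the port answers while ping fails, the instance is up and only ICMP is being filtered.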
What makes an EC2 instance unreachable?
Now, it’s time to see the typical reasons for the EC2 unreachable error. This can happen on both Linux and Windows instances. We’ll now check each of the reasons in detail.
1. Booting errors
A very common reason for an EC2 instance to become unreachable is a boot error on the server. In many cases, fatal corruption in the system files, registry corruption, etc. leaves the instance in a stuck state.
On Linux instances, a faulty boot script can make the entire system non-functional. As a result, an attempt to connect to the EC2 instance ends in a Not reachable error.
2. Network configuration error
Similarly, network configuration errors account for a major share of EC2 connectivity errors.
When someone makes a networking change, the instance can drop off the network and become inaccessible. Likewise, a common network setting error with EC2 happens when someone sets a static IP address on the instance. As per AWS policy, Amazon EC2 ignores a static IP address set this way. The right way is to configure a network interface and then attach it to the instance.
Again, when EC2 is accessed over a VPN, the VPN traffic should be properly allowed in the network. Otherwise, the instance will not be reachable either.
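To illustrate the network interface approach mentioned above, here is a rough sketch using the AWS CLI. It assumes the CLI is installed and configured. The function name `attach_fixed_ip`, all IDs and the device index 1 (a secondary interface) are placeholders of ours.

```shell
# Create a network interface with a fixed private IP and attach it to an instance.
# Assumes a configured AWS CLI; all IDs passed in are placeholders.
attach_fixed_ip() {
    local subnet_id="$1" private_ip="$2" instance_id="$3"
    local eni_id
    # Create the interface and capture only its ID.
    eni_id="$(aws ec2 create-network-interface \
        --subnet-id "$subnet_id" \
        --private-ip-address "$private_ip" \
        --query 'NetworkInterface.NetworkInterfaceId' \
        --output text)" || return 1
    # Attach it as a secondary interface (device index 1 is an assumption).
    aws ec2 attach-network-interface \
        --network-interface-id "$eni_id" \
        --instance-id "$instance_id" \
        --device-index 1
}

# Example with placeholder values:
# attach_fixed_ip subnet-0123456789abcdef0 10.0.0.10 i-0123456789abcdef0
```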
3. Firewall restrictions
Yet another reason that makes an EC2 instance unreachable is firewall restrictions.
A Windows instance needs port 3389 (RDP) open in the security group of the EC2 instance. Similarly, if a Linux instance uses a custom SSH port, that port should also be open in the firewall. Additionally, access control lists that restrict access by location also create problems with EC2 connections.
4. High CPU
Last but not least, high CPU usage on the EC2 instance can often make the server unreachable too. From our experience in managing EC2 instances, our Support Engineers see high CPU utilization in cases like Windows updates, security software scans, etc. Similarly, abusive or poorly managed scripts can also shoot up CPU usage and make the instance unreachable.
How to fix the EC2 instance not reachable error?
We just saw the different causes of the EC2 unreachable error. It’s now time to take a look at how our Cloud Engineers fix it.
1. Status of EC2 instance
When the EC2 instance is not reachable, we begin troubleshooting by checking the status of the instance from the Amazon EC2 dashboard. This quickly gives an idea of whether the server is running or not. If it is in a stopped state, starting it again easily makes it available.
Moreover, in this step, we check for any DNS errors affecting the public hostname or IP that would make the instance unreachable.
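The same status check can also be done from the command line. Below is a minimal sketch, assuming the AWS CLI is installed and configured. The function names `instance_status` and `start_if_stopped` are our own and the instance ID is a placeholder.

```shell
# Print the instance state plus the system and instance status checks on one line.
# Assumes a configured AWS CLI; the instance ID passed in is a placeholder.
instance_status() {
    aws ec2 describe-instance-status \
        --instance-ids "$1" \
        --include-all-instances \
        --query 'InstanceStatuses[0].[InstanceState.Name,SystemStatus.Status,InstanceStatus.Status]' \
        --output text
}

# Start the instance only if it is reported as stopped.
start_if_stopped() {
    local state
    state="$(instance_status "$1" | awk '{print $1}')"
    if [ "$state" = "stopped" ]; then
        aws ec2 start-instances --instance-ids "$1"
    else
        echo "instance $1 is in state: $state"
    fi
}

# Example with a placeholder ID:
# start_if_stopped i-0123456789abcdef0
```

The `--include-all-instances` option matters here, since without it the command reports running instances only.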
2. Screenshot of unreachable instance
Unfortunately, when the EC2 instance is running but still not reachable, it requires deeper troubleshooting. Let’s see how our Dedicated Engineers make use of the screenshot utility in the EC2 panel.
When a Linux or Windows instance is not reachable through SSH or RDP, we capture a screenshot of the instance and view it as an image. For this, we select the problem instance and click on Instance screenshot.
This provides immediate visibility into the exact status of the instance. For example, if the screenshot shows the login screen, the server has booted, so the problem could be with firewall settings that restrict connections to the instance.
In cases where the server is on a Windows update screen or in the Recovery console, we just have to follow the steps to complete the task. This allows quicker troubleshooting and saves time for the real fix.
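The screenshot can be pulled without the console as well. Here is a small sketch, assuming a configured AWS CLI and the coreutils `base64` command. The function name `save_console_screenshot` and the file name are placeholders of ours.

```shell
# Save the console screenshot of an instance as a JPEG file.
# The API returns the image base64-encoded, so it is decoded before saving.
# Assumes a configured AWS CLI and the coreutils `base64` command.
save_console_screenshot() {
    local instance_id="$1" out_file="$2"
    aws ec2 get-console-screenshot \
        --instance-id "$instance_id" \
        --query 'ImageData' \
        --output text | base64 --decode > "$out_file"
}

# Example with a placeholder ID:
# save_console_screenshot i-0123456789abcdef0 screenshot.jpg
```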
3. Correcting network configuration
The basic technique to recover instances that failed due to network errors is to mount the problem server’s root file system on another server. This makes the faulty file system data accessible on that server.
Here, from the AWS panel, we detach the drive from the faulty instance. Then, we attach it to a working instance and connect to this debug instance using SSH as ec2-user. To mount the corrupted drive /dev/xvdf, we use the commands:
$ cd /
$ sudo mkdir faultydrive
$ sudo mount /dev/xvdf /faultydrive
This makes the files available in the directory /faultydrive. We then edit the network configuration files at /faultydrive/etc/sysconfig/network-scripts.
Finally, we unmount the disk, detach it and attach it back to the faulty instance. With proper correction in the network settings, the EC2 instance will be reachable once again.
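The detach and attach steps done from the AWS panel can be sketched with the AWS CLI too. This assumes a configured CLI and that the faulty instance is stopped first, since a root volume cannot be detached from a running instance. The function name `move_volume`, the IDs and the device name are placeholders. A volume attached as /dev/sdf often shows up as /dev/xvdf inside a Linux instance, but the name can differ by instance type.

```shell
# Move an EBS volume to another instance and wait until it is free in between.
# Assumes a configured AWS CLI and that the source instance is already stopped.
move_volume() {
    local volume_id="$1" target_instance="$2" device="$3"
    aws ec2 detach-volume --volume-id "$volume_id" &&
    aws ec2 wait volume-available --volume-ids "$volume_id" &&
    aws ec2 attach-volume --volume-id "$volume_id" \
        --instance-id "$target_instance" --device "$device"
}

# Example with placeholder IDs: move the faulty root volume to the debug instance.
# aws ec2 stop-instances --instance-ids i-0aaaaaaaaaaaaaaaa
# aws ec2 wait instance-stopped --instance-ids i-0aaaaaaaaaaaaaaaa
# move_volume vol-0123456789abcdef0 i-0bbbbbbbbbbbbbbbb /dev/sdf
```

After the fix, the same function can move the volume back, this time with the faulty instance’s original root device name.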
4. Fixing firewall
In many cases of firewall restrictions, we just need to add the proper rules to the security group. For example, when one of our customers reported an EC2 connectivity problem via SSH, we just had to allow the custom SSH port in the firewall. We added a custom TCP rule in the Inbound section from the EC2 dashboard to allow traffic on port 3725.
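For reference, an equivalent inbound rule can be added from the command line. This is a minimal sketch, assuming a configured AWS CLI. The function name `allow_tcp_port`, the security group ID and the source CIDR range are placeholders, while port 3725 is the custom SSH port from the example above.

```shell
# Allow inbound TCP traffic on one port from one CIDR range in a security group.
# Assumes a configured AWS CLI; group ID and CIDR are placeholders.
allow_tcp_port() {
    local group_id="$1" port="$2" cidr="$3"
    aws ec2 authorize-security-group-ingress \
        --group-id "$group_id" \
        --protocol tcp \
        --port "$port" \
        --cidr "$cidr"
}

# Example: open the custom SSH port 3725 to a single address.
# allow_tcp_port sg-0123456789abcdef0 3725 203.0.113.10/32
```

Limiting the rule to a known source address, rather than 0.0.0.0/0, keeps the custom SSH port from being exposed to everyone.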
Similarly, to track the CPU utilization on the instance, we make use of the Amazon CloudWatch feature. We then analyze the statistics for the specific resource. This gives us a clear indication of the exact problem, and we fix it.
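As an illustration of the CloudWatch check, the sketch below pulls the CPUUtilization metric for the last hour. It assumes a configured AWS CLI and GNU `date` (for the `-d` option). The function name `cpu_stats`, the one-hour window and the 5-minute period are our own choices.

```shell
# Fetch average and maximum CPUUtilization for the last hour in 5-minute periods.
# Assumes a configured AWS CLI and GNU `date`; the instance ID is a placeholder.
cpu_stats() {
    local instance_id="$1"
    aws cloudwatch get-metric-statistics \
        --namespace AWS/EC2 \
        --metric-name CPUUtilization \
        --dimensions "Name=InstanceId,Value=$instance_id" \
        --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
        --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        --period 300 \
        --statistics Average Maximum \
        --output table
}

# Example with a placeholder ID:
# cpu_stats i-0123456789abcdef0
```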
Conclusion
In short, the EC2 instance not reachable error commonly occurs due to booting errors, network configuration errors, firewall restrictions, high CPU usage, etc. Today, we saw how our Cloud Engineers take a screenshot of the faulty instance, mount the disk and make the instance work again.