EC2Rescue to troubleshoot operating-system-level issues

by Nicky Mathew | Published on July 11, 2021 | Updated on July 11, 2021

EC2Rescue is a tool to troubleshoot operating-system-level issues on Amazon EC2 Linux instances.

Here, at Bobcares, we assist our customers with several AWS queries as part of our AWS Support Services.

Today, let us see how to use EC2Rescue to diagnose and troubleshoot problems.

EC2Rescue

With EC2Rescue, we can correct operating-system-level issues.

It can also collect advanced logs, system utilization reports, and configuration files for further analysis.

On Linux, EC2Rescue can do the following:

Collect system utilization reports.
Collect logs and details.
Detect system problems.
Automatically remediate system problems.

EC2Rescue to troubleshoot operating-system-level issues

Moving ahead, let us see how to troubleshoot an unreachable Amazon EC2 Linux instance.

To do so, our Support Techs recommend the steps below:

1. First, we launch a new Amazon EC2 instance in the virtual private cloud (VPC), using the same Amazon Machine Image (AMI) and the same Availability Zone as the impaired instance.

The new instance becomes the rescue instance.

Another option is to use an existing instance that we can access if it uses the same AMI and is in the same Availability Zone as the impaired instance.

2. Then we detach the Amazon Elastic Block Store (Amazon EBS) root volume (/dev/xvda or /dev/sda1) from the impaired instance.

3. We then attach the EBS volume as a secondary device (/dev/sdf) to the rescue instance.
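Steps 2 and 3 can also be done from the AWS CLI instead of the console. The sketch below is only an illustration and not part of the original procedure: the run helper, the move_root_volume function, and all IDs are hypothetical, and it assumes the impaired instance has to be stopped before its root volume can be detached. With DRY_RUN left at its default of 1, it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: move the root volume from the impaired instance to the rescue instance.
# Assumption: the AWS CLI is installed and configured; all IDs are placeholders.

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: move_root_volume <volume-id> <impaired-instance-id> <rescue-instance-id>
move_root_volume() {
  local volume_id="$1" impaired_id="$2" rescue_id="$3"

  # Stop the impaired instance and wait until it is fully stopped.
  run aws ec2 stop-instances --instance-ids "$impaired_id"
  run aws ec2 wait instance-stopped --instance-ids "$impaired_id"

  # Detach the root volume and wait until it is free to attach elsewhere.
  run aws ec2 detach-volume --volume-id "$volume_id"
  run aws ec2 wait volume-available --volume-ids "$volume_id"

  # Attach it to the rescue instance as the secondary device from step 3.
  run aws ec2 attach-volume --volume-id "$volume_id" \
    --instance-id "$rescue_id" --device /dev/sdf
}
```

For example, move_root_volume vol-0123456789abcdef0 i-0aaaaaaaaaaaaaaaa i-0bbbbbbbbbbbbbbbb prints the five commands in order. Setting DRY_RUN=0 executes them.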

4. Next, we connect to the rescue instance via SSH.

5. Here, we create a mount point directory (/rescue) for the volume we attached to the rescue instance.

$ sudo mkdir /rescue

6. We mount the volume at the above directory.

$ sudo mount /dev/xvdf1 /rescue

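We can use the lsblk command to view the available disk devices along with their mount points. The volume attached as /dev/sdf can appear under a different kernel name, such as /dev/xvdf, so we confirm the partition name here before mounting.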

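If the mount fails, we check the kernel messages with dmesg | tail. If they report a conflicting (duplicate) UUID, which can happen on XFS when the rescue instance was launched from the same AMI, we mount with the -o nouuid option.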
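The fallback described above can be sketched as a small helper. This is a minimal sketch under stated assumptions: the function names mount_options_for and mount_rescue_volume are hypothetical, and nouuid is applied only when the file system is XFS, since it is an XFS mount option.

```shell
#!/usr/bin/env bash
# Sketch: mount the rescue volume, retrying with -o nouuid for XFS.

# Return the extra mount options worth trying for a given file system type.
mount_options_for() {
  case "$1" in
    xfs) printf '%s\n' "-o nouuid" ;;  # bypass the duplicate-UUID check
    *)   printf '%s\n' "" ;;           # no special retry option for other types
  esac
}

# Usage: mount_rescue_volume <device> <mount-point>
mount_rescue_volume() {
  local device="$1" target="$2" fstype opts

  # First attempt: a plain mount, as in step 6.
  if sudo mount "$device" "$target"; then
    return 0
  fi

  # On failure, look up the file system type and retry only if an option applies.
  fstype="$(lsblk -no FSTYPE "$device")"
  opts="$(mount_options_for "$fstype")"
  [ -n "$opts" ] || return 1

  # $opts is intentionally unquoted so "-o nouuid" splits into two arguments.
  sudo mount $opts "$device" "$target"
}
```

For example, mount_rescue_volume /dev/xvdf1 /rescue tries the plain mount first. It is still worth reading dmesg | tail after a failure, because the retry only covers the UUID conflict.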

7. Now we change the root directory (chroot) to the new volume:

$ sudo -i
# for i in proc sys dev run; do mount --bind /$i /rescue/$i ; done
# chroot /rescue

8. After that, we download and install the EC2Rescue Tool for Linux on an offline Linux root volume:

$ curl -O https://s3.amazonaws.com/ec2rescuelinux/ec2rl.tgz
$ tar -xvf ec2rl.tgz

9. By listing the help file, we can verify the installation:

$ cd ec2rl-<version_number>
$ ./ec2rl help

10. We then run EC2Rescue for Linux with sudo and no options, which runs all modules:

$ sudo ./ec2rl run

11. The results are written to /var/tmp/ec2rl:

$ cat /var/tmp/ec2rl/<logfile_location>/Main.log
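When the tool has been run more than once, it helps to find the newest results directory first. The helper below is a hedged sketch: latest_ec2rl_log is a hypothetical name, and it assumes that each run creates one subdirectory directly under /var/tmp/ec2rl containing Main.log.

```shell
#!/usr/bin/env bash
# Sketch: print the path of Main.log from the most recently modified run directory.

# Usage: latest_ec2rl_log [base-directory]
latest_ec2rl_log() {
  local base="${1:-/var/tmp/ec2rl}" newest

  # List subdirectories sorted by modification time, newest first.
  newest="$(ls -1dt "$base"/*/ 2>/dev/null | head -n 1)"

  # Fail if no run directory exists yet.
  [ -n "$newest" ] || return 1

  # $newest already ends with a slash because of the */ glob.
  printf '%sMain.log\n' "$newest"
}
```

For example, cat "$(latest_ec2rl_log)" shows the log of the latest run, provided the directory layout matches the assumption above.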

12. After analyzing the results, we enable remediation for the supported modules:

$ ./ec2rl run --remediate

13. Once done, we exit from chroot and unmount the secondary device:

$ exit
$ sudo umount /rescue

If the unmount isn’t successful, we stop or reboot the rescue instance to enable a clean unmount.
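One likely reason for a failed unmount is that the proc, sys, dev, and run bind mounts from step 7 are still active under /rescue. The sketch below is an assumption about how to release them, not part of the original steps, and the name rescue_umount_commands is hypothetical. It only prints the umount commands so that we can review them before running anything.

```shell
#!/usr/bin/env bash
# Sketch: print the umount commands for the bind mounts, then for the volume itself.

# Usage: rescue_umount_commands [mount-point]
rescue_umount_commands() {
  local root="${1:-/rescue}" d

  # Release the bind mounts created in step 7 (reverse of the mount order).
  for d in run dev sys proc; do
    printf 'umount %s/%s\n' "$root" "$d"
  done

  # Finally release the rescue volume.
  printf 'umount %s\n' "$root"
}
```

After exiting the chroot, the output can be reviewed and then executed, for example with rescue_umount_commands /rescue | sudo sh -e, which stops at the first command that fails.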

14. Then we detach the secondary volume (/dev/sdf) from the rescue instance and reattach it to the original instance as the root volume (/dev/xvda).

15. Finally, we start the original EC2 instance and verify that it is responsive.


Conclusion

In short, we saw how our Support Techs use EC2Rescue to correct operating-system-level issues.
