Stuck with AWS EC2 kernel panic after system upgrade? We can help you.
At Bobcares, we assist our customers with AWS queries like this one as part of our AWS Support Services for AWS users and online service providers.
Today, let us discuss how our Support Techs resolve the above error.
AWS EC2 kernel panic after system upgrade
An Amazon Elastic Compute Cloud (Amazon EC2) instance can fail to boot after a kernel or system upgrade, or after a system reboot.
A typical error looks like this:
VFS: Cannot open root device XXX or unknown-block(0,0)
Please append a correct "root=" boot option; here are the available partitions:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
What causes AWS EC2 kernel panic after system upgrade?
The following are some common causes:
- The initramfs or initrd image is missing from the newly updated kernel configuration in /boot/grub/grub.conf.
- The kernel or system packages weren’t fully installed during the upgrade because of insufficient disk space (a quick check for this follows the list).
- Third-party modules are missing from the initrd or initramfs image. For example, NVMe, LVM, or RAID modules.
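If you suspect the insufficient-space cause, a quick diagnostic (run from a rescue shell or the serial console, both covered below) is to check how full the boot partition is:
$ df -h /boot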
How to resolve it?
Let us see the methods our Support Techs use to resolve this error.
The initramfs or initrd image is missing from /boot/grub/grub.conf or the /boot directory
Method 1: Use the EC2 Serial Console
- If you’ve enabled EC2 Serial Console for Linux, you can use it to troubleshoot supported Nitro-based instance types.
- The serial console helps you troubleshoot boot issues, network configuration, and SSH configuration issues.
- The serial console connects to your instance without the need for a working network connection.
- You can access the serial console using the Amazon EC2 console or the AWS Command Line Interface (AWS CLI); a sample CLI workflow follows this list.
- Before using the serial console, grant access to it at the account level.
- Then, create AWS Identity and Access Management (IAM) policies granting access to your IAM users.
- Every instance using the serial console must include at least one password-based user.
- If your instance is unreachable and you haven’t configured access to the serial console, then follow the instructions in Method 2.
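For reference, a minimal serial console workflow from the AWS CLI looks like the following sketch. The instance ID, key file, and Region below are placeholders for your own values, and the pushed public key stays valid for only about 60 seconds, so connect promptly:
# i-0123456789abcdef0, my_key, and us-east-1 are placeholder values
$ aws ec2 enable-serial-console-access
$ aws ec2-instance-connect send-serial-console-ssh-public-key --instance-id i-0123456789abcdef0 --serial-port 0 --ssh-public-key file://my_key.pub
$ ssh -i my_key i-0123456789abcdef0.port0@serial-console.ec2-instance-connect.us-east-1.aws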
Method 2: Use a rescue instance
- Firstly, open the Amazon EC2 console.
- Then, choose Instances from the navigation pane, and then select the impaired instance.
- Choose Actions, Instance State, Stop instance.
- In the Storage tab, select the Root device, and then select the Volume ID.
- Choose Actions, Detach Volume (/dev/sda1 or /dev/xvda), and then choose Yes, Detach.
- Then, verify that the State is Available.
- Launch a new EC2 instance in the same Availability Zone and with the same operating system as the original instance. This new instance is your rescue instance.
- After the rescue instance launches, choose Volumes from the navigation pane, and then select the detached root volume of the original instance.
- Choose Actions, Attach Volume.
- Select the rescue instance ID (i-xxxx) and then enter /dev/xvdf.
- Run the lsblk command to verify that the root volume of the impaired instance attached to the rescue instance successfully:
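$ lsblk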
- Then, mount the root partition of the attached volume under /mnt:
$ mount /dev/xvdf1 /mnt
- Prepare a chroot environment by bind-mounting the required special file systems:
$ for i in dev proc sys run; do mount -o bind /$i /mnt/$i; done
- Run the chroot command on the mounted /mnt file system:
$ chroot /mnt
- Run the following commands based on your operating system.
RPM-based operating systems:
$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
$ sudo dracut -f -v
Debian-based operating systems:
$ sudo update-grub
$ sudo update-initramfs -u -v
- Verify that the initrd or initramfs image is present in the /boot directory and that the image has a corresponding kernel image.
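- For example, you can list the /boot directory from inside the chroot; each vmlinuz-<version> kernel file should have a matching initramfs-<version>.img (RPM-based) or initrd.img-<version> (Debian-based) image:
$ ls -l /boot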
- After verifying that the latest kernel has a corresponding initrd or initramfs image, run the exit command to leave the chroot environment, and then unmount the bind mounts:
$ exit
$ for i in dev proc sys run; do sudo umount /mnt/$i; done
- Unmount the volume (umount /mnt), then detach the root volume from the rescue instance and attach it to the original instance as /dev/xvda or /dev/sda1, matching its original device name (see the CLI sketch after this list for the equivalent commands).
- Finally, start the original instance.
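If you prefer to perform the detach, reattach, and start steps from the AWS CLI instead of the console, the calls look roughly like the following sketch; the volume ID and instance ID are placeholders for your own values:
# vol-0abcd1234example and i-0abcd1234example are placeholder IDs
$ aws ec2 detach-volume --volume-id vol-0abcd1234example
$ aws ec2 attach-volume --volume-id vol-0abcd1234example --instance-id i-0abcd1234example --device /dev/xvda
$ aws ec2 start-instances --instance-ids i-0abcd1234example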
Third-party modules are missing from the initrd or initramfs image
To determine if the kernel panic error is caused by a missing third-party module or modules, do the following:
1. Use Method 1: Use the EC2 Serial Console in the preceding section to create a chroot environment in the root volume of the non-booting instance, or follow the steps in Method 2: Use a rescue instance to do the same.
2. Use one of the following three options to determine which module or modules are missing from the initramfs or initrd image:
Option 1: Rebuild the initrd or initramfs image from the /boot directory; if the rebuild fails, the verbose output lists the missing module or modules:
$ dracut -f -v
Option 2: View the contents of the initrd or initramfs file:
$ lsinitrd initramfs-4.14.138-114.102.amzn2.x86_64.img | less
Replace initramfs-4.14.138-114.102.amzn2.x86_64.img with the name of your image.
Option 3: Inspect the /usr/lib/modules directory.
3. If you find a missing module, you can try to add it back into the initrd or initramfs image.
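One way to add it back, sketched below with nvme standing in for whichever module is actually missing, is to rebuild the image with the driver explicitly included:
RPM-based operating systems:
$ sudo dracut --add-drivers "nvme" -f -v
Debian-based operating systems:
# list the module in the initramfs config, then regenerate
$ echo nvme | sudo tee -a /etc/initramfs-tools/modules
$ sudo update-initramfs -u -v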
[Need help with an AWS error? We'd be happy to assist]
In short, today we saw how our Support Techs resolve a kernel panic on an AWS EC2 instance after a system upgrade.