How to troubleshoot server down issues

Please Note: This article is part of our historical archive. Because it was published a while ago, some of the information, links, or context may now be outdated.

How to troubleshoot server down issues? Let’s discuss.

At times, network issues can occur and take down all the servers in a data center. This can leave us with an unexpected outage to resolve.

As part of our Server Management Services, we assist our customers with several server queries.

Today, let us see how we can troubleshoot server down issues.

How to troubleshoot server down issues

First, we need to check whether the alert is a false alarm.

To do so, we ping the server or telnet to one of its open ports to check whether it is really down.

The commands are simple:

ping <server-ip>
telnet <server-ip> <port>

For example,

ping 186.65.23.15
telnet 186.65.23.15 22

These commands work on Linux, macOS, and Windows, although on Windows the Telnet client may need to be enabled first and ping takes -n rather than -c to set the packet count.
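The two manual checks above can be combined into a small script. The following is a minimal sketch, not a definitive tool: the function names host_pings, port_open, and check_server are our own, and it assumes bash, the GNU coreutils timeout command, and the Linux/macOS -c flag for ping.

```shell
#!/usr/bin/env bash
# Sketch: confirm a "server down" alert with one ICMP check and one TCP check.
# All function names here are illustrative assumptions.

# Returns 0 if the host answers ping (3 echo requests, Linux/macOS syntax).
host_pings() {
    ping -c 3 "$1" >/dev/null 2>&1
}

# Returns 0 if a TCP connection to host:port opens within 5 seconds.
# Uses bash's /dev/tcp redirection, so telnet does not need to be installed.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Prints one summary line, for example: 186.65.23.15: ping=yes port_22=yes
check_server() {
    local ping_ok=no port_ok=no
    if host_pings "$1"; then ping_ok=yes; fi
    if port_open "$1" "$2"; then port_ok=yes; fi
    echo "$1: ping=$ping_ok port_$2=$port_ok"
}

# Example: check_server 186.65.23.15 22
```

If both values are "yes", the alert was most likely false. If ping fails but the port answers, ICMP may simply be filtered along the way.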

If the server responds to ping without any packet loss, as shown below, then the server is up and the alert was false.

-
ping google.com -c10
PING google.com (74.125.129.100) 56(84) bytes of data.
10 packets transmitted, 10 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 326.477/326.477/326.477/0.000 ms
-

However, if the server does not respond to ping or shows packet loss, our Support Techs recommend contacting the data center (DC) or hosting provider to get the issue sorted out.
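The packet loss check can also be scripted by reading the summary line that ping prints. This is a minimal sketch under the assumption that the summary contains the text "% packet loss", as in the Linux output shown above; the function names packet_loss and is_false_alert are our own.

```shell
#!/usr/bin/env bash
# Sketch: decide from ping's summary line whether an alert looks false.

# Reads ping output on stdin and prints the packet-loss percentage.
packet_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Reads ping output on stdin; succeeds only if the reported loss is zero.
is_false_alert() {
    local loss
    loss="$(packet_loss)"
    [ -n "$loss" ] && awk -v l="$loss" 'BEGIN { exit !(l == 0) }'
}

# Example: ping -c 10 186.65.23.15 | is_false_alert && echo "false alert"
```

Any non-zero loss, or no summary line at all, makes is_false_alert fail, which is the cue to contact the DC or hosting provider.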

Let us see what one of our customers came across.

When he received the server down alert, he couldn’t access the server. Even a reboot couldn’t bring the server back online.

So, we connected to the server via IPMI and found that it was stuck at fsck, which was the cause of the problem.

A server can go down for many reasons:

High load in the server
Faulty equipment
High temperature in the DC room
Partition being full
No disk space in the server
Faulty or loose power cable
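Two of the causes listed above, a full partition and a lack of disk space, can be checked once the server is reachable again. Here is a minimal sketch, assuming POSIX df -P output (usage percentage in column 5, mount point in column 6) and mount points without spaces; the function name full_partitions and the default threshold of 90% are our own choices.

```shell
#!/usr/bin/env bash
# Sketch: list partitions whose usage is at or above a threshold.

# Reads `df -P` output on stdin; $1 is the threshold in percent (default 90).
# Prints "<mount point> <usage>%" for each partition at or above the threshold.
full_partitions() {
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6 " " use "%"
    }'
}

# Example: df -P | full_partitions 90
```

An empty result means no partition has reached the threshold.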

If the server is stuck, asking the DC to reboot it is usually the better option.

Once the server is up, we need to check the reason why it went down.

To find the reason, we check the following logs on the server.

For Linux:

/var/log/messages
dmesg | less
/var/log/boot.log
/var/log/fsck

For Windows:

Start>>Run>>eventvwr

Suppose the load was high because of spam or a large number of incoming HTTP connections. In that case, we troubleshoot accordingly.

For example, if a high load brought the server down, we check the incoming connections and block the offending sources on the server.
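To see which remote addresses hold the most connections, the socket list can be grouped by peer. This is a minimal sketch, assuming the Linux ss utility and its default column layout for ss -Htn, where the peer address is the fifth column; the function name top_peers is our own.

```shell
#!/usr/bin/env bash
# Sketch: count TCP connections per remote address.

# Reads `ss -Htn` output on stdin; $1 is how many peers to show (default 10).
# Prints "<count> <address>" lines, busiest peer first.
top_peers() {
    awk '{ peer = $5; sub(/:[0-9]+$/, "", peer); print peer }' |
        sort | uniq -c | sort -rn | head -n "${1:-10}" |
        awk '{ print $1, $2 }'
}

# Example: ss -Htn | top_peers 5
```

An address with an unusually high count is a candidate for blocking, for instance with a firewall rule such as iptables -I INPUT -s <ip> -j DROP. Confirm that the address is not a legitimate proxy or monitoring host first.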

If a drive is faulty, we need to replace it. In our case, the server was running fsck, which is why it was taking so long to come up.

Several conditions can trigger an fsck run:

The hard disk not being unmounted cleanly
Using a third-party utility to delete the extended partition
Problems with any filesystems
Power failure
Incomplete shutdown
Hardware failure

These causes leave file system operations incomplete.

A sample of what we may find in the Linux logs is given below.

Checking all file systems.
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/xvda1
/: clean, 56079/1310720 files, 1243508/2621440 blocks
[/sbin/fsck.ext3 (1) -- /var/www/virtual] fsck.ext3 -a /dev/sdf
fsck.ext3: No such file or directory while trying to open /dev/sdf
/dev/sdf:
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>

In our case, once fsck had completed, the server came back up without further errors.
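Messages like the ones in the sample log can be pulled out of the boot logs with a short filter. This is a minimal sketch: the function name fsck_errors is our own, and the patterns cover only the two messages in the sample plus e2fsck's "UNEXPECTED INCONSISTENCY" notice, so it is not a complete list of fsck failures.

```shell
#!/usr/bin/env bash
# Sketch: find log lines suggesting that fsck could not check a filesystem.

# Takes log files as arguments (or reads stdin when none are given).
# Prints matching lines prefixed with their line numbers.
fsck_errors() {
    grep -n -i -E \
        'superblock could not be read|while trying to open|UNEXPECTED INCONSISTENCY' \
        "$@"
}

# Example: fsck_errors /var/log/boot.log
```

A match on the superblock message is the case where the log itself suggests retrying e2fsck with an alternate superblock.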


Conclusion

In short, network issues can take a whole data center down, and a single server can go down for many other reasons. Today, we saw how to confirm the alert, find the cause in the logs, and troubleshoot server down issues.
