How we configured disaster recovery in an oVirt system
Life is full of surprises. Some are unpleasant, like a server crash. Disk failures, human errors, external attacks and natural calamities often lead to business downtime.
For businesses hosted on dedicated servers, recovering from a crash is tedious, as it often requires installing and configuring the physical servers afresh.
Since server virtualization came into the picture, spawning new virtual machine instances from a host server is just a matter of a few clicks.
As a result, a crashed server can be restored in a few minutes. Recently we implemented a reliable disaster recovery solution for a VPS hosting provider that was using the oVirt server virtualization system.
Built on open source technology, our backup design gave the customer a significant cost advantage over proprietary backup software. The disaster recovery plan we implemented comprised three stages: backup configuration, monitoring and restoration.
1. Backup configuration
The first stage was the design and configuration of the backup process. The backup design consisted of:
- Identifying the files and databases of the individual VMs and the oVirt server, to ensure that the entire server virtualization solution was readily retrievable.
- Determining the backup type and schedule – daily, weekly or monthly – based on the importance of the data and the frequency of its updates. For frequently updated files and databases, we chose daily incremental backups. For the entire data set, a full backup was taken on a weekly and monthly basis.
- Deciding on the storage location for backups – local or over the network – to address various types of disasters. We configured both – local backups on the storage servers and external backups over the network to a server in another data center.
- Configuring the storage server – We first determined the quantity of data to be backed up. After considering the frequency of backups, we estimated an approximate storage server capacity. The server was then configured and backup locations were created.
- Scheduling backups – Depending on the business hosted in each VM, we identified the off-peak time and scheduled the backups to run during that time. This kept backups from overloading a VM, which could lead to service interruption.
- Automating backups with custom scripts – To reduce overhead and save time, we automated the entire backup process using a Python script that connected to the oVirt API, took the backups and copied them to the backup server. The script also rotated the backups every week so that disk usage on the backup servers never exceeded its limits.
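The scheduling, sizing and rotation decisions above can be sketched in a few lines of Python. This is a minimal illustration, not the script we deployed: the function names, the "full backup on the 1st and on Sundays" policy, the 20% headroom and the 7-day retention window are all assumptions chosen for the example, and the oVirt API calls themselves are left out.

```python
from datetime import date, timedelta
from typing import Dict, List


def backup_kind(day: date) -> str:
    """Pick the backup type for a given day (assumed policy, not from the article)."""
    if day.day == 1:
        return "monthly-full"
    if day.weekday() == 6:  # Monday is 0, so 6 is Sunday
        return "weekly-full"
    return "daily-incremental"


def estimate_capacity_gb(full_gb: float, daily_change_gb: float,
                         weekly_fulls: int = 1, monthly_fulls: int = 1,
                         incrementals: int = 6, headroom: float = 0.2) -> float:
    """Rough storage needed for the retained backups, plus spare headroom."""
    retained = (weekly_fulls + monthly_fulls) * full_gb
    retained += incrementals * daily_change_gb
    return retained * (1 + headroom)


def expired(backups: Dict[str, date], today: date, keep_days: int = 7) -> List[str]:
    """Return the names of backups older than keep_days, oldest first."""
    cutoff = today - timedelta(days=keep_days)
    old = [name for name, taken in backups.items() if taken < cutoff]
    return sorted(old, key=lambda name: backups[name])
```

A real rotation job would delete the files that `expired` returns. Because this sketch applies a single retention window to every backup, keeping monthly full backups for longer would need a separate window per backup type.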
2. Backup monitoring
After configuring the backups based on our design considerations, we did a trial run to verify that the backup script functioned as intended. We set up an alert notification system to report the status of the backup process to our server monitoring team.
Our team would review each alert and verify that the backup had completed successfully. If the backup process failed for any reason, they would debug and fix it.
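One way to produce such a status report is to wrap each backup run so that both success and failure reach the monitoring team. The sketch below is only an assumption about how this could look: `run_and_report`, `backup_fn` and `notify_fn` are hypothetical names, and the notifier could be an email, chat or ticketing hook.

```python
import logging
from typing import Callable, Dict

log = logging.getLogger("backup-monitor")


def run_and_report(vm_name: str,
                   backup_fn: Callable[[str], None],
                   notify_fn: Callable[[Dict[str, str]], None]) -> Dict[str, str]:
    """Run one backup and send a status report whether it succeeds or fails."""
    report = {"vm": vm_name, "status": "ok", "detail": ""}
    try:
        backup_fn(vm_name)
    except Exception as exc:  # report the failure instead of aborting the whole run
        report["status"] = "failed"
        report["detail"] = f"{type(exc).__name__}: {exc}"
        log.error("backup of %s failed: %s", vm_name, report["detail"])
    notify_fn(report)
    return report
```

Catching the exception per VM means one failed backup does not stop the remaining VMs from being backed up, and the failure detail is there for whoever debugs it.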
Backup folders on the storage server and the external server were audited weekly by our 24/7 monitoring team. These integrity checks helped ensure the adequacy, completeness, accuracy and consistency of the backups.
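Part of such an audit can be automated by comparing each backup file against a checksum recorded when the backup was taken. The following is a minimal sketch under that assumption: the manifest format (file name mapped to SHA-256 hex digest) and the function names are illustrative, not the checks our team actually ran.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(backup_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means every file matched."""
    problems = []
    for name, expected in sorted(manifest.items()):
        path = backup_dir / name
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(path) != expected:
            problems.append(f"checksum mismatch: {name}")
    return problems
```

A checksum match only shows that the file is unchanged since it was written. It does not prove that the backup can be restored, which is why a separate restore test is still needed.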
Whenever a new VM was added to the oVirt system, our backup script added this VM to the backup solution. This ensured that our backup coverage was always up to date with the latest VMs and their data.
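Picking up new VMs comes down to comparing the VM list reported by the engine with the set of VMs already covered by backups. Here is a minimal sketch of that comparison, assuming the VM names have already been fetched from the oVirt API. `enroll_new_vms` is a hypothetical helper, not part of any oVirt library.

```python
from typing import Iterable, List, Set


def enroll_new_vms(current_vms: Iterable[str], enrolled: Set[str]) -> List[str]:
    """Add VMs not yet covered by backups to `enrolled`; return the names added."""
    added = sorted(set(current_vms) - enrolled)
    enrolled.update(added)
    return added
```

Running this at the start of every backup cycle means a newly created VM is included in the very next run without manual configuration.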
3. Backup restoration
Implementing a backup solution alone will not guarantee automatic recovery in case of data loss. If the backup is not retrievable at the time of a disaster, that backup solution is of no use.
To verify that the backups served their purpose, we performed a disaster recovery test every month, without affecting the production system.
In our recovery test, the ‘Clone VM from the Snapshot’ feature was used to restore a randomly chosen VM from its backup. If the restore created a fully functional VM, we considered the test successful.
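The flow of such a drill can be sketched as follows. This is an illustration under stated assumptions: `clone_from_snapshot` and `is_healthy` are hypothetical callables standing in for the oVirt clone operation and for whatever health check defines "fully functional", and the `-dr-test` suffix is invented for the example.

```python
import random
from typing import Callable, Optional, Sequence, Tuple


def restore_drill(vm_names: Sequence[str],
                  clone_from_snapshot: Callable[[str, str], str],
                  is_healthy: Callable[[str], bool],
                  rng: Optional[random.Random] = None) -> Tuple[str, bool]:
    """Clone one randomly chosen VM from its snapshot and check the clone."""
    if not vm_names:
        raise ValueError("no VMs available for the restore drill")
    rng = rng or random.Random()
    source = rng.choice(vm_names)
    # A distinct clone name keeps the test VM apart from the production VM.
    clone_name = f"{source}-dr-test"
    clone_id = clone_from_snapshot(source, clone_name)
    return source, is_healthy(clone_id)
```

Because the drill works on a separately named clone, the production VM is never touched. Choosing the VM at random means that, over time, the monthly tests cover different VMs rather than the same one.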
The oVirt engine server backup was tested using our script that connected to the oVirt API and verified the data integrity.
Backup solutions should be implemented only after a thorough review of the business goals and the server virtualization technology in use. Here we’ve covered how we configured a disaster recovery solution for an oVirt-based server virtualization system.
Bobcares helps cloud providers minimize business downtime with our backup management services, which range from formulating the backup and recovery plan to restoring the data in no time.