How does server virtualisation change traditional high-availability clustering in a data centre? In practice, typical server cluster architecture will often work equally well for virtualised servers, but there are some differences.
High-availability clustering brings virtualisation flexibility
First, virtualisation provides greater flexibility in the choice of servers. The software abstraction layer, or hypervisor, supports a virtual machine (VM) instance for the operating systems (OSes) and applications that is decoupled from the underlying hardware. Powerful, highly reliable servers are still essential in a virtual server environment, but they do not need to be identical.
The way data is stored is also different. Nonvirtualised servers may access data from a SAN -- even a redundant SAN -- but their OSes and applications are stored locally on each server. This imposes more traditional backup requirements on those nonvirtualised servers.
In contrast, each VM image is essentially a single data file that can be stored locally but is more often stored on a SAN. VMs are loaded from the SAN into memory on the server where they are executed. Each VM can easily be protected through a periodic or continuous VM snapshot backup process that updates a virtual image file from server memory to a SAN.
VMs can then be backed up from a SAN to another location or storage system without the server I/O load normally associated with traditional backups. An otherwise unreplicated VM from a failed server can be loaded from a SAN onto another available server in a matter of minutes. That new server may be part of the high availability (HA) cluster, but it may also be a regular server with enough computing resources to run a VM.
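The restart described above, in which a VM from a failed server is loaded from the SAN onto any server with enough spare resources, can be sketched roughly as follows. This is only an illustration of the placement decision, not the logic of any particular product; the `Host` and `VM` classes, the server names and all capacity figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_cpus: int    # spare CPU cores (hypothetical figure)
    free_mem_gb: int  # spare memory in GB (hypothetical figure)


@dataclass
class VM:
    name: str
    cpus: int
    mem_gb: int


def pick_restart_host(vm, hosts, failed):
    """Return the first surviving host with enough spare CPU and memory, else None."""
    for host in hosts:
        if host.name == failed:
            continue  # the failed server cannot take the VM back yet
        if host.free_cpus >= vm.cpus and host.free_mem_gb >= vm.mem_gb:
            return host
    return None


hosts = [
    Host("server-a", 0, 0),    # the failed server
    Host("server-b", 2, 8),    # cluster member, but too little headroom
    Host("server-c", 8, 64),   # regular server outside the HA cluster
]
app_vm = VM("app-vm", cpus=4, mem_gb=16)

target = pick_restart_host(app_vm, hosts, failed="server-a")
print(target.name if target else "no capacity")  # server-c
```

The sketch picks the first host that fits; a real scheduler would likely weigh load, affinity and licensing as well.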
Another difference is the need for secure intercommunication between VMs, typically handled through virtual LAN, or VLAN, technology.
High-availability clustering tools
Today's virtualisation tools easily complement the traditional HA approach. When an enterprise requires zero downtime, virtual servers can use virtualisation-based fault-tolerant VM tools such as Marathon Technologies' everRunVM software running on top of a hypervisor like Citrix XenServer on each member of the server cluster. The HA software is configured to handle selected VMs, which, in high-availability clustering, are then duplicated to another server and synchronised in real time. When the VM stops responding on one server because of a crash or server problem, the duplicate VM resumes operation on the second server. When the original server is recovered, application control shifts back to the first VM.
For example, suppose that server A hosts one VM with SQL Server and another VM with an enterprise resource planning (ERP) application. Server B might host one VM with Exchange Server, another VM with a domain name server, and a third VM with a customer relationship management application.
Ordinarily these servers would not be fault tolerant because none of the VMs are duplicated, but an IT administrator can use tools like everRunVM to duplicate the SQL VM between servers A and B. Alternatively, a third server can be added to the cluster solely for redundancy to host the duplicate copy of SQL from server A and a duplicate copy of the Exchange VM from server B and so on.
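To make the example concrete, the layout with a third, redundancy-only server can be written down as two small tables and queried for the effect of a server failure. This is a sketch of the bookkeeping only; the server name "C" and the function `impact_of_failure` are hypothetical, and nothing here reflects how everRunVM is actually configured.

```python
# Primary placement taken from the example in the text.
primary = {
    "sql": "A",
    "erp": "A",
    "exchange": "B",
    "dns": "B",
    "crm": "B",
}

# Duplicate copies kept in sync on a third server ("C" is a hypothetical name).
duplicate = {"sql": "C", "exchange": "C"}


def impact_of_failure(failed_host):
    """Split the VMs on a failed host into those that fail over to a live
    duplicate and those that must be restarted elsewhere."""
    failover, restart = [], []
    for vm, host in primary.items():
        if host != failed_host:
            continue
        standby = duplicate.get(vm)
        if standby is not None and standby != failed_host:
            failover.append(vm)
        else:
            restart.append(vm)
    return sorted(failover), sorted(restart)


print(impact_of_failure("A"))  # (['sql'], ['erp'])
print(impact_of_failure("B"))  # (['exchange'], ['crm', 'dns'])
```

Only the duplicated VMs continue without interruption; the rest depend on whatever backup or snapshot protection is in place.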
The reality is that virtualisation doesn't necessarily make HA any more effective than traditional approaches. During VM failover, disruption is negligible with tools like everRunVM. But the flexibility and power of software tools available for virtualisation allow far more effective data protection for secondary applications that in traditional environments might have relied only on conventional tape restoration.
Data centre managers don't need to duplicate every VM; this is one of the core benefits of server virtualisation. VMs that are not duplicated outright can still be protected with continuous snapshots and quickly spun up from the SAN onto other available servers through tools like VMware High Availability.
Remember in the previous example that only the SQL application on server A is duplicated on server B for redundancy. The ERP application on server A is not duplicated. Ordinarily, this would have meant protecting the ERP VM with periodic backups or snapshots. The longer recovery point objectives (RPOs) involved -- that is, the more time between each backup -- would place more data at risk in the event of an outage.
By centralising the ERP and other VMs on the SAN, it's possible to maintain continuous snapshots of each running VM, reducing RPO (and potential data loss) to almost zero. If server A fails, the ERP VM can be restored from the SAN to another server with enough available computing power to support the VM. The ERP application might be inaccessible for only several minutes, which could be entirely acceptable for an organisation. Thus, virtualisation continues to support HA while improving the recoverability of other applications.
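The link between backup interval and data at risk is simple arithmetic: in the worst case, an outage strikes just before the next backup, so everything written since the last one is lost. The sketch below assumes a constant data change rate; the 2 MB per minute figure and the one-minute snapshot interval are hypothetical, chosen only for illustration.

```python
def worst_case_loss_mb(backup_interval_min, change_rate_mb_per_min):
    """Data at risk if the outage happens just before the next backup."""
    return backup_interval_min * change_rate_mb_per_min


CHANGE_RATE_MB_PER_MIN = 2  # hypothetical rate of change for the ERP VM

nightly = worst_case_loss_mb(24 * 60, CHANGE_RATE_MB_PER_MIN)  # one backup per day
near_continuous = worst_case_loss_mb(1, CHANGE_RATE_MB_PER_MIN)  # snapshot every minute

print(nightly, near_continuous)  # 2880 2
```

Under these assumptions, moving from nightly backups to near-continuous snapshots cuts the worst-case exposure from roughly 2.9 GB to a couple of megabytes.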
Virtual server maintenance with high-availability clustering
Virtualisation also supports tasks like routine maintenance. With traditional high-availability clustering, routine maintenance generally meant taking one of the clustered servers offline for service, leaving the remaining servers at greater risk until the serviced server was back up and resynchronised. If the application were not protected with HA, it would be offline until the server was returned to service.
In a virtualised environment, tools like Microsoft's Hyper-V Live Migration, which is part of Windows Server 2008 R2, or VMware VMotion and Distributed Resource Scheduler (DRS) can shift VMs to other available servers to maintain application availability while the underlying hardware is serviced.
The biggest challenge for data centre managers in moving from traditional deployments to virtualised server deployments is implementing the most appropriate level of protection for each application.
Consider a corporate FTP server VM deployed for internal use. It generally wouldn't make sense to duplicate that VM using tools like everRunVM. It would simply consume too many computing resources for noncritical purposes, whereas a few minutes to spin up the VM on another server might be more appropriate.
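One way to think about matching protection to each application is to start from the outage the business can tolerate and work backwards. The sketch below is an assumption-laden illustration: the function `choose_protection`, the 15-minute threshold and the downtime figures assigned to each application are all hypothetical, not recommendations from any vendor.

```python
def choose_protection(max_downtime_minutes):
    """Map a tolerable outage to one of the approaches discussed in the article.
    The thresholds are hypothetical and would differ per organisation."""
    if max_downtime_minutes == 0:
        return "duplicate VM synchronised in real time"
    if max_downtime_minutes <= 15:
        return "continuous snapshots, restart from SAN"
    return "periodic backups or snapshots"


# Hypothetical tolerances for the applications mentioned in the article.
tolerances = {"sql": 0, "erp": 10, "internal-ftp": 10}

for app, minutes in tolerances.items():
    print(f"{app}: {choose_protection(minutes)}")
```

With these made-up figures, only the SQL VM justifies the computing resources of a real-time duplicate, while the ERP and FTP VMs are restarted from the SAN after a failure.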
About the Author
Stephen J. Bigelow, a senior technology writer at TechTarget, has more than 15 years of technical writing experience in the technology industry. He has written hundreds of articles and more than 15 feature books on computer troubleshooting, including Bigelow's PC Hardware Desk Reference and Bigelow's PC Hardware Annoyances. Contact him at email@example.com.
This was first published in March 2010