On Fri, Jun 25, 2010 at 9:04 AM, Emmanuel Noobadmin <centos.admin@gmail.com> wrote:
I'm wondering if virtualization could be used as a cheap redundancy solution for situations which can tolerate a certain amount of downtime.
The current recommendation is to run some kind of replication setup such as DRBD. The problem here is cost if there is more than one server (or servers running different OSes) to be backed up. I'd basically need to tell my client to buy, say, 2X machines instead of just X. Not really attractive :D
So I'm wondering if it would be a good idea, or more likely a stupid one, to run X CentOS machines with VMware, each running a single guest instance of CentOS and, in at least one case, Windows for MSSQL.
So if any of the machines physically fails for whatever reason not related to disk, I'll just transfer the disks to one of the surviving servers or a cold standby and have things running again within the 30-60 minutes needed to check the filesystem, then mount and copy the image.
I thought I could also rsync the images so that Server 1 backs up Server 2's image file, Server 2 backs up Server 3's, and so on in round-robin fashion to make this even faster. But my reading indicated that rsync would attempt to mirror the whole 60 GB or 80 GB image on any change. Bad idea.
So while this is not real-time HA, in most situations these clients can tolerate an hour's downtime, and the cost of the "redundancy" stays constant no matter how many servers are added to the operation.
Any comments on this, or is it just plain stupid because there are better options that are equally cost-effective?
This is one of the advantages of using VMs, and I'm sure most people are using it for this reason in one way or another. However, there are a few things you need to worry about:
- When the host crashes, the guests will crash too, so you'll be in a recovery situation just as with a physical crash. This is manageable and something you'd have to deal with either way.
- Rsyncing the VMs while they are running leaves the copies in an inconsistent state, which may or may not be worse than a simple crash. One way I have been getting around this is to create a snapshot of the VM before performing the rsync; when bringing up the copy after a crash, revert to that snapshot. That at least gives you a consistent filesystem and memory state, but it can cause issues with network connections, so I usually reboot the VM cleanly after reverting to the snapshot.
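The snapshot-then-rsync routine could be scripted along these lines. This is only a sketch: the VM paths, snapshot name, and destination host are made-up examples, and the vmware-cmd snapshot operations are as I remember them from VMware Server 1.x, so check `vmware-cmd -h` on your install before trusting the exact syntax.

```shell
#!/bin/sh
# Sketch: snapshot each guest (capturing memory for a consistent
# revert point), then rsync the whole VM directory to a neighbour.
# DEST and the /var/lib/vmware layout are examples, not real paths.

DEST="backup-host:/var/vmbackup"

for vmx in /var/lib/vmware/*/*.vmx; do
    vmdir=$(dirname "$vmx")

    # VMware Server 1.x supports only one snapshot per VM, so drop
    # any existing one first (ignore the error if there is none).
    vmware-cmd "$vmx" removesnapshots >/dev/null 2>&1

    # createsnapshot <name> <description> <quiesce> <memory>
    # memory=1 saves guest RAM so the restored copy can revert to a
    # consistent running state.
    vmware-cmd "$vmx" createsnapshot nightly "pre-rsync snapshot" 0 1

    # Ship the VM directory; rsync only sends the changed blocks.
    rsync -a --inplace "$vmdir/" "$DEST/$(basename "$vmdir")/"
done
```

On the recovery host you'd register the copied .vmx, revert to the snapshot, and then reboot the guest cleanly as described above.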
Rsync will not transfer the entire file when working over the network: it scans the whole thing and only sends the changes. If you have --progress enabled it will appear to go through the whole file, but you will see the "speedup" figure come out much higher than for a regular transfer. Be aware, though, that this process can sometimes take more time than doing a full copy on a local network; rsync is meant to conserve bandwidth, not necessarily time. I also suggest you use a gigabit network if you have the option. If not, you could directly link the network ports on two servers and copy straight from one to the other.
If you are looking at VMware Server for this, here are some tips:
- For best performance, search around for "vmware tmpfs". It will dramatically increase the performance of the VMs at the expense of some memory.
- VMware Server seems to be EOL, even though VMware hasn't specifically said so yet.
- There is a bug in VMware with CentOS that causes guests to slowly use more CPU until the whole machine is bogged down. This can be fixed by restarting or suspending/resuming each VM.
- At this point I'd look at ESXi for the free VMware option.
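For reference, the "vmware tmpfs" trick is usually described as backing each guest's memory file with tmpfs so the host stops hammering the disk with it. A rough sketch, with example sizes and paths, and .vmx option names as commonly cited for VMware Server of that era (verify against your version's documentation):

```shell
# Mount a tmpfs for VMware's per-guest memory backing files.
# size= should cover the combined RAM of the running guests.
mkdir -p /tmp/vmware
mount -t tmpfs -o size=2g tmpfs /tmp/vmware

# Make it survive reboots.
echo 'tmpfs  /tmp/vmware  tmpfs  size=2g  0 0' >> /etc/fstab

# Then point each guest at it by adding these lines to its .vmx:
#   mainMem.useNamedFile = "FALSE"
#   tmpDirectory = "/tmp/vmware"
```

The trade-off is exactly as stated above: the guests get much snappier, but that tmpfs space comes straight out of host RAM.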