[CentOS] Virtualization as cheap redundancy option?

On Fri, Jun 25, 2010 at 9:04 AM, Emmanuel Noobadmin
<centos.admin at gmail.com> wrote:
> I'm wondering if virtualization could be used as a cheap redundancy
> solution for situations which can tolerate a certain amount of
> downtime.
>
> Current recommendations is to run some kind of replication server such
> as DRBD. The problem here is cost if there are more than one server
> (or servers running on different OS) to be backed up. I'd basically
> need to tell my client they need to buy say 2X machines instead of
> just X. Not really attractive :D
>
> So I'm wondering if it would be a good, or likely stupid idea, to run
> X CentOS machines with VMware. Each running a single instance of
> CentOS and in at least one case of Windows for MSSQL.
>
> So if any of the machines physically fails for whatever reasons not
> related to disk. I'll just transfer the disk to one of the surviving
> server or a cold standby and have things running again within say
> 30~60 minutes needed to check the filesystem, then mount and copy the
> image.
>
> I thought I could also rsync the images so that Server 1 backs up
> Server 2 image file and Server 2 backs up Server 3 etc in a round
> robin fashion to make this even faster. But reading up indicates that
> rsync would attempt to mirror the whole 60gb or 80gb image on any
> change. Bad idea.
>
> So while this is not real time HA but in most situations, they can
> tolerate an hour's downtime. The cost of the "redundancy" also stays
> constant  no matter how many servers are added to the operation.
>
> Any comments on this or is it like just plain stupid because there are
> better options that are equally cost effective?
>

This is one of the advantages of using VMs, and I'm sure most people
are using them for this reason in one way or another.  However, there
are a few things you need to worry about:

- When the host crashes, the guests go down with it, so you'll be in a
recovery situation just like for a physical crash.  This is manageable
and something you'd have to deal with either way.

- Rsyncing the VMs while they are running leaves the copies in an
inconsistent state.  This state may or may not be worse than a simple
crash situation.  One way I have been getting around this is by
creating a snapshot of the VM before performing the rsync, and when
bringing up the copy after a crash, reverting to the snapshot.  That
will at least give you a consistent filesystem and memory state, but
could cause issues with network connections.  I usually reboot the VM
cleanly after reverting to the snapshot.
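
As a rough sketch of the snapshot-then-rsync order described above,
the Python below only builds the command lines and runs nothing.  The
paths, the destination host and the snapshot name "pre-rsync" are made
up for illustration, and the vmrun arguments are an assumption: VMware
Server normally also wants host type and login options in front of
the subcommand, so check them against your own install.

```python
def build_backup_commands(vmx_path, image_dir, dest, snapshot_name="pre-rsync"):
    """Return the commands for one backup pass, in the order to run them.

    All arguments are hypothetical examples; nothing is executed here.
    """
    # 1. Snapshot first, so the files on disk hold a consistent state.
    snapshot_cmd = ["vmrun", "snapshot", vmx_path, snapshot_name]
    # 2. Then copy the VM directory.  The trailing slash makes rsync copy
    #    the directory contents rather than the directory itself.
    source = image_dir.rstrip("/") + "/"
    rsync_cmd = ["rsync", "-a", "--progress", source, dest]
    return [snapshot_cmd, rsync_cmd]


def build_restore_command(vmx_path, snapshot_name="pre-rsync"):
    """Return the command that reverts the copied VM to the snapshot.

    Rebooting the guest cleanly afterwards is left as a manual step.
    """
    return ["vmrun", "revertToSnapshot", vmx_path, snapshot_name]
```

Each returned list can be handed to subprocess.run(cmd, check=True)
once the arguments match your setup.  Building the commands separately
from running them makes it easy to print and review them first.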

Rsync will not transfer the entire file when transferring over the
network.  It scans the whole thing on both ends and only sends the
changes.  (Between two local paths it defaults to --whole-file and
really does copy everything.)  If you have --progress enabled it will
appear to go through the whole file, but you will see the "speedup" go
much higher than a regular transfer.  However, sometimes this process
can take more time than doing a full copy on a local network.  Rsync
is meant to conserve bandwidth, not necessarily time.  Also, I suggest
that you use a gigabit network if you have the option.  If not, you
could directly link the network ports on 2 servers and copy straight
from one to the other.
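
To illustrate why only the changes go over the wire even though the
whole file gets read, here is a deliberately simplified Python sketch.
It compares fixed-position blocks by checksum, which is not rsync's
real algorithm (rsync uses a rolling checksum so it can also match
data that has shifted position), and the 4096-byte block size and the
function names are arbitrary choices for the example.  The "speedup"
it computes ignores the checksum traffic that rsync counts.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative only; rsync chooses its own block size


def block_digests(data, block_size=BLOCK_SIZE):
    """Checksum every block; note that this reads the whole file."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Indexes of blocks in `new` that differ from the same block in `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [i for i, digest in enumerate(new_digests)
            if i >= len(old_digests) or digest != old_digests[i]]


def speedup(old, new, block_size=BLOCK_SIZE):
    """File size divided by the bytes that would actually be sent."""
    sent = sum(len(new[i * block_size:(i + 1) * block_size])
               for i in changed_blocks(old, new, block_size))
    return len(new) / sent if sent else float("inf")


# Ten blocks, one byte modified inside the second block.
old_image = bytes(10 * BLOCK_SIZE)
new_image = bytearray(old_image)
new_image[5000] = 1
new_image = bytes(new_image)
```

Here changed_blocks(old_image, new_image) gives [1] and
speedup(old_image, new_image) gives 10.0: every block was checksummed,
but only one of the ten would be sent.  That is also why the scan can
take longer than a plain copy on a fast local network.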

If you are looking at VMware Server for this, here are some tips:
- For best performance, search around for "vmware tmpfs".  It will
dramatically increase the performance of the VMs at the expense of
some memory.
- VMware Server seems like it's EOL, even though VMware hasn't
specifically said so yet.
- There is a bug in VMware with CentOS that causes guests to slowly
use more CPU until the whole machine is bogged down.  This can be
fixed by restarting, or suspending and resuming, each VM.
- At this point I'd look at ESXi for the free VMware option.