On 22/06/16 02:10 AM, Tom Robinson wrote:
Hi Digimer,
Thanks for your reply.
On 22/06/16 15:20, Digimer wrote:
On 22/06/16 01:01 AM, Tom Robinson wrote:
Hi,
I have two KVM hosts (CentOS 7) and would like them to operate as High Availability servers, automatically migrating guests when one of the hosts goes down.
My question is: Is this even possible? All the documentation for HA that I've found appears to not do this. Am I missing something?
Very possible. It's all I've done for years now.
https://alteeve.ca/w/AN!Cluster_Tutorial_2
That's for EL 6, but the basic concepts port perfectly. In EL7, just change out cman + rgmanager for pacemaker. The commands change, but the concepts don't. Also, we use DRBD but you can conceptually swap that for "SAN" and the logic is the same (though I would argue that a SAN is less reliable).
In what way is the SAN method less reliable? Am I going to get into a world of trouble going that way?
In the HA space, there should be no single point of failure. A SAN, for all of its redundancies, is a single thing. Google for tales of bad SAN firmware upgrades to get an idea of what I am talking about.
We've found that using DRBD and building clusters in pairs is a far more resilient design. First, you don't put all your eggs in one basket, as it were. So if you have multiple failures and lose a cluster, you lose one pair and the servers it was hosting. Very bad, but less so than losing the storage for all your systems.
Consider this case that happened a couple of years ago:
We had a client, through no malicious intent but a misunderstanding of how "hot swap" worked, walk up to a machine and start ejecting drives. We got in touch and stopped him in very short order, but the damage was done and the node's array was hosed. Certainly not a failure scenario we had ever considered.
DRBD (which is sort of like "RAID 1 over a network") simply marked the local storage as Diskless and routed all reads and writes to the good peer. The hosted VM servers (and the software underpinning them) kept working just fine. We lost the ability to live migrate because we couldn't read from the local disk anymore, but the client continued to operate for about an hour until we could schedule a controlled reboot to move the servers without interrupting production.
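To make the Diskless behaviour concrete, here is a toy model in Python. It is only a sketch of the idea described above, not how DRBD is actually implemented: the class name `MirroredVolume` and its methods are hypothetical, and the only things borrowed from DRBD are the disk-state names "UpToDate" and "Diskless".

```python
class MirroredVolume:
    """Toy two-node mirror (illustration only, not DRBD's real code)."""

    def __init__(self):
        self.local = {}                 # blocks on this node's disk
        self.peer = {}                  # blocks on the peer node's disk
        self.local_state = "UpToDate"   # DRBD-style disk state name

    def fail_local_disk(self):
        # Someone pulled the drives: the local backing store is gone.
        self.local = None
        self.local_state = "Diskless"

    def write(self, block, data):
        # Writes go to both sides while the local disk is healthy,
        # and to the peer only once the local side is Diskless.
        if self.local_state == "UpToDate":
            self.local[block] = data
        self.peer[block] = data

    def read(self, block):
        # Reads are served locally when possible, otherwise from the peer.
        if self.local_state == "UpToDate":
            return self.local[block]
        return self.peer[block]


vol = MirroredVolume()
vol.write(0, b"guest data")
vol.fail_local_disk()
# The guest keeps reading and writing; the I/O is simply served by the peer.
vol.write(1, b"written after the failure")
print(vol.local_state, vol.read(0), vol.read(1))
```

The point of the sketch is that the failure changes where I/O is served from, not whether it is served, which is why the VMs kept running.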
In short, to do HA right, you have to be able to look at every piece of your stack and say "what happens if this goes away?" and design it so that the answer is "we'll recover".
For clients who need the performance of SANs (go big enough and the caching and whatnot of a SAN is superior), we recommend two SANs, each connected to one node, and treat them as DAS from there up.
There is an active mailing list for HA clustering, too:
I've had a brief look at the web-site. Lots of good info there. Thanks!
Clusterlabs is now the umbrella for a collection of different open source HA projects, so it will continue to grow as time goes on.
My configuration so far includes:
- SAN Storage Volumes for raw device mappings for guest VMs (single volume per guest).
- multipathing of iSCSI and Infiniband paths to raw devices
- live migration of guests works
- a cluster configuration (pcs, corosync, pacemaker)
Currently when I migrate a guest, I can all too easily start it up on both hosts! There must be some way to fence these off but I'm just not sure how to do this.
Fencing, exactly.
What we do is create a small /shared/definitions (on gfs2) to host the VM XML definitions and then undefine the VMs from the nodes. This makes the servers disappear from non-cluster-aware tools, like virsh/virt-manager. Pacemaker can still start the servers just fine, and pacemaker, with fencing, makes sure that the server is only ever running on one node at a time.
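A rough sketch of that workflow, written as a Python helper that only builds the command lines and runs nothing. Treat it as an illustration under assumptions rather than the exact procedure used here: the function name `vm_handover_commands`, the VM name `web01`, the `vm_` resource prefix, and the parameter values (`hypervisor`, `migration_transport`) are all hypothetical choices. It also assumes the `ocf:heartbeat:VirtualDomain` resource agent is installed and that working fencing (stonith) has already been configured, which is hardware-specific and not shown.

```python
from pathlib import PurePosixPath


def vm_handover_commands(vm, definitions_dir="/shared/definitions"):
    """Build (but do not run) the commands to hand a VM over to pacemaker."""
    xml_path = str(PurePosixPath(definitions_dir) / f"{vm}.xml")
    return {
        # Where the saved definition should live on the shared filesystem.
        "xml_path": xml_path,
        # Prints the domain XML on stdout; the caller saves it to xml_path.
        "dump": ["virsh", "dumpxml", vm],
        # Removes the persistent definition from the local libvirt.
        "undefine": ["virsh", "undefine", vm],
        # Lets pacemaker start the VM from the shared XML file instead.
        "create": [
            "pcs", "resource", "create", f"vm_{vm}",
            "ocf:heartbeat:VirtualDomain",
            f"config={xml_path}",
            "hypervisor=qemu:///system",   # assumed local system libvirt URI
            "migration_transport=ssh",     # assumed; depends on your setup
            "meta", "allow-migrate=true",  # permit live migration
        ],
    }


plan = vm_handover_commands("web01")
for step in ("dump", "undefine", "create"):
    print(" ".join(plan[step]))
```

Printing the commands first, rather than executing them, makes it easy to review each step on a test VM before touching production guests.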
That sounds simple enough :-P. Although, I wanted to be able to easily open VM consoles, which I do currently through virt-manager. I also use virsh for all kinds of ad-hoc management. Is there an easy way to still have my cake and eat it? We also have a number of Windows VMs. Remote desktop is great but sometimes you just have to have a console.
We use virt-manager, too. It's just fine. Virsh also works just fine. The only real difference is that once the server shuts off, it "vanishes" from those tools. I would say about 75%+ of our clients run some flavour of Windows on our systems and both access and performance are just fine.
We also have an active freenode IRC channel; #clusterlabs. Stop on by and say hello. :)
Will do. I have a bit of reading now to catch up but I'm sure I'll have a few more questions before long.
Kind regards, Tom
Happy to help. If you stop by and don't get a reply, please idle. Folks there span all timezones but are generally good about reading scroll-back and replying.