On 25/03/2013 17:49, Digimer wrote:
> On 03/25/2013 08:44 AM, Maurizio Giungato wrote:
>> On 22/03/2013 16:27, Digimer wrote:
>>> On 03/22/2013 11:21 AM, Maurizio Giungato wrote:
>>>> On 22/03/2013 00:34, Digimer wrote:
>>>>> On 03/21/2013 02:09 PM, Maurizio Giungato wrote:
>>>>>> On 21/03/2013 18:48, Maurizio Giungato wrote:
>>>>>>> On 21/03/2013 18:14, Digimer wrote:
>>>>>>>> On 03/21/2013 01:11 PM, Maurizio Giungato wrote:
>>>>>>>>> Hi guys,
>>>>>>>>>
>>>>>>>>> my goal is to create a reliable virtualization environment using
>>>>>>>>> CentOS 6.4 and KVM. I have three nodes and a clustered GFS2.
>>>>>>>>>
>>>>>>>>> The environment is up and working, but I'm worried about its
>>>>>>>>> reliability: if I turn the network interface down on one node to
>>>>>>>>> simulate a crash (for example on the node "node6.blade"):
>>>>>>>>>
>>>>>>>>> 1) GFS2 hangs (processes go into D state) until node6.blade gets
>>>>>>>>> fenced
>>>>>>>>> 2) not only node6.blade gets fenced, but also node5.blade!
>>>>>>>>>
>>>>>>>>> Help me to save my last neurons!
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Maurizio
>>>>>>>>
>>>>>>>> DLM, the distributed lock manager provided by the cluster, is
>>>>>>>> designed to block when a node goes into an unknown state. It does
>>>>>>>> not unblock until that node is confirmed to be fenced. This is by
>>>>>>>> design. GFS2, rgmanager and clustered LVM all use DLM, so they
>>>>>>>> will all block as well.
>>>>>>>>
>>>>>>>> As for why two nodes get fenced, you will need to share more about
>>>>>>>> your configuration.
>>>>>>>
>>>>>>> My configuration is very simple; I attached the cluster.conf and
>>>>>>> hosts files. This is the row I added in /etc/fstab:
>>>>>>> /dev/mapper/KVM_IMAGES-VL_KVM_IMAGES /var/lib/libvirt/images gfs2 defaults,noatime,nodiratime 0 0
>>>>>>>
>>>>>>> I also set fallback_to_local_locking = 0 in lvm.conf (but nothing
>>>>>>> changed).
>>>>>>>
>>>>>>> PS: I had two virtualization environments working like a charm on
>>>>>>> OCFS2, but since CentOS 6.x I'm not able to install it. Is there
>>>>>>> some way to achieve the same results with GFS2? With GFS2 I
>>>>>>> sometimes get a crash after only a "service network restart" [I
>>>>>>> have many interfaces, so this operation takes more than 10
>>>>>>> seconds]; with OCFS2 I never had this problem.
>>>>>>>
>>>>>>> Thanks
>>>>>> I attached my logs from /var/log/cluster/*
>>>>>
>>>>> The configuration itself seems ok, though I think you can safely take
>>>>> qdisk out to simplify things. That's neither here nor there though.
>>>>>
>>>>> This concerns me:
>>>>>
>>>>> Mar 21 19:00:14 fenced fence lama6.blade dev 0.0 agent fence_bladecenter result: error from agent
>>>>> Mar 21 19:00:14 fenced fence lama6.blade failed
>>>>>
>>>>> How are you triggering the failure(s)? The failed fence would
>>>>> certainly help explain the delays. As I mentioned earlier, DLM is
>>>>> designed to block when a node is in an unknown state (failed but not
>>>>> yet successfully fenced).
>>>>>
>>>>> As an aside; I do my HA VMs using clustered LVM LVs as the backing
>>>>> storage behind the VMs. GFS2 is an excellent file system, but it is
>>>>> expensive. Putting your VMs directly on the LVs takes that expense
>>>>> out of the equation.
>>>>
>>>> I used 'service network stop' to simulate the failure; the node gets
>>>> fenced through fence_bladecenter (BladeCenter HW).
>>>>
>>>> Anyway, I took qdisk out, put GFS2 aside, and now I have my VMs on
>>>> LVM LVs. I've been trying for many hours to reproduce the issue:
>>>>
>>>> - only the node where I execute 'service network stop' gets fenced
>>>> - with fallback_to_local_locking = 0 in lvm.conf, the LVM LVs remain
>>>> writable even while fencing takes place
>>>>
>>>> Everything seems to work like a charm now.
>>>>
>>>> I'd like to understand what was happening. I'll keep testing for some
>>>> days before trusting it.
>>>>
>>>> Thank you so much.
>>>> Maurizio
>>>
>>> Testing, testing, testing. It's good that you plan to test before
>>> trusting. I wish everyone had that philosophy!
>>>
>>> The clustered locking for LVM comes into play for activating/
>>> deactivating, creating, deleting, resizing and so on. It does not
>>> affect what happens inside an LV. That's why an LV remains writeable
>>> while a fence is pending. However, I feel this is safe because
>>> rgmanager won't recover a VM on another node until the lost node is
>>> fenced.
>>>
>>> Cheers
>>
>> Thank you very much! The cluster continues to work like a charm.
>> Failure after failure, I mean :)
>>
>> We are not using rgmanager's fault management because it doesn't check
>> memory availability on the destination node, so we prefer to manage
>> that situation with a custom script we wrote.
>>
>> Last questions:
>> - do you have any advice to improve tolerance against network failures?
>> - to avoid having a GFS2 only for the VMs' XML files, I've thought of
>> keeping them synced on each node with rsync. Any alternatives?
>> - if I want only clustered LVM, with no other functions, can you
>> suggest a minimal configuration? (For example, I think rgmanager is
>> not necessary.)
>>
>> Thank you in advance
>
> For network redundancy, I use two switches and bonded (mode=1) links
> with one link going to either switch. This way, losing a NIC or a
> switch won't break the cluster. Details here:
>
> https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Network
>
> Using rsync to keep the XML files in sync is fine, if you really don't
> want to use GFS2.
>
> You do not need rgmanager for clvmd to work. All you need is the base
> cluster.conf (and working fencing, as you've seen).
>
> If you are over-provisioning VMs and need to worry about memory on
> target systems, then you might want to take a look at pacemaker. It's
> in tech preview currently and will replace rgmanager in RHEL 7 (well,
> it's expected to; nothing is guaranteed 'til release day). Pacemaker
> is designed, as I understand it, to handle conditions like yours.
> Further, it is *much* better tested than anything you roll yourself.
> You can use clvmd with pacemaker by tying cman into pacemaker.
>
> digimer

Perfect, I have the same network configuration. On the other cluster I
have four switches, so I could create two bonds with one dedicated to
corosync; that's why I was afraid that a single bond was a bit too
little ;)

Thank you again
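As a footnote for the archives, a minimal cluster.conf along the lines
Digimer describes (base configuration plus working fencing, no
rgmanager) might look something like the sketch below. This is an
untested example, not the configuration used in this thread: the
cluster name, BladeCenter address, credentials and blade port numbers
are placeholders, and the fence_bladecenter attributes should be
verified against the agent's man page for your hardware.

<?xml version="1.0"?>
<cluster name="blade-cluster" config_version="1">
  <clusternodes>
    <clusternode name="node5.blade" nodeid="1">
      <fence>
        <method name="bladecenter">
          <!-- "port" is the blade bay number; 5 is a placeholder -->
          <device name="bc1" port="5"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node6.blade" nodeid="2">
      <fence>
        <method name="bladecenter">
          <device name="bc1" port="6"/>
        </method>
      </fence>
    </clusternode>
    <!-- the third node is configured the same way, with its own
         nodeid and blade port -->
  </clusternodes>
  <fencedevices>
    <!-- BladeCenter management module; IP and credentials are
         placeholders -->
    <fencedevice name="bc1" agent="fence_bladecenter"
                 ipaddr="10.0.0.10" login="USERID" passwd="PASSW0RD"/>
  </fencedevices>
</cluster>

With something like that in place, plus locking_type = 3 in lvm.conf,
starting cman and then clvmd on each node should be all that clustered
LVM needs; rgmanager can be left out entirely, as noted above.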