[CentOS] Question about clustering

Alessandro Baggi

alessandro.baggi at gmail.com
Wed Jun 18 16:32:27 UTC 2014


On 17/06/2014 16:32, Digimer wrote:
> On 17/06/14 10:23 AM, Denniston, Todd A CIV NAVSURFWARCENDIV Crane wrote:
>>> -----Original Message-----
>>> From: Digimer [mailto:lists at alteeve.ca]
>>> Sent: Monday, June 16, 2014 3:20 PM
>>> To: CentOS mailing list
>>> Subject: Re: [CentOS] Question about clustering
>>>
>>> On 16/06/14 02:55 PM, m.roth at 5-cent.us wrote:
>> <SNIP>
>>>> One can also set the cluster nodes to failover, and when the failed node
>>>> comes up, to *not* try to take back the services, leaving it in a state
>>>> for you to fix it.
>>>>
>>>>            mark, first work on h/a clusters 1997-2001
>>>
>>> Failover and recovery are secondary to fencing. The surviving node(s)
>>> can't begin recovery until the lost node is in a known state. To make an
>>> assumption about the node's state (by, for example, assuming that no
>>> access to the node is sufficient to determine it is off) is to risk a
>>> split-brain. Even something as relatively "minor" as a floating IP can
>>> potentially cause problems with ARP, for example.
>>>
>>> Cheers
>>
>> Having operated a file serving cluster for a few years (~2001-2006) without ANY fencing device, I can tell you that it causes split-brain in the admins too, i.e., I AGREE.
>
> To which I can use the analogy that in the 18 years I've driven a car,
> I've never needed my seat belt or airbags. I still put my seatbelt on
> every time I go anywhere though, and I won't buy a car without airbags. ;)
>
>> Earlier, Alessandro Baggi wrote:
>>> there is a chance to make fencing without hardware, but only software?
>> To which Digimer answered: No. <SNIP info about fence device independence>
>>
>> However, there is an *Almost* software only fence.
>
> If your goal is high-availability, there is a strong argument that
> "almost" isn't enough.
>
>> Unfortunately for me, I learned about (or at least understood) the stonith devices late in the above system's life. I expect even meatware stonith[1] could have saved me considerable pain five or six times.
>
> Manual fencing was dropped as a supported fence method in RHEL 6 because
> it was too prone to human mistakes. When an HA cluster is hung and an
> admin who might not have touched the cluster in months has users and
> managers yelling at them, mistakes with potentially massive consequences
> happen.
>
> Manual fencing is just not safe.
>
>> Understand that I am not recommending meatware stonith as a good operational stonith device (see [2] for how much subtle understanding the meat has to have), but it would be much better than NO operational stonith device.
>
> Bingo on the meat, disagree on "no stonith" at all. A cluster must have
> fencing.
>
>> [1] http://clusterlabs.org/doc/crm_fencing.html#_meatware
>> [2] http://oss.clusterlabs.org/pipermail/pacemaker/2011-June/010693.html
>>
>> Even when this disclaimer is not here:
>> I am not a contracting officer. I do not have authority to make or modify the terms of any contract.
>
> Cheers
>

OK, so fencing is a requirement for a cluster to handle hardware failure.
I have another question on this topic, but about software failure.
Suppose I have a cluster of httpd installations on 6 virtualized
hosts, each one on a different physical server. Suppose also that one
guest (named host6) has a problem and can't start Apache. In this
scenario, IPMI and a UPS are unnecessary. How does fencing work in this
case? How do I fence that node?
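
To check my understanding of the rule quoted above (a lost node must be
put in a known state before the survivors recover its services), here is
a minimal Python sketch. It is not the code of any real cluster stack;
the function name recover_services and the fence_agent callable are
hypothetical and used only for illustration.

```python
def recover_services(node, services, fence_agent):
    """Return the services that may safely be restarted on surviving nodes.

    fence_agent is a hypothetical callable (an assumption for this sketch):
    it returns True only when the lost node is confirmed to be in a known
    state (for example, powered off).
    """
    if not fence_agent(node):
        # The node's state is unknown, so starting its services elsewhere
        # would risk a split-brain. Recover nothing.
        return []
    # The fence succeeded: the node is in a known state, recovery may begin.
    return list(services)


if __name__ == "__main__":
    # Fence not confirmed: nothing is recovered.
    print(recover_services("host6", ["httpd"], lambda n: False))
    # Fence confirmed: httpd may be started elsewhere.
    print(recover_services("host6", ["httpd"], lambda n: True))
```

The point of the sketch is only the ordering: simply losing contact with
host6 never counts as a successful fence.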

Thanks in advance.

Alessandro.


