Re: [CentOS] Question about clustering

18 Jun 2014


      On 17/06/2014 16:32, Digimer wrote:
...
On 17/06/14 10:23 AM, Denniston, Todd A CIV NAVSURFWARCENDIV Crane wrote:
...
...
-----Original Message-----
From: Digimer [mailto:lists@alteeve.ca]
Sent: Monday, June 16, 2014 3:20 PM
To: CentOS mailing list
Subject: Re: [CentOS] Question about clustering
On 16/06/14 02:55 PM, m.roth@5-cent.us wrote:
<SNIP>
>> One can also set the cluster nodes to failover, and when the failed node
>> comes up, to *not* try to take back the services, leaving it in a state
>> for you to fix it.
>>
>>            mark, first work on h/a clusters 1997-2001
>
> Failover and recovery are secondary to fencing. The surviving node(s)
> can't begin recovery until the lost node is in a known state. To make an
> assumption about the node's state (by, for example, assuming that no
> access to the node is sufficient to determine it is off) is to risk a
> split-brain. Even something as relatively "minor" as a floating IP can
> potentially cause problems with ARP, for example.
>
> Cheers
Having operated a file serving cluster for a few years (~2001-2006) without ANY fencing device, I can tell you that it causes split-brain in the admins too, i.e., I AGREE.
To which I can use the analogy that in the 18 years I've driven a car,
I've never needed my seat belt or airbags. I still put my seatbelt on
every time I go anywhere though, and I won't buy a car without airbags. ;)
...
Earlier, Alessandro Baggi wrote:
...
there is a chance to make fencing without hardware, but only software?
To which Digimer answered: No. <SNIP info about fence device independence>
However, there is an *Almost* software only fence.
If your goal is high-availability, there is a strong argument that
"almost" isn't enough.
...
Unfortunately for me, I learned about (or at least understood) the stonith devices late in the above system's life. I expect even meatware stonith[1] could have saved me considerable pain five or six times.
Manual fencing was dropped as a supported fence method in RHEL 6 because
it was too prone to human mistakes. When an HA cluster is hung and an
admin who might not have touched the cluster in months has users and
managers yelling at them, mistakes with potentially massive consequences
happen.
Manual fencing is just not safe.
...
Understand that I am not recommending meatware stonith as a good operational stonith device (see [2] for how much subtle understanding the meat has to have), but it would be much better than NO operational stonith device.
Bingo on the meat, disagree on "no stonith" at all. A cluster must have
fencing.
...
[1] http://clusterlabs.org/doc/crm_fencing.html#_meatware
[2] http://oss.clusterlabs.org/pipermail/pacemaker/2011-June/010693.html
Even when this disclaimer is not here:
I am not a contracting officer. I do not have authority to make or modify the terms of any contract.
Cheers
OK, so fencing is a requirement for a cluster in the case of hardware failure.
I have another question on this topic, but about software failure.
Suppose I have a cluster of httpd installations on 6 virtualized
hosts, each on a different server. Suppose also that one guest (named
host6) has a problem and can't start apache. In this scenario, IPMI
and UPS fence devices are unnecessary. How does fencing work in this
case? How do I fence the node?
Thanks in advance.
Alessandro.
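
For readers of the archive, the ordering Digimer describes earlier in the thread (the lost node must be in a known state before the surviving nodes begin recovery) can be sketched as below. This is only an illustration in Python, not how any real cluster manager is implemented: NodeState, FenceError, recover_services and the fence_device callback are all hypothetical names invented for this sketch, and the callback stands in for whatever fence agent a cluster actually uses.

```python
from enum import Enum


class NodeState(Enum):
    ONLINE = "online"
    UNKNOWN = "unknown"  # unreachable; it may still be running and writing
    FENCED = "fenced"    # confirmed off or isolated by a fence device


class FenceError(Exception):
    """Raised when the fence action could not be confirmed."""


def recover_services(node, states, fence_device, services):
    """Return the services that are safe to restart on surviving nodes.

    fence_device is a hypothetical callable: it takes a node name and
    returns True only when the node is confirmed off or isolated.
    """
    if states[node] is NodeState.ONLINE:
        return []  # nothing was lost, so nothing needs recovering

    if states[node] is NodeState.UNKNOWN:
        # Losing contact is not proof that the node is off.
        # Recovery stays blocked until the fence is confirmed.
        if not fence_device(node):
            raise FenceError(f"could not confirm fence of {node}")
        states[node] = NodeState.FENCED

    # The node is now in a known state, so recovery may begin.
    return list(services)
```

With a callback that always fails, recover_services raises FenceError and leaves the node marked UNKNOWN, which mirrors the point made above that a cluster without working fencing has to block rather than guess.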
