[CentOS] who uses Lustre in production with virtual machines?

Wed Aug 4 06:57:47 UTC 2010
Emmanuel Noobadmin <centos.admin at gmail.com>

On 8/4/10, Rajagopal Swaminathan <raju.rajsand at gmail.com> wrote:
> Dunno if it's relevant, but are we talking about in-band/power or
> storage fencing issues for STONITH here?
>
> Coz HA, to the best of my knowledge, requires some form of fencing...

Typically yes, but it doesn't necessarily require STONITH, since
there is also the quorum approach: e.g. with 7 nodes in the cluster,
any node that cannot contact at least 3 other nodes considers itself
orphaned and, on reconnect, syncs from the majority. So there's no
STONITH, only temporary fencing until inconsistencies are synced
transparently.
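
Roughly the idea, as an untested Python sketch (the numbers are just
the 7-node example above, not any particular cluster stack's API):

# Just the quorum rule itself: a node keeps serving only while it
# (plus the peers it can still reach) forms a strict majority of the
# cluster; otherwise it stops writing and resyncs from the majority
# once it reconnects.

def has_quorum(cluster_size, reachable_peers):
    return (reachable_peers + 1) > cluster_size // 2

# 7-node cluster: a node that can still reach 3 others (4 of 7) keeps
# quorum...
assert has_quorum(7, 3)
# ...but one that only sees 2 others (3 of 7) treats itself as orphaned.
assert not has_quorum(7, 2)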

Unfortunately, Gluster does not have a quorum mechanism as it stands.
Otherwise, combined with its self-healing characteristics, it would
be ideal for HA storage.

As it is, from my understanding, Gluster will block access to
ambiguous (split-brain) files until someone manually deletes all but
the desired copy. That might not really be an issue unless split-brain
rates are high. With redundant network switches/paths it might never
come up at all, since two isolated nodes should never both stay alive
and writable by guest systems, which is what it would take to end up
with two updated but different copies of the same file.

> and split-brain is the result of a failure of the HA mechanism.
>
> <bit confused ?? !! ???>
>
> Please correct me if I am wrong in any of my understanding..
>
> I am in a learning phase.

I'm also in the learning process, so don't take my word on this either :)

> But then, is ZFS a cluster filesystem at all, like GFS2/OCFS?
> Haven't studied that angle yet.

ZFS is a local file system as far as I understand it. It comes from
Solaris, but there are two efforts to port it to Linux: one in
userspace via FUSE and the other as a native kernel port. The FUSE
port seems more mature and, at the moment, slightly more desirable
from my POV because there's no messing around with the kernel or
recompiling needed.

The main thing for me is that ZFS checksums every block of data and
metadata, so it would catch "soft" hardware errors such as a flaky
data cable silently corrupting data without any immediately
observable effect.
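
To illustrate what I mean, here's a toy Python version of the
checksum-on-read idea. It's nothing like ZFS's actual on-disk layout
(ZFS keeps its checksums in the block pointers); it just shows why
silent corruption gets caught instead of handed back to the
application:

import hashlib

def write_block(data):
    # record a checksum at write time, alongside the data
    return data, hashlib.sha256(data).hexdigest()

def read_block(data, stored_checksum):
    # recompute on read; bits flipped in transit by a flaky cable or
    # controller show up as a mismatch instead of as "good" data
    if hashlib.sha256(data).hexdigest() != stored_checksum:
        raise IOError("checksum mismatch: silent corruption detected")
    return data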

It also has RAID functionality, but I've seen various reports of
failed zpools that couldn't be easily recovered. So my most likely
configuration is glusterfs on top of ZFS (for the checksumming) on
top of mdraid RAID 1 (for redundancy and ease of recovery).