[CentOS] Unexplained reboots in DRBD82 + OCFS2 setup

Thu Jun 25 15:20:29 UTC 2009
Ian Forde <ian at duckland.org>

On Wed, 2009-06-24 at 07:22 -0700, nate wrote:
> Kris Buytaert wrote:
> >
> >
> > We're trying to setup a dual-primary DRBD environment, with a shared
> > disk with either OCFS2 or GFS.   The environment is a Centos 5.3 with
> > DRBD82 (but also tried with DRBD83 from testing) .
> 
> Both OCFS2 and GFS are meant to be used on SANs with shared storage(same
> LUNs being accessed by multiple servers), I just re-confirmed that DRBD
> is not a shared storage mechanism but just a simple block mirroring
> technology between a couple of nodes(as I originally thought).

Actually, it's both.
http://www.drbd.org/users-guide-emb/ch-fundamentals.html gives the
overview.  It's shared storage with local disk access. And if you're
using Gig-E for the interconnect, it's *fast*. ;)

> I think you are mixing incompatible technologies. Even if you can
> get it working, just seems like a really bad idea.

That functionality is built in.  DRBD fully supports use of OCFS2 on top
of it in dual-primary mode.  See
http://www.drbd.org/users-guide-emb/ch-ocfs2.html
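
For reference, a minimal sketch of the dual-primary part of drbd.conf.
The resource name "r0" is just a placeholder, the per-node sections are
left out, and the exact options should be checked against the guide for
your DRBD version:

```
resource r0 {                 # "r0" is a placeholder resource name
  protocol C;                 # synchronous replication between the nodes
  net {
    allow-two-primaries;      # permit both nodes to be Primary at once
  }
  startup {
    become-primary-on both;   # promote both nodes when DRBD starts
  }
  # per-node "on <hostname> { ... }" sections omitted here
}
```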

> Perhaps what you could do is setup an iSCSI initiator on your DRBD
> cluster, export a LUN to another cluster running OCFS2 or GFS(last I
> checked GFS required at least 3 nodes less than that and the cluster
> goes to read-only mode, I didn't see any minimum requirements for
> OCFS2).

You could do that, but it would probably be overkill.  Too many moving
parts.  You'd also slow things down.  You're talking about app node
-> Gig-E -> OCFS2/GFS cluster -> Gig-E -> iSCSI/DRBD cluster.  I'd
rather have app node -> Gig-E -> OCFS2/DRBD cluster.  And it's *much*
easier to set up.  GFS is a bit of a PITA to set up.  I used to do it for
RH professionally and it's not entirely painless...

> Though the whole concept of DRBD just screams to me crap performance
> compared to a real shared storage system, wouldn't touch it with
> a 50 foot pole myself.

Nah... performance is pretty sweet.  Local disk access, sub-second
resync after rebooting one of the nodes, and the cost is *much* lower
than a "real" shared-storage system... if cost is a factor, I'd
seriously consider trialing the DRBD/OCFS2 combo.