[CentOS] drbd

Sun Oct 12 20:04:29 UTC 2014
Digimer <lists at alteeve.ca>

On 12/10/14 03:55 PM, John R Pierce wrote:
> On 10/12/2014 12:30 PM, Digimer wrote:
>> On 12/10/14 02:52 PM, John R Pierce wrote:
>>> On 10/12/2014 9:30 AM, Digimer wrote:
>>>> I can't speak to backuppc, but I am curious how you're managing the
>>>> resources. Are you using cman + rgmanager or pacemaker?
>>>
>>> strictly manually, with drbdadm and such.   if the primary backup server
>>> ever fails, I'll bring up the backup by hand.
>>
>> OK. So what's your start-up procedure? How should things start
>> automatically then and where exactly is backuppc hanging up? Is it
>> trying to start before DRBD is Primary?
>
> Just autostarting the daemon with chkconfig on, didn't do anything else
> special.     when I reboot the master, it's coming up
> secondary/secondary, and I have to manually kick it to primary, then
> mount the file system, then start the backuppc service.

OK, so you might want to set 'become-primary-on <master node>' in the
'startup' section. Then, when DRBD starts, the named node should promote
itself to Primary.
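
As a rough sketch, the 'startup' section of the 'main' resource from the
dump quoted below might look like this with the option added. It assumes
sg1.domain.com is the node meant to be the master, which the thread does
not actually state, so substitute the right hostname:

```
startup {
    wfc-timeout        0;
    degr-wfc-timeout 120;
    # Assumption: sg1.domain.com is the intended master node.
    become-primary-on sg1.domain.com;
}
```

This only covers promotion. Mounting the file system and starting the
backuppc service would still have to happen after DRBD is Primary.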

>>> its the drbd8.3 package from elrepo.
>>
>> So 8.3.16?
>
> yeah.  drbd83-utils-8.3.16-1.el6.elrepo.x86_64
>
>
>> syncer { rate 200M; }
>>>     }
>
> it is fairly fast storage, and gigE.  I had temporarily tweaked that up
> to 200M to speed up the verify process, it was set to 50M before, and I
> just now set it back.

If it's only 1 Gbps, then the maximum sustainable write speed is ~120
MB/sec (at most; the slower of network or disk determines the max
speed). You want the sync rate to be ~30% of maximum speed, or else you
will choke out the apps using the DRBD resource and cause them to suffer
a huge performance hit. While a background sync is running, the write
speed left for the apps is roughly (max write - sync rate). This is why
the default speed is very low.

In your case, you've set the sync rate way above what is possible, and 
I've seen this dramatically hurt performance and even full-on stall the 
sync operation. I would set this to '40M' at most.
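
To make the arithmetic concrete: 30% of ~120 MB/sec is ~36 MB/sec, which
is where the '40M' ceiling comes from. A minimal sketch of the matching
setting, assuming the rate stays in the 'common' section as in the dump
quoted below:

```
common {
    syncer {
        # ~30% of the ~120 MB/sec a 1 Gbps link can sustain
        rate 40M;
    }
}
```

With that rate, a running background sync would leave the apps roughly
120 - 40 = 80 MB/sec of write speed, by the rule of thumb above. After
editing the file on both nodes, 'drbdadm adjust main' should apply the
change to the running resource.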

>> What does 'drbdadm dump' show? That will give a better idea of the
>> actual setup. I'm interested specifically in the start parameters (ie:
>> become-primary-on, etc).
>
>
> # /etc/drbd.conf
> common {
>      syncer {
>          rate             50M;

This contradicts your configuration; I wonder if DRBD dropped it for you.

>      }
> }
>
> # resource main on svfis-sg1.netsys.stsv.seagate.com: not ignored, not stacked
> resource main {
>      protocol               C;
>      on sg1.domain.com {
>          device           /dev/drbd0 minor 0;
>          disk             /dev/vg_sg1data/lvdata;
>          address          ipv4 10.x.x.70:7788;
>          meta-disk        internal;
>      }
>      on sg2.domain.com {
>          device           /dev/drbd0 minor 0;
>          disk             /dev/vg_sg2data/lvcopy;
>          address          ipv4 10.x.x.71:7788;
>          meta-disk        internal;
>      }
>      disk {
>          on-io-error      detach;
>      }
>      syncer {
>          verify-alg       crc32c;
>      }
>      startup {
>          wfc-timeout        0;
>          degr-wfc-timeout 120;
>      }
> }

That's it? Nothing more? I would expect a lot more, normally. In any
case, given it's a totally manual config, I suppose that is enough.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?