so I've had a drbd replica running for a while of a 16TB raid that's used as a backuppc repository.
when I have rebooted the backuppc server, the replica doesn't seem to auto-restart until I do it manually, and the backuppc /data file system on this 16TB LUN doesn't seem to automount, either.
I've rebooted this thing a few times in the 18 months or so it's been running... not always cleanly...
anyway, I started a drbd verify (from the slave) about 10 hours ago; it has 15 hours more to run, and so far it's logged...
Oct 11 13:58:26 sg2 kernel: block drbd0: Starting Online Verify from sector 3534084704
Oct 11 14:00:23 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967295
Oct 11 14:00:29 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967294
Oct 11 14:00:35 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967293
Oct 11 14:00:41 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967292
Oct 11 14:01:16 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967295
Oct 11 14:02:05 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967295
Oct 11 14:02:11 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967294
Oct 11 14:02:17 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967293
Oct 11 14:33:41 sg2 kernel: block drbd0: Out of sync: start=3932979480, size=8 (sectors)
Oct 11 14:34:46 sg2 kernel: block drbd0: Out of sync: start=3946056120, size=8 (sectors)
Oct 11 15:37:07 sg2 kernel: block drbd0: Out of sync: start=4696809024, size=8 (sectors)
Oct 11 17:08:15 sg2 kernel: block drbd0: Out of sync: start=6084949528, size=8 (sectors)
Oct 11 17:30:53 sg2 kernel: block drbd0: Out of sync: start=6567543472, size=8 (sectors)
Oct 11 17:59:04 sg2 kernel: block drbd0: Out of sync: start=7169767896, size=8 (sectors)
Oct 11 20:00:50 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967295
Oct 11 20:01:09 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967295
Oct 11 20:01:15 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967294
Oct 11 20:01:29 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967295
Oct 11 20:29:18 sg2 kernel: block drbd0: Out of sync: start=10362907296, size=8 (sectors)
Oct 11 20:29:54 sg2 kernel: block drbd0: Out of sync: start=10375790488, size=8 (sectors)
Oct 11 21:01:51 sg2 kernel: block drbd0: [drbd0_worker/2197] sock_sendmsg time expired, ko = 4294967295
Oct 11 21:42:15 sg2 kernel: block drbd0: Out of sync: start=11907921096, size=8 (sectors)
Oct 11 21:43:38 sg2 kernel: block drbd0: Out of sync: start=11937086248, size=8 (sectors)
Oct 11 21:44:00 sg2 kernel: block drbd0: Out of sync: start=11944705032, size=8 (sectors)
Oct 11 21:49:26 sg2 kernel: block drbd0: Out of sync: start=12062270432, size=8 (sectors)
Oct 11 22:07:10 sg2 kernel: block drbd0: Out of sync: start=12440235128, size=8 (sectors)
Oct 11 22:58:54 sg2 kernel: block drbd0: Out of sync: start=13548501984, size=8 (sectors)
Oct 11 23:23:17 sg2 kernel: block drbd0: Out of sync: start=14072873320, size=8 (sectors)

$ date
Sat Oct 11 23:28:11 PDT 2014
it's 35% done at this point... 15 4K blocks wrong out of a third of 16TB isn't a lot, but it's still more than I like to see.
$ cat /proc/drbd
version: 8.3.15 (api:88/proto:86-97)
GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by phil@Build64R6, 2012-12-20 20:09:51
 0: cs:VerifyS ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:105707 dw:187685496 dr:654444832 al:0 bm:1 lo:107 pe:2104 ua:435 ap:0 ep:1 wo:f oos:60
        [=====>..............] verified: 34.6% (9846140/15051076)M
        finish: 14:55:27 speed: 187,648 (155,708) want: 204,800 K/sec
really, if I let this complete, then disconnect/reconnect the replica, it will repair these glitches? I'm gathering I should schedule these verifies weekly or something.
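if so, I'm guessing the procedure once the verify finishes is roughly this (just a sketch; 'main' is my resource name, from the config I'll post below):

    # on the node that initiated the verify, after it completes:
    drbdadm disconnect main
    drbdadm connect main
    # on reconnect, drbd should resync just the blocks flagged out-of-sync
    cat /proc/drbd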
On 10/11/2014 11:30 PM, John R Pierce wrote:
so I've had a drbd replica running for a while of a 16TB raid that's used as a backuppc repository.
oh, this is running on a pair of CentOS 6.latest boxes, each dual Xeon X5650 w/ 48GB RAM, with an LSI SAS2 RAID card hooked up to a whole lotta SAS/SATA drives.
On 12/10/14 04:07 AM, John R Pierce wrote:
On 10/11/2014 11:30 PM, John R Pierce wrote:
so I've had a drbd replica running for a while of a 16TB raid that's used as a backuppc repository.
oh, this is running on a pair of CentOS 6.latest boxes, each dual Xeon X5650 w/ 48GB RAM, with an LSI SAS2 RAID card hooked up to a whole lotta SAS/SATA drives.
What version of DRBD? If you're using a cluster resource manager, which one and what version? How is DRBD configured and, if you are using a resource manager, what is its config?
On 12/10/14 02:30 AM, John R Pierce wrote:
so I've had a drbd replica running for a while of a 16TB raid that's used as a backuppc repository.
when I have rebooted the backuppc server, the replica doesn't seem to auto-restart until I do it manually, and the backuppc /data file system on this 16TB LUN doesn't seem to automount, either.
I've rebooted this thing a few times in the 18 months or so it's been running... not always cleanly...
anyway, I started a drbd verify (from the slave) about 10 hours ago; it has 15 hours more to run, and so far it's logged...
[verify log snipped; see original post above]
it's 35% done at this point... 15 4K blocks wrong out of a third of 16TB isn't a lot, but it's still more than I like to see.
[/proc/drbd output snipped; see original post above]
really, if I let this complete, then disconnect/reconnect the replica, it will repair these glitches? I'm gathering I should schedule these verifies weekly or something.
That the backing device of one node fell out of sync is a cause for concern. A weekly scan might be a bit much, but monthly or so isn't unreasonable. Of course, as you're seeing here, it's a lengthy process; it consumes non-trivial amounts of bandwidth and adds a fair load to the disks.
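If you do decide to schedule them, a cron entry along these lines on one node would do it (a sketch only; the resource name 'main' and the schedule are assumptions):

    # /etc/cron.d/drbd-verify (hypothetical)
    # start an online verify of resource "main" at 01:00 on the 1st of each month
    0 1 1 * * root /sbin/drbdadm verify main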
How long was it in production before this verify?
I can't speak to backuppc, but I am curious how you're managing the resources. Are you using cman + rgmanager or pacemaker?
On 10/12/2014 9:30 AM, Digimer wrote:
I can't speak to backuppc, but I am curious how you're managing the resources. Are you using cman + rgmanager or pacemaker?
strictly manually, with drbdadm and such. if the primary backup server ever fails, I'll bring up the backup by hand.
On 10/12/2014 9:31 AM, Digimer wrote:
What version of DRBD? If you're using a cluster resource manager, which one and what version? How is DRBD configured and, if you are using a resource manager, what is its config?
it's the drbd8.3 package from elrepo.
/etc/drbd.d/global_common.conf:
global { usage-count no; }
common { syncer { rate 200M; } }
/etc/drbd.d/main.res:
resource main {
    protocol C;
    startup { wfc-timeout 0; degr-wfc-timeout 120; }
    disk { on-io-error detach; }
    syncer { verify-alg crc32c; }
    on sg1.domain.com {
        device    /dev/drbd0;
        disk      /dev/vg_sg1data/lvdata;
        meta-disk internal;
        address   10.5.160.70:7788;
    }
    on sg2.domain.com {
        device    /dev/drbd0;
        disk      /dev/vg_sg2data/lvcopy;
        meta-disk internal;
        address   10.5.160.71:7788;
    }
}
On 12/10/14 02:52 PM, John R Pierce wrote:
On 10/12/2014 9:30 AM, Digimer wrote:
I can't speak to backuppc, but I am curious how you're managing the resources. Are you using cman + rgmanager or pacemaker?
strictly manually, with drbdadm and such. if the primary backup server ever fails, I'll bring up the backup by hand.
OK. So what's your start-up procedure? How are things supposed to start automatically, and where exactly is backuppc hanging up? Is it trying to start before DRBD is Primary?
On 10/12/2014 9:31 AM, Digimer wrote:
What version of DRBD? If you're using a cluster resource manager, which one and what version? How is DRBD configured and, if you are using a resource manager, what is its config?
it's the drbd8.3 package from elrepo.
So 8.3.16?
/etc/drbd.d/global_common.conf:
global { usage-count no; }
common { syncer { rate 200M; } }
Is this a 10 Gbps + very fast storage setup? If not, that is probably *way* too high.
/etc/drbd.d/main.res:
resource main {
    protocol C;
    startup { wfc-timeout 0; degr-wfc-timeout 120; }
    disk { on-io-error detach; }
    syncer { verify-alg crc32c; }
    on sg1.domain.com {
        device    /dev/drbd0;
        disk      /dev/vg_sg1data/lvdata;
        meta-disk internal;
        address   10.5.160.70:7788;
    }
    on sg2.domain.com {
        device    /dev/drbd0;
        disk      /dev/vg_sg2data/lvcopy;
        meta-disk internal;
        address   10.5.160.71:7788;
    }
}
What does 'drbdadm dump' show? That will give a better idea of the actual setup. I'm interested specifically in the start parameters (ie: become-primary-on, etc).
On 10/12/2014 12:30 PM, Digimer wrote:
On 12/10/14 02:52 PM, John R Pierce wrote:
On 10/12/2014 9:30 AM, Digimer wrote:
I can't speak to backuppc, but I am curious how you're managing the resources. Are you using cman + rgmanager or pacemaker?
strictly manually, with drbdadm and such. if the primary backup server ever fails, I'll bring up the backup by hand.
OK. So what's your start-up procedure? How should things start automatically then and where exactly is backuppc hanging up? Is it trying to start before DRBD is Primary?
Just autostarting the daemon with chkconfig on; didn't do anything else special. When I reboot the master, it's coming up Secondary/Secondary, and I manually have to kick it to Primary, then mount the file system, then start the backuppc service.
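in other words, after a reboot I end up running roughly this by hand (/data is where backuppc keeps its pool):

    drbdadm primary main
    mount /dev/drbd0 /data
    service backuppc start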
it's the drbd8.3 package from elrepo.
So 8.3.16?
yeah. drbd83-utils-8.3.16-1.el6.elrepo.x86_64
syncer { rate 200M; }
}
it is fairly fast storage, and GigE. I had temporarily tweaked that up to 200M to speed up the verify process; it was set to 50M before, and I've just now set it back.
What does 'drbdadm dump' show? That will give a better idea of the actual setup. I'm interested specifically in the start parameters (ie: become-primary-on, etc).
# /etc/drbd.conf
common {
    syncer {
        rate 50M;
    }
}

# resource main on svfis-sg1.netsys.stsv.seagate.com: not ignored, not stacked
resource main {
    protocol C;
    on sg1.domain.com {
        device       /dev/drbd0 minor 0;
        disk         /dev/vg_sg1data/lvdata;
        address      ipv4 10.x.x.70:7788;
        meta-disk    internal;
    }
    on sg2.domain.com {
        device       /dev/drbd0 minor 0;
        disk         /dev/vg_sg2data/lvcopy;
        address      ipv4 10.x.x.71:7788;
        meta-disk    internal;
    }
    disk {
        on-io-error  detach;
    }
    syncer {
        verify-alg   crc32c;
    }
    startup {
        wfc-timeout       0;
        degr-wfc-timeout 120;
    }
}
On 12/10/14 03:55 PM, John R Pierce wrote:
On 10/12/2014 12:30 PM, Digimer wrote:
On 12/10/14 02:52 PM, John R Pierce wrote:
On 10/12/2014 9:30 AM, Digimer wrote:
I can't speak to backuppc, but I am curious how you're managing the resources. Are you using cman + rgmanager or pacemaker?
strictly manually, with drbdadm and such. if the primary backup server ever fails, I'll bring up the backup by hand.
OK. So what's your start-up procedure? How are things supposed to start automatically, and where exactly is backuppc hanging up? Is it trying to start before DRBD is Primary?
Just autostarting the daemon with chkconfig on; didn't do anything else special. When I reboot the master, it's coming up Secondary/Secondary, and I manually have to kick it to Primary, then mount the file system, then start the backuppc service.
Ok, so you might want to set 'become-primary-on <master node>'. Then the named node should promote itself to Primary automatically on start.
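Something like this in the resource's startup section, as a sketch (assuming sg1 is the node you want promoted):

    startup {
        wfc-timeout       0;
        degr-wfc-timeout  120;
        # promote this node to Primary automatically when drbd starts
        become-primary-on sg1.domain.com;
    }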
it's the drbd8.3 package from elrepo.
So 8.3.16?
yeah. drbd83-utils-8.3.16-1.el6.elrepo.x86_64
syncer { rate 200M; }
}
it is fairly fast storage, and GigE. I had temporarily tweaked that up to 200M to speed up the verify process; it was set to 50M before, and I've just now set it back.
If it's only 1 Gbps, then the maximum sustainable write speed is ~120 MB/sec (at most; the slower of network or disk determines the max speed). You want the sync rate to be ~30% of maximum speed, or else you will choke out the apps using the DRBD resource and cause them to suffer a huge performance hit. App performance is (max write - sync rate) when a background sync is performed. This is why the default speed is very low.
In your case, you've set the sync rate way above what is possible, and I've seen this dramatically hurt performance and even full-on stall the sync operation. I would set this to '40M' at most.
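To put numbers on it: GigE tops out around 120 MB/sec, and 30% of that is roughly 36-40 MB/sec, hence:

    common {
        syncer {
            # ~30% of a 1 Gbps link's ~120 MB/sec ceiling
            rate 40M;
        }
    }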
What does 'drbdadm dump' show? That will give a better idea of the actual setup. I'm interested specifically in the start parameters (ie: become-primary-on, etc).
# /etc/drbd.conf
common {
    syncer {
        rate 50M;
This contradicts your configuration; I wonder if DRBD dropped it for you.
}
}
# resource main on svfis-sg1.netsys.stsv.seagate.com: not ignored, not stacked
resource main {
    protocol C;
    on sg1.domain.com {
        device       /dev/drbd0 minor 0;
        disk         /dev/vg_sg1data/lvdata;
        address      ipv4 10.x.x.70:7788;
        meta-disk    internal;
    }
    on sg2.domain.com {
        device       /dev/drbd0 minor 0;
        disk         /dev/vg_sg2data/lvcopy;
        address      ipv4 10.x.x.71:7788;
        meta-disk    internal;
    }
    disk {
        on-io-error  detach;
    }
    syncer {
        verify-alg   crc32c;
    }
    startup {
        wfc-timeout       0;
        degr-wfc-timeout 120;
    }
}
That's it? Nothing more? I would expect a lot more, normally. In any case, given it's a totally manual config, I suppose that is enough.
On 10/12/2014 1:04 PM, Digimer wrote:
If it's only 1 Gbps, then the maximum sustainable write speed is ~120 MB/sec (at most; the slower of network or disk determines the max speed). You want the sync rate to be ~30% of maximum speed, or else you will choke out the apps using the DRBD resource and cause them to suffer a huge performance hit. App performance is (max write - sync rate) when a background sync is performed. This is why the default speed is very low.
In your case, you've set the sync rate way above what is possible, and I've seen this dramatically hurt performance and even full-on stall the sync operation. I would set this to '40M' at most.
verify apparently uses this same speed to limit how fast it runs... since it's doing checksums on each side, it's not actually transferring any data across the network, which is why I boosted it to 200M (that left the node that initiated the verify with one process at 90% CPU, which I figure is as fast as it can go)... as I said, I've set it back to 50M now that the verify has completed.
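fwiw, I gather 8.3 can also change that rate on the fly without editing the config; something like this, going by my reading of the docs (untested here):

    # temporarily raise the resync/verify rate for minor 0
    drbdsetup /dev/drbd0 syncer -r 200M
    # revert to the rate configured in drbd.conf afterwards
    drbdadm adjust main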
btw, originally I had the verify-alg as sha1... my Xeon X5650 2.7GHz could only verify about 23MB/sec. Setting it to crc32c sped that up to 180-200MB/sec, which is more reasonable considering this is a 16TB volume built from two 8x3TB-drive RAID6s striped together, behind an LSI 9261-8i MegaRAID SAS2 card with 512MB BBU cache.