[CentOS] DRBD very slow....
Coert Waagmeester
lgroups at waagmeester.co.za
Sun Jul 26 18:08:05 UTC 2009
On Fri, 2009-07-24 at 09:27 -0400, Ross Walker wrote:
> On Jul 24, 2009, at 3:28 AM, Coert Waagmeester <lgroups at waagmeester.co.za> wrote:
>
> >
> > On Fri, 2009-07-24 at 10:21 +0400, Roman Savelyev wrote:
> >> 1. You are hit by the Nagle algorithm (slow TCP response). You can
> >> build DRBD 8.3. In 8.3, "TCP_NODELAY" and "QUICK_RESPONSE" are
> >> implemented.
> >> 2. You are hit by the DRBD protocol. In most cases, "B" is enough.
> >> 3. You are hit by triple barriers. In most cases you need only one
> >> of "barrier, flush, drain". See the documentation; it depends on the
> >> type of storage hardware.
> >>
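
For readers unfamiliar with point 1 above: the Nagle algorithm delays small TCP writes so they can be coalesced, and the TCP_NODELAY socket option turns that delay off. The sketch below only illustrates the option on an ordinary user-space socket. It is an assumption-laden illustration, not how DRBD is configured: DRBD sets its socket options inside the kernel, and the socket here is never connected to anything.

```python
import socket

# Assumption: a plain, unconnected TCP socket stands in for a replication
# link. DRBD itself sets this option in kernel code, not through Python.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Disable the Nagle algorithm: small writes are sent immediately instead
# of being held back while earlier data is still unacknowledged.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Read the option back to confirm the kernel accepted it.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print("TCP_NODELAY enabled:", bool(nodelay))

sock.close()
```
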
> >
> > I have googled the triple barriers thing but can't find much
> > information.
> >
> > Would it help if I used IPv6 instead of IPv4?
>
> Triple barriers wouldn't affect you as this is on top of LVM and LVM
> doesn't support barriers, so it acts like a filter for them. Not good,
> but that's the state of things.
>
> I would have run the dd tests locally and not with netcat, the idea is
> to take the network out of the picture.
>
I have run the dd again locally.
It writes to an LVM volume on top of Software RAID 1 mounted in dom0:
# dd if=/dev/zero of=/mnt/data/1gig.file oflag=direct bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 24.3603 seconds, 43.0 MB/s
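
As a sanity check on the figure above, here is a minimal sketch that recomputes the throughput from the dd summary line. The helper name parse_dd_summary is hypothetical, and the regular expression assumes the GNU dd summary format shown in this message. Note that dd reports decimal megabytes (10^6 bytes), while bs=1M means 1048576 bytes per block.

```python
import re

def parse_dd_summary(line):
    """Return (bytes, seconds, MB/s) from a GNU dd summary line.

    Hypothetical helper: assumes the format
    '<n> bytes (...) copied, <t> seconds, <rate>'.
    """
    m = re.search(r"(\d+) bytes .* copied, ([\d.]+) (?:seconds|s), ", line)
    if m is None:
        raise ValueError("not a dd summary line")
    nbytes = int(m.group(1))
    seconds = float(m.group(2))
    # dd uses decimal megabytes: 1 MB = 1,000,000 bytes.
    return nbytes, seconds, nbytes / seconds / 1e6

summary = "1048576000 bytes (1.0 GB) copied, 24.3603 seconds, 43.0 MB/s"
nbytes, seconds, mb_per_s = parse_dd_summary(summary)

# bs=1M count=1000 should account for every byte written.
expected_bytes = 1024 * 1024 * 1000

print(f"{nbytes} bytes in {seconds} s = {mb_per_s:.1f} MB/s")
```

The recomputed rate agrees with the 43.0 MB/s that dd printed, and the byte count matches 1000 blocks of 1 MiB.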
> Given the tests though it looks like the disks have their write caches
> disabled which cripples them, but with LVM filtering barriers, it's
> the safest configuration.
>
> The way to get fast and safe is to use partitions instead of logical
> volumes. If you need more than 4 then use GPT partition table which
> allows up to 256 I believe. Then you can enable the disk caches as
> drbd will issue barrier writes to assure consistency (hmmm maybe the
> barrier problem is with devmapper which means software RAID will be a
> problem too? Need to check that).
I am reading up on GPT, and that seems like a viable option.
Will keep you posted.
Most Google results point to software RAID 1 supporting barriers. Not
too sure though.
>
> Or
>
> Invest in a HW RAID card with NVRAM cache that will negate the need
> for barrier writes from the OS as the controller will issue them async
> from cache allowing I/O to continue flowing. This really is the safest
> method.
This is not going to be easy.... The servers we use are 1U rackmount,
and the single available PCI-express port is used up on both servers by
a quad gigabit network card.
>
> -Ross
Thanks for all the valuable tips so far, I will keep you posted.