[CentOS] DRBD very slow....

Sun Jul 26 18:08:05 UTC 2009
Coert Waagmeester <lgroups at waagmeester.co.za>

On Fri, 2009-07-24 at 09:27 -0400, Ross Walker wrote:
> On Jul 24, 2009, at 3:28 AM, Coert Waagmeester <lgroups at waagmeester.co.za> wrote:
> 
> >
> > On Fri, 2009-07-24 at 10:21 +0400, Roman Savelyev wrote:
> >> 1. You are hit by the Nagle algorithm (slow TCP response). You can build
> >> DRBD 8.3. In 8.3 "TCP_NODELAY" and "QUICK_RESPONSE" are implemented in place.
> >> 2. You are hit by the DRBD protocol. In most cases, "B" is enough.
> >> 3. You are hit by triple barriers. In most cases you need only one of
> >> "barrier, flush, drain" - see the documentation; it depends on the type
> >> of storage hardware.
> >>
> >
> > I have googled the triple barriers thing but can't find much
> > information.
> >
> > Would it help if I used IPv6 instead of IPv4?
> 
> Triple barriers wouldn't affect you as this is on top of LVM and LVM  
> doesn't support barriers, so it acts like a filter for them. Not good,  
> but that's the state of things.
> 
> I would have run the dd tests locally and not with netcat; the idea is  
> to take the network out of the picture.
> 
I have run the dd again locally.

It writes to an LVM volume on top of Software RAID 1 mounted in dom0:
# dd if=/dev/zero of=/mnt/data/1gig.file oflag=direct bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 24.3603 seconds, 43.0 MB/s
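
As a sanity check on the figure dd reports, the arithmetic can be redone by hand. This is only a small sketch; the function name throughput_mbs is made up for illustration, and it assumes dd's convention that 1 MB = 1,000,000 bytes:

```shell
#!/bin/sh
# throughput_mbs BYTES SECONDS
# Hypothetical helper (not part of dd): prints the throughput in MB/s,
# using decimal megabytes (1 MB = 1,000,000 bytes) as dd does.
throughput_mbs() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# Values taken from the dd run above.
throughput_mbs 1048576000 24.3603
```

With the byte count and elapsed time from the run above, this gives the same 43.0 MB/s that dd printed.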

> Given the tests though it looks like the disks have their write caches  
> disabled which cripples them, but with LVM filtering barriers, it's  
> the safest configuration.
> 
> The way to get fast and safe is to use partitions instead of logical  
> volumes. If you need more than 4 then use a GPT partition table, which  
> allows up to 256 I believe. Then you can enable the disk caches as  
> drbd will issue barrier writes to assure consistency (hmmm maybe the  
> barrier problem is with devmapper which means software RAID will be a  
> problem too? Need to check that).

I am reading up on GPT, and that seems like a viable option.
Will keep you posted.

Most Google results point to software RAID 1 supporting barriers. Not too
sure though.
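
To confirm whether the disks really have their write caches disabled, as suggested above, the output of `hdparm -W` can be checked. A rough sketch follows; the helper name write_cache_state is made up, /dev/sda is only an assumed device name, and it assumes hdparm prints a line of the form " write-caching =  0 (off)":

```shell
#!/bin/sh
# write_cache_state
# Hypothetical helper: reads `hdparm -W <device>` output on stdin and
# prints "on" or "off", taken from the parenthesised word at the end of
# the write-caching line.
write_cache_state() {
    awk '/write-caching/ { gsub(/[()]/, "", $NF); print $NF }'
}

# Example usage (needs root; /dev/sda is an assumed device name):
#   hdparm -W /dev/sda | write_cache_state
```

This only reports the current setting; it does not change it, and it says nothing about whether barriers actually reach the disk.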
> 
> Or
> 
> Invest in a HW RAID card with NVRAM cache that will negate the need  
> for barrier writes from the OS as the controller will issue them async  
> from cache allowing I/O to continue flowing. This really is the safest  
> method.
This is not going to be easy.... The servers we use are 1U rackmount,
and the single available PCI-express port is used up on both servers by
a quad gigabit network card.
> 
> -Ross


Thanks for all the valuable tips so far, I will keep you posted.