On Fri, 2009-07-24 at 09:27 -0400, Ross Walker wrote:
> On Jul 24, 2009, at 3:28 AM, Coert Waagmeester
> <lgroups at waagmeester.co.za> wrote:
>
> > On Fri, 2009-07-24 at 10:21 +0400, Roman Savelyev wrote:
> >> 1. You are hit by the Nagle algorithm (slow TCP response). You can
> >> build DRBD 8.3. In 8.3 "TCP_NODELAY" and "QUICK_RESPONSE" are
> >> implemented.
> >> 2. You are hit by the DRBD protocol. In most cases, "B" is enough.
> >> 3. You are hit by triple barriers. In most cases you need only one
> >> of "barrier, flush, drain" - see the documentation, it depends on
> >> the type of storage hardware.
> >>
> >
> > I have googled the triple barriers thing but can't find that much
> > information.
> >
> > Would it help if I used IPv6 instead of IPv4?
>
> Triple barriers wouldn't affect you as this is on top of LVM and LVM
> doesn't support barriers, so it acts like a filter for them. Not good,
> but that's the state of things.
>
> I would have run the dd tests locally and not with netcat, the idea is
> to take the network out of the picture.
>

I have run the dd again locally.
It writes to an LVM volume on top of software RAID 1 mounted in dom0:

# dd if=/dev/zero of=/mnt/data/1gig.file oflag=direct bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 24.3603 seconds, 43.0 MB/s

> Given the tests though it looks like the disks have their write caches
> disabled which cripples them, but with LVM filtering barriers, it's
> the safest configuration.
>
> The way to get fast and safe is to use partitions instead of logical
> volumes. If you need more than 4 then use a GPT partition table which
> allows up to 256 I believe. Then you can enable the disk caches as
> drbd will issue barrier writes to assure consistency (hmmm maybe the
> barrier problem is with devmapper which means software RAID will be a
> problem too? Need to check that).

I am reading up on GPT, and that seems like a viable option.
Will keep you posted.
Most Google results suggest that software RAID 1 supports barriers.
Not too sure though.

>
> Or
>
> Invest in a HW RAID card with NVRAM cache that will negate the need
> for barrier writes from the OS as the controller will issue them async
> from cache allowing I/O to continue flowing. This really is the safest
> method.

This is not going to be easy....
The servers we use are 1U rackmount, and the single available
PCI-express port is used up on both servers by a quad gigabit network
card.

>
> -Ross

Thanks for all the valuable tips so far, I will keep you posted.
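
For anyone repeating the local test, here is a small sketch that recomputes the throughput figure from dd's byte count and elapsed time, as a sanity check on the 43.0 MB/s reported above. The function name `throughput_mb_s` is hypothetical (it is not part of dd or DRBD), and it assumes dd reports decimal megabytes (10^6 bytes).

```shell
#!/bin/sh
# Hypothetical helper (not part of dd or DRBD): recompute throughput in MB/s
# from a byte count and elapsed seconds. Assumes decimal megabytes (10^6 bytes).
throughput_mb_s() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# Figures taken from the local dd run quoted in this thread.
throughput_mb_s 1048576000 24.3603
```

Feeding in the numbers from later runs (for example after switching from logical volumes to partitions) makes the before/after results directly comparable.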