Hi,

On Fri, Apr 15, 2005 at 10:44:32PM -0700, Bryan J. Smith wrote:
>
> It depends. "Raw" ATA sucks for multiple operations because it has no I/O queuing.
> AHCI is trying to address that, but it's still unintelligent.
> 3Ware queues reads/writes very well, and sequences them as best as it can.
> But it's still not perfect.

Yeah, I knew that and left it out quite on purpose. The smarter command queueing is the reason the A1000 beats these much faster sequential-I/O beasts hands down. There is a place for those A1000 boxen on my network too, but not as my NFS server, which mostly handles files of 1MB and up.

> > Even for /dev/mdX.
>
> Now with MD, you're starting to taxi your interconnect on writes.
> E.g., with Microcontroller or ASIC RAID, you only push the data you write.
> With software (including "FRAID"), you push 2x for RAID-1.
> That's 2x through your memory, over the system interconnect into the I/O and out the PCI bus.
>
> When you talk RAID-3/4/5 writes, you slaughter the interconnect.
> The bottleneck isn't the CPU. It's the fact that for each stripe, you've gotta load
> from memory through the CPU back to memory - all over the system interconnect,
> before even looking at I/O. For 4+ modern ATA disks, you're talking a roundtrip
> that costs you an aggregate percentage of your system interconnect time beyond 30%+.
>
> On a dynamic web or other CPU computational intensive server, it matters little.
> The XOR operations actually use very little CPU power.
> And the web or computational streams aren't saturating the interconnect.
> But when you are doing file server I/O, and the system interconnect is used for
> raw bursts of network I/O as much as storage, it kills.

That is true too. I don't mind taxing my PCI. I have dual Opterons doing the crunching and dual PCI-X too. The machine is constantly over a load average of 4+ because it's running several Hercules emulator instances at +15 niceness, yet it still sustains some 50+MB/s over Gbit LAN in and out.

I know very well about taxing the PCI bus. I have hardware here with a 2GHz+ Athlon64 and dual-channel DDR memory but only a puny little 32-bit/33MHz PCI bus, which gets you nowhere. I did actually try an Athlon64 2800+ and a RocketRAID 1820A w/ 8x200GB SATA -> some 50MB/s, with the PCI bus _saturated_.

You have a very valid point there, though. I/O saturation counts very much when you build servers, and most people don't even realize it. I'm also sure that you 'know your shit much better than I do'. The point is just that 3ware is _SLOW_ compared to almost anything these days. I do have two 9500S-8 here too.

> > Puny oldish A1000 can beat those with almost a factor of ten for random I/O,
> > but being limited to max. 40MB/s transfers by its interface (UW/HVD).
>
> Or more like the i960 because, after all, RAID should stripe some operations
> across multiple channels.

The A1000 is actually powered by a P100. I don't remember seeing an i960 in it, but there definitely is some ASIC on board. It's just so much faster for any random I/O operation than any IDE/SATA setup I've tested so far.

> It has nothing to do with CPU cycles but interconnect. XOR puts no
> strain on modern CPUs, it's the added data streams being fed from
> memory to CPU. Furthermore, using async I/O, MD can actually be
> _faster_ than hardware RAID. Volume management in an OS will
> typically do much better than a hardware RAID card when it comes to
> block writes.

Actually it does matter for CPU cycles too. Initializing the array at about 60MB/s (i.e. the MD driver doing the parity calculation at a rate of hundreds of MB/s) pretty much eats one 1.4GHz Opteron completely. It's also true that HT is what makes it all fly: a PCI-X-capable P4/Xeon at 2.6GHz just can't get anywhere near the speeds of the dual-Opteron. And it's true that the kernel itself knows best what the queueing policy is and how the data should be treated.
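Just to put what we're both talking about in concrete terms: a RAID-5 parity pass is conceptually nothing more than the loop below. This is only a simplified sketch in plain C, not the actual md/raid5 code, and the function name and parameters are made up here - but it shows why the work is less about the XOR itself and more about dragging every byte of every block from memory through the CPU and back, and why at hundreds of MB/s that still burns a visible chunk of CPU.

#include <stddef.h>
#include <stdint.h>

/*
 * Simplified sketch of a RAID-5 parity pass: XOR 'ndisks' data blocks
 * into one parity block.  The arithmetic per byte is a single XOR, but
 * every byte of every block still travels from memory through the CPU
 * and back, so the loop is bound by memory/interconnect bandwidth as
 * much as by clock speed.
 */
void xor_parity(uint8_t *parity, uint8_t *const data[],
                size_t ndisks, size_t blocksize)
{
        size_t i, d;

        for (i = 0; i < blocksize; i++) {
                uint8_t p = 0;

                for (d = 0; d < ndisks; d++)
                        p ^= data[d][i];   /* read traffic: ndisks bytes per output byte */
                parity[i] = p;             /* write traffic: 1 byte */
        }
}

The real MD code uses tuned, unrolled XOR routines instead of a naive byte loop, but the bandwidth picture stays the same: for every stripe, all the data blocks have to come from memory through the CPU and the parity has to go back, before a single byte heads out over the PCI bus.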
> Of course the 9500S is still maturing. Which is why I still prefer to
> use 4 and 8 channel 7506/8506 cards with RAID-0+1. Even the AccelATA
> and 5000 left much to be desired before the 6000 and later 7000/8000
> series.

Once again: maturing won't make its parity engine go over 100MB/s. That area is pretty much a dead end AFAIK. Then again, 100MB/s might be enough for someone, but from my testing/flying by feel, one needs roughly 2x the I/O bandwidth locally to serve 1x over NFS - or even get near it. The same seems to hold for iSCSI, too.

In conclusion, I was only trying to make the point that a software solution might be pretty good for someone (for me it is, at least for now). The 3ware was good for me on a dual-PIII, which can't get anywhere near the 3ware's speed with a software solution. With the dual-Opteron the situation is quite different: the 3ware still saturates at its own limits, while the software RAID goes much faster on the more capable box.

--
Pasi Pirhonen - upi at iki.fi - http://iki.fi/upi/