Hi,
On Fri, Apr 15, 2005 at 10:44:32PM -0700, Bryan J. Smith wrote:
It depends. "Raw" ATA sucks for multiple operations because it has no I/O queuing. AHCI is trying to address that, but it's still unintelligent. 3Ware queues reads/writes very well, and sequences them as best as it can. But it's still not perfect.
Yeah, I knew that; I left it out quite on purpose. The smarter command queueing is the reason the A1000 beats these much faster sequential-I/O beasts hands down. There is a place for those A1000 boxen on my network too, but not as my NFS server, which mostly handles files of 1MB+.
Even for /dev/mdX.
Now with MD, you're starting to tax your interconnect on writes. E.g., with microcontroller or ASIC RAID, you only push the data you write. With software (including "FRAID"), you push 2x for RAID-1. That's 2x through your memory, over the system interconnect into the I/O and out the PCI bus.
When you talk RAID-3/4/5 writes, you slaughter the interconnect. The bottleneck isn't the CPU. It's the fact that for each stripe, you've gotta load from memory through the CPU back to memory - all over the system interconnect, before even looking at I/O. For 4+ modern ATA disks, you're talking a roundtrip that costs you 30%+ of your aggregate system interconnect time.
On a dynamic web or other computationally intensive server, it matters little. The XOR operations actually use very little CPU power. And the web or computational streams aren't saturating the interconnect. But when you are doing file server I/O, and the system interconnect is used for raw bursts of network I/O as much as storage, it kills.
That is true too. I don't mind taxing my PCI. I do have dual Opterons doing the crunching, and dual PCI-X too. The machine is constantly at a load average of 4+ because it's running several Hercules emulator instances at +15 niceness. That doesn't change the fact that it can still sustain some 50+MB/s over Gbit LAN, in and out.
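Just to spell out the RAID-5 small-write penalty Bryan describes: rewriting one data block forces a read-modify-write of the parity, so four transfers cross the interconnect instead of one. A toy sketch (made-up block sizes and contents, nothing like the md driver's real code):

```python
def update_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Small-write path: new_parity = old_parity XOR old_data XOR new_data."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

# 2 data disks + 1 parity disk, 8-byte "blocks" for illustration.
d0 = bytes([1, 2, 3, 4, 5, 6, 7, 8])
d1 = bytes([9, 8, 7, 6, 5, 4, 3, 2])
parity = bytes(a ^ b for a, b in zip(d0, d1))

# To rewrite d0 alone: READ old d0, READ old parity, then
# WRITE new d0, WRITE new parity -- four bus crossings, not one.
d0_new = bytes([0xAA] * 8)
parity = update_parity(d0, d0_new, parity)

# The stripe still XORs out, i.e. parity now covers the new data.
assert bytes(a ^ b for a, b in zip(d0_new, d1)) == parity
```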
I know very well about taxing the PCI bus. I have all this hardware with some 2GHz+ Athlon64s and dual-channel DDR memory, but only a puny little 32bit/33MHz PCI bus, which gets you nowhere. I actually did try an Athlon64 2800+ with a RocketRAID 1820A w/ 8x200GB SATA -> some 50MB/s, where the PCI was _saturated_.
You have a very valid point there, tho. I/O saturation counts very much when you build servers. Most people don't actually realize this at all. I am sure too that you 'know your shit much better than i do'.
The point just being that 3ware is _SLOW_ compared to almost anything these days. I do have two 9500S-8 here too. The puny, oldish A1000 can beat those by almost a factor of ten for random I/O, while being limited to max. 40MB/s transfers by its interface (UW/HVD).
Or more like the i960 because, after all, RAID should stripe some operations across multiple channels.
The A1000 is actually powered by a P100. I don't remember seeing an i960 in it, but there is definitely some ASIC on board.
It's just so much faster for any random I/O operation than any IDE/SATA setup I've been testing so far.
It has nothing to do with CPU cycles but interconnect. XOR puts no strain on modern CPUs; it's the added data streams being fed from memory to CPU. Furthermore, using async I/O, MD can actually be _faster_ than hardware RAID. Volume management in an OS will typically do much better than a hardware RAID card when it comes to block writes.
Actually it does matter for CPU cycles too. Initialization at an array speed of 60MB/s (i.e. the MD driver doing the parity calculation at hundreds of MB/s) pretty much eats one 1.4GHz Opteron completely.
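The XOR itself is trivial per byte; what eats the box is that building each parity block means pulling every data block of the stripe through memory and the CPU. A rough sketch of what an init/resync pass has to do per stripe (toy code, not the md driver):

```python
def stripe_parity(data_blocks):
    """XOR all data blocks of one stripe together to produce its parity block."""
    parity = bytearray(len(data_blocks[0]))
    for blk in data_blocks:
        for i, b in enumerate(blk):
            parity[i] ^= b
    return bytes(parity)

# For an N+1 array, every parity block written means N data blocks read
# over the interconnect first -- so 60MB/s of parity written out implies
# several hundred MB/s of traffic through memory and CPU.
blocks = [bytes([d] * 4) for d in (0x0F, 0xF0, 0x33)]
assert stripe_parity(blocks) == bytes([0x0F ^ 0xF0 ^ 0x33] * 4)
```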
It's also true that HyperTransport makes it all fly. Some PCI-X-enabled P4/Xeon @ 2.6GHz just can't get anywhere near the speeds of the dual Opteron.
It's also true that the kernel itself knows best what the queueing policy is and how the data should be treated.
Of course the 9500S is still maturing. Which is why I still prefer to use 4 and 8 channel 7506/8506 cards with RAID-0+1. Even the AccelATA and 5000 left much to be desired before the 6000 and latter 7000/8000 series.
Once again: maturing won't make its parity engine go over 100MB/s. It's quite a dead end AFAIK in that area. Then again, 100MB/s might be enough for someone, but from my testing/flying by feel, one needs approx. 2x the I/O bandwidth locally to serve 1x over NFS - or even near it. The same seems to hold for iSCSI too.
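Back-of-envelope with that rule of thumb (the 2x factor is my own gut figure from testing, not a measured constant):

```python
def servable_nfs_rate(local_io_mb_s, overhead_factor=2.0):
    """Rule of thumb: ~2x local I/O bandwidth needed per 1x served over NFS."""
    return local_io_mb_s / overhead_factor

# A parity engine topping out around 100MB/s locally leaves roughly
# 50MB/s servable over NFS by this estimate.
assert servable_nfs_rate(100) == 50.0
```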
As a conclusion, I was only trying to make a point that a software solution might be pretty good for someone (for me it is, at least for now). The 3ware was good for me on a dual PIII, which can't get anywhere near its speeds with a software solution. With the dual Opteron the situation is quite different: the 3ware still saturates at its limits, but the software goes much faster on the capable box.