On 9/7/2014 8:09 PM, Digimer wrote:
I'm not so familiar with software RAID, but I would be surprised if there isn't a way to force write-through caching. If this is possible, then Valeri's concern can be addressed (at the cost of performance).
Software RAID on an enterprise-grade JBOD *is* effectively write-through. The OS only caches writes until an fsync/fdatasync/etc., then flushes them to the md device, which immediately flushes them to the physical media. Where it goes sideways is with cheap consumer-grade desktop drives, which often lie about write completion to improve Windows performance... but those would be a problem with or without mdraid; indeed, they'd be a problem with hardware RAID, too.
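For what it's worth, here's roughly what I mean by "only caches until an fsync" -- a minimal Python sketch (the path and data are made up for illustration); fsync() is the call that pushes the write out of the page cache and down to the md device:

#!/usr/bin/env python3
# Minimal sketch: force a write through the page cache to stable storage.
# The file path below is hypothetical -- assume it sits on an md array.
import os

path = "/mnt/md0/testfile"

fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
try:
    os.write(fd, b"important transaction record\n")
    # Up to this point the data may exist only in the OS page cache.
    os.fsync(fd)    # flush file data and metadata down to the md device,
                    # which hands it straight to the member disks
    # os.fdatasync(fd) is similar but skips some metadata updates
finally:
    os.close(fd)

A drive that acks the flush before the data is actually on the platters defeats this, which is exactly the consumer-desktop-drive problem above.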
This is why I really like ZFS (on Solaris and BSD, at least): it timestamps and checksums every block it writes to disk. With a conventional RAID1, if the two copies don't match, you don't know which one is the 'right' one. The ZFS scrub process verifies those timestamps and checksums and rewrites the 'wrong' block.
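To illustrate the idea (this is a toy sketch, not ZFS code or its on-disk format): once you have a checksum stored independently of the data blocks, a scrub can tell which mirror copy is correct instead of guessing.

# Toy illustration only -- not how ZFS actually stores things.
import zlib

def scrub_mirror(copy_a: bytes, copy_b: bytes, stored_crc: int):
    """Return the good block and what a scrub would do about the other copy."""
    a_ok = zlib.crc32(copy_a) == stored_crc
    b_ok = zlib.crc32(copy_b) == stored_crc
    if a_ok and b_ok:
        return copy_a, "both copies good"
    if a_ok:
        return copy_a, "copy B is wrong; rewrite it from A"
    if b_ok:
        return copy_b, "copy A is wrong; rewrite it from B"
    return None, "both copies bad -- unrecoverable at this redundancy level"

# A plain RAID1 keeps no such checksum, so when copy_a != copy_b
# all it can do is pick one of them arbitrarily.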
I did a fair bit of informal(*) benchmarking of some storage systems at work before they were deployed. Using a hardware RAID card such as an LSI MegaRAID 9260 with 2GB BBU cache (or an HP P410i or similar) is most certainly faster at transactional-database-style random read/write testing than using a simple SAS2 JBOD controller. But mdraid, with the MegaRAID configured as just a bunch of disks, gave the same results as long as writeback caching was enabled on the controller. At different times, using different-but-similar SAS2 RAID cards, I benchmarked 10-20 disk arrays at various RAID levels (10, 5, 6, 50, and 60), built with 7200RPM SAS2 'nearline server' drives, 7200RPM SATA desktop drives, and 15000RPM SAS2 enterprise server drives. For an OLTP-style database server under high concurrency and high transaction/second rates, RAID10 with lots of 15k disks is definitely the way to go. For bulk file storage that's write-once and read-mostly, RAID 5, 6, and 60 perform adequately.
(*) My methodology was ad hoc rather than rigorous; I was primarily observing trends, so I can't publish any hard data to back these conclusions. My tests included PostgreSQL with pgbench, plus various bonnie++ and iozone runs. Most of them were on Xeon X5600-class servers with 8-12 cores and 24-48GB of RAM.