I can't seem to get the read and write performance better than approximately 40MB/s on an ext2 file system. IMO, this is horrible performance for a 6-drive, hardware RAID 5 array. Please have a look at what I'm doing and let me know if anybody has any suggestions on how to improve the performance...
System specs:
-----------------
2 x 2.8GHz Xeons
6GB RAM
1 3ware 9500S-12
2 x 6-drive, RAID 5 arrays with a stripe size of 256KB. Each array is 2.3TB after formatting.
ioscheduler set to use the deadline scheduler.
mkfs.ext2 options used:
------------------------
mkfs.ext2 -b 4096 -L /d01 -m 1 -O sparse_super,dir_index -R stride=64 -T largefile /dev/sda1
I'm using a stride size of 64 since the ext2 block size is 4KB and the array stripe size is 256KB (256/4 = 64).
Output of using bonnie++:
---------------------------
$ /usr/local/bonnie++/sbin/bonnie++ -d /d01/test -r 6144 -m anchor_ext2_4k_64s
Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
anchor_ext2_4k_ 12G 41654  96 41937  11 30537   8 40676  88 233015  27 426.6   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3369  83 +++++ +++ +++++ +++  4142 100 +++++ +++ 11728 100
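For a quick cross-check outside bonnie++, a raw sequential dd run should tell roughly the same story. This is only a sketch -- the file name and size are just for illustration, sized at 12GB (twice the 6GB of RAM) so page-cache effects don't flatter the numbers:

$ time sh -c "dd if=/dev/zero of=/d01/test/ddfile bs=1M count=12288 && sync"
$ time dd if=/d01/test/ddfile of=/dev/null bs=1M
$ rm /d01/test/ddfile

Divide the 12288MB by the elapsed seconds of each run to get MB/s for writes and reads respectively.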
Cheers! Sean
On Mon, 31 Oct 2005 at 4:10pm, Sean Staats wrote
I can't seem to get the read and write performance better than approximately 40MB/s on an ext2 file system. IMO, this is horrible performance for a 6-drive, hardware RAID 5 array. Please have a look at what I'm doing and let me know if anybody has any suggestions on how to improve the performance...
System specs:
2 x 2.8GHz Xeons
6GB RAM
1 3ware 9500S-12
2 x 6-drive, RAID 5 arrays with a stripe size of 256KB. Each array is 2.3TB after formatting.
ioscheduler set to use the deadline scheduler.
Make sure you are using the absolute latest firmware for the 3ware board (2.08.00.005). There is a bad cache allocation bug in previous firmwares that kills performance with more than one unit (as you have).
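A quick way to confirm what a 9500S is actually running is 3ware's tw_cli utility. The syntax below is from memory and can vary a bit between tw_cli releases, so treat it as a sketch rather than gospel:

tw_cli info c0              # controller summary: units, ports, status
tw_cli /c0 show firmware    # reports the running firmware version

Anything older than 2.08.00.005 with two units defined is worth flashing.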
Sean Staats sstaats@questia.com wrote:
I can't seem to get the read and write performance better than approximately 40MB/s on an ext2 file system. IMO, this is horrible performance for a 6-drive, hardware RAID 5 array. 2 x 2.8GHz Xeons 6GB RAM 1 3ware 9500S-12
You should be getting much, much higher than 40MBps reads on any 3Ware controller. As far as writes go, there were some bugs in the early 9.2 firmware that have since been cleared up. You shouldn't be seeing anything that slow on a 9500S.
It's not the filesystem. It's probably the card/array configuration.
2 x 6-drive, RAID 5 arrays
Again, the early 9.2 firmware had a performance bug with multiple volumes. What is your firmware?
with a stripe size of 256KB.
Any reason you went with a 256KiB stripe size? I typically stick with the 32KiB default.
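If the array were ever rebuilt with that 32KiB default stripe, the mkfs stride would simply be recomputed the same way: 32KB stripe / 4KB block = 8. Purely as an illustration, the original mkfs line would then become:

mkfs.ext2 -b 4096 -L /d01 -m 1 -O sparse_super,dir_index -R stride=8 -T largefile /dev/sda1

Only the stride value changes; everything else stays as Sean had it.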
blockdev --setra 16384 /dev/sda
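For reference, blockdev can also report the current value. The setting is in 512-byte sectors and does not survive a reboot, so something like rc.local has to reapply it. A minimal sketch:

blockdev --getra /dev/sda          # show current read-ahead, in 512-byte sectors
blockdev --setra 16384 /dev/sda    # 16384 sectors x 512 bytes = 8MiB read-ahead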
Sean Staats wrote:
I can't seem to get the read and write performance better than approximately 40MB/s on an ext2 file system. IMO, this is horrible performance for a 6-drive, hardware RAID 5 array. Please have a look at what I'm doing and let me know if anybody has any suggestions on how to improve the performance...
System specs:
2 x 2.8GHz Xeons
6GB RAM
1 3ware 9500S-12
2 x 6-drive, RAID 5 arrays with a stripe size of 256KB. Each array is 2.3TB after formatting.
ioscheduler set to use the deadline scheduler.
mkfs.ext2 options used:
mkfs.ext2 -b 4096 -L /d01 -m 1 -O sparse_super,dir_index -R stride=64 -T largefile /dev/sda1
I'm using a stride size of 64 since the ext2 block size is 4KB and the array stripe size is 256KB (256/4 = 64).
Output of using bonnie++:
$ /usr/local/bonnie++/sbin/bonnie++ -d /d01/test -r 6144 -m anchor_ext2_4k_64s
Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
anchor_ext2_4k_ 12G 41654  96 41937  11 30537   8 40676  88 233015  27 426.6   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3369  83 +++++ +++ +++++ +++  4142 100 +++++ +++ 11728 100
Cheers! Sean
Quoting Sean Staats sstaats@questia.com:
I can't seem to get the read and write performance better than approximately 40MB/s on an ext2 file system. IMO, this is horrible performance for a 6-drive, hardware RAID 5 array. Please have a look at what I'm doing and let me know if anybody has any suggestions on how to improve the performance... Output of using bonnie++:
[snip]
$ /usr/local/bonnie++/sbin/bonnie++ -d /d01/test -r 6144 -m anchor_ext2_4k_64s
Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
anchor_ext2_4k_ 12G 41654  96 41937  11 30537   8 40676  88 233015  27 426.6   1
Correct me if I'm wrong, but you got 233MB/s for reads (the block read test). Assuming your disks can do 50MB/s sustained transfer rate each, you are pretty darn close to the theoretical maximum of (6 - 1) * 50MB/s = 250MB/s for a 6-disk RAID5.
RAID5 as such is a bad choice for file systems that will have more than about 30% of writes (out of total I/O). If most of the I/O will be writes, and you care about performance, you should use RAID-10. Remember, writes to a dumb, non-optimized RAID5 implementation are slower than writes to a single disk. This is generic RAID wisdom, nothing to do with any particular implementation. In the worst case scenario, a write operation on a 6-disk RAID5 volume involves reading a data block from 5 drives, calculating XOR, and writing back one block of data and one block of checksum. Whichever way you do it, it ain't gonna be fast.
For large sequential writes, RAID5 implementations can do a lot of optimizations (reducing the number of reads for each write operation). But they still need to generate and write that additional XOR checksum, so it is going to be slower than reading from that same volume.
Random writes to RAID5 volumes are always going to be terribly slow since the RAID5 implementation can't optimize them very well. If they are limited to small areas of data, a large battery-backed on-controller cache might help (since the blocks needed to re-calculate the XOR checksum might already be in the cache, and the actual writes can be delayed in the hope that there'll be enough data to write in the future to reduce the number of needed reads). If they are spread all over a 1TB volume, you are screwed; no (reasonable) amount of cache is going to save ya.
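As a rough back-of-the-envelope (the per-drive figure is invented purely for the example): with the usual read-modify-write accounting, a small RAID5 write costs about 4 disk I/Os (read old data, read old parity, write new data, write new parity). If each drive manages ~80 random I/Os per second, 6 drives give ~480 raw IOPS, but after the 4x penalty the array sustains only about 480 / 4 = 120 small random writes per second -- not much better than a single bare drive.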
Back to reads, you got 40MB/s for per-chr reads and 233MB/s for block reads. The difference between these two cases is not 3ware related, at least not in your case it seems. The per-chr test is reading one byte at a time, and it is influenced by three factors: how well the C library is optimized (and how good it is at buffering), the CPU speed and the disk speed. If you look at the CPU column, you'll see that your CPU was 88% busy during this test (probably most of the time spent in bonnie's loop that executes 12 billion getc() calls and in the C library itself). So no matter how fast your drives are, you'd max out in the per-chr read test at maybe 45-50MB/s with the CPU you have in the box. Setting a larger read-ahead (as Joe suggested) might help squeeze a couple of MB/s more in benchmark tests, but it's probably not really worth it in real world applications.
Aleksandar Milivojevic alex@milivojevic.org wrote:
Correct me if I'm wrong, but you got 233MB/s for reads (the block read test).
Oh, good catch! I didn't even see that when responding (I assumed he could interpret the bonnie benchmark). And if I see that correctly, that was with a 12GiB file (on a system that had 6GiB RAM).
Assuming your disks can do 50MB/s sustained transfer rate each, you are pretty darn close to the theoretical maximum of (6 - 1) * 50MB/s = 250MB/s for a 6-disk RAID5.
On reads, yes. 3Ware is clearly leveraging the ASIC's non-blocking I/O for reads from RAID-5, which basically act like RAID-0.
RAID5 as such is a bad choice for file systems that will have more than about 30% of writes (out of total I/O).
He still should be seeing at least 100MBps for RAID-5 writes on a 3Ware Escalade 9500S with 6-discs (180MBps is about the maximum for RAID-5 writes on the 9500S' ASIC with DRAM). The ASIC is fairly good at sequential writes to RAID-5, and there is enough DRAM to buffer all but the heaviest of random I/O.
Still, the new 9550SX series has a PowerPC. AMCC's influence is clearly being pressed on their 3Ware acquisition, as they are _the_ company for the IBM embedded PowerPC 400 line now. The 9550SX is supposed to be capable of 380MBps for RAID-5 writes -- double the 9500S's best benchmarks.
If most of the I/O will be writes, and you care about performance, you should use RAID-10.
Yep, mega-dittos on that point.
Remember, writes to a dumb, non-optimized RAID5 implementation are slower than writes to a single disk. This is generic RAID wisdom, nothing to do with any particular implementation. In the worst case scenario, a write operation on a 6-disk RAID5 volume involves reading a data block from 5 drives, calculating XOR, and writing back one block of data and one block of checksum. Whichever way you do it, it ain't gonna be fast.
Still, he shouldn't be seeing less than 100MBps writes on the 3Ware Escalade 9500S series with its on-board ASIC and DRAM buffer.
At least the reads are right where they should be for his configuration. I'm curious how he is striping though? It might have been better to do a 12-disc RAID-5 and get close to 400MBps reads.
Or if performance was more important than efficiency, making one 6-disc volume RAID-10 would give close to 300MBps reads, 150MBps writes -- maybe higher.
Quoting "Bryan J. Smith" thebs413@earthlink.net:
He still should be seeing at least 100MBps for RAID-5 writes on a 3Ware Escalade 9500S with 6-discs (180MBps is about the maximum for RAID-5 writes on the 9500S' ASIC with DRAM). The ASIC is fairly good at sequential writes to RAID-5, and there is enough DRAM to buffer all but the heaviest of random I/O.
What I found with an old(er) 3ware 7500-8 (does not use the same device driver as the 9xxx cards) in a RAID5 configuration was that it makes a big difference using ext2 or ext3 (doubles the write speed, no effect on read speed). With ext3 I used an internal journal (an external one might have helped, but I haven't tested it). Changing journaling options and/or journal size had almost no effect. Anyhow, journaling (using default options, internal journal) should not have that big an impact on write speed (not even close).
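For anyone wanting to repeat that comparison, the journaling knobs being referred to are roughly the ones below. This is illustrative only -- the data= mode generally has to be chosen at mount time, and the journal can only be removed/resized with the filesystem unmounted:

mount -t ext3 -o data=writeback /dev/sda1 /d01   # journal metadata only, instead of the default data=ordered
tune2fs -O ^has_journal /dev/sda1                # drop the existing journal (filesystem unmounted)
tune2fs -j -J size=128 /dev/sda1                 # re-create it as a 128MB internal journal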
The card was considerably faster with 2.4 kernel than with 2.6 kernel (tests run on same hardware, same configuration, ext3 file system). About 20% faster writes and 40% faster reads.
Aleksandar Milivojevic alex@milivojevic.org wrote:
What I found with an old(er) 3ware 7500-8 (does not use the same device driver as the 9xxx cards) in a RAID5 configuration was that it makes a big difference using ext2 or ext3 (doubles the write speed, no effect on read speed).
Of course.
3Ware pairs its 64-bit ASIC in the 7000+ series with 1-4MiB of 0 wait state SRAM (Static RAM). That gives you the utmost in non-blocking JBOD, RAID-0, 1 and 10 performance. But that won't cache much more than a few (standard) 32KiB blocks -- definitely not ideal for any multi-staged writes (such as journaling).
The 9500S adds 128+MiB of multi-wait state SDRAM (Synchronous DRAM) which can buffer a lot more.
The 9550SX actually now splits the design into the legacy, non-blocking ASIC+SRAM plus a new embedded PowerPC 400 series with its own 128+MiB of DDR2 SDRAM for the ultimate in a buffering controller card. When RAID-5 is used, or extensive buffering is needed, the 64-bit ASIC (which is also the bus arbitrator) switches the incoming stream into the SDRAM which is then serviced by the embedded PowerPC 400 series.
With ext3 I used an internal journal (an external one might have helped, but I haven't tested it). Changing journaling options and/or journal size had almost no effect. Anyhow, journaling (using default options, internal journal) should not have that big an impact on write speed (not even close).
Again, considering the fact that the 7000/8000 series have an extremely small -- only 1-4MiB -- "0 wait state cache" instead of a much larger amount of "multi-wait state SDRAM buffer," this is not surprising. The 3Ware 7000/8000 series wants to stream sequential writes -- especially when it comes to RAID-5. If it can't, it stalls.
The card was considerably faster with 2.4 kernel than with 2.6 kernel (tests run on same hardware, same configuration, ext3 file system). About 20% faster writes and 40% faster reads.
Hit 3Ware's site on optimizing the kernel 2.6 settings for the card.
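From memory, the 2.6 tunings 3Ware publishes boil down to a deeper block-layer request queue plus a large read-ahead; the exact values below are what I recall, so check their site for the current recommendations:

echo 512 > /sys/block/sda/queue/nr_requests   # deeper request queue for the 3w-9xxx driver
blockdev --setra 16384 /dev/sda               # large read-ahead, as suggested earlier in the thread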
And be sure to get the latest firmware for the 9500S -- that makes all the difference!
The 7000/8000 series firmware has been mature for years at 7.7.1 last time I checked.
If the new 9550SX is any suggestion, the 64-bit ASIC design is just not going to cut it at RAID-5 writes versus a full microcontroller.
As a result, I can't recommend the 9500S. The verdict is still out on the 9550SX. But there is much promise thanx to AMCC. They _know_ the embedded PowerPC 400 series inside and out.
Tom's Hardware Review just did a recent I/O queuing comparison -- not actual benchmarks or CPU-interconnect load comparisons. It was rather limited overall, although the embedded PowerPC-based 9550SX challenged the new X-Scale based Arecas to keep up (and the X-Scale based LSI 300-8X wasn't exactly as good).
On Tue, 1 Nov 2005 at 11:43am, Aleksandar Milivojevic wrote
What I found with an old(er) 3ware 7500-8 (does not use the same device driver as the 9xxx cards) in a RAID5 configuration was that it makes a big difference using ext2 or ext3 (doubles the write speed, no effect on read speed). With ext3 I used an internal journal (an external one might have helped, but I haven't tested it).
I tested with an external journal on a hardware (8506-2) RAID1 of 2 WD Raptor drives, and it made no difference. What *did* make a difference was using XFS (via the centosplus kernel), but I didn't trust that in production. I ended up going to software RAID, which got good local speeds. I'm discovering now, though, that the NFS performance sucks.
First of all, thanks to everybody for their responses on this thread.
On Tue, 2005-11-01 at 10:31, centos-bounces@centos.org wrote:
Aleksandar Milivojevic alex@milivojevic.org wrote:
Correct me if I'm wrong, but you got 233MB/s for reads (the block read test).
Oh, good catch! I didn't even see that when responding (I assumed he could interpret the bonnie benchmark). And if I see that correctly, that was with a 12GiB file (on a system that had 6GiB RAM).
I was fixated on the per char read rate and didn't pay much attention to the block read rate. ;-)
Assuming your disks can do 50MB/s sustained transfer rate each, you are pretty darn close to the theoretical maximum of (6 - 1) * 50MB/s = 250MB/s for a 6-disk RAID5.
At least the read speeds are performing as can be reasonably expected for this particular configuration. I am certainly happy with that result.
On reads, yes. 3Ware is clearly leveraging the ASIC's non-blocking I/O for reads from RAID-5, which basically act like RAID-0.
RAID5 as such is a bad choice for file systems that will have more than about 30% of writes (out of total I/O).
He still should be seeing at least 100MBps for RAID-5 writes on a 3Ware Escalade 9500S with 6-discs (180MBps is about the maximum for RAID-5 writes on the 9500S' ASIC with DRAM). The ASIC is fairly good at sequential writes to RAID-5, and there is enough DRAM to buffer all but the heaviest of random I/O.
Still, the new 9550SX series has a PowerPC. AMCC's influence is clearly being pressed on their 3Ware acquisition, as they are _the_ company for the IBM embedded PowerPC 400 line now. The 9550SX is supposed to be capable of 380MBps for RAID-5 writes -- double the 9500S's best benchmarks.
If most of the I/O will be writes, and you care about performance, you should use RAID-10.
Yep, mega-dittos on that point.
Remember, writes to a dumb, non-optimized RAID5 implementation are slower than writes to a single disk. This is generic RAID wisdom, nothing to do with any particular implementation. In the worst case scenario, a write operation on a 6-disk RAID5 volume involves reading a data block from 5 drives, calculating XOR, and writing back one block of data and one block of checksum. Whichever way you do it, it ain't gonna be fast.
Still, he shouldn't be seeing less than 100MBps writes on the 3Ware Escalade 9500S series with its on-board ASIC and DRAM buffer.
I'm going to upgrade the firmware to the latest revision which should improve the write performance.
At least the reads are right where they should be for his configuration. I'm curious how he is striping though? It might have been better to do a 12-disc RAID-5 and get close to 400MBps reads.
Or if performance was more important than efficiency, making one 6-disc volume RAID-10 would give close to 300MBps reads, 150MBps writes -- maybe higher.
I'm sticking with RAID-5 to maximize storage space while having some level of protection against drive failure.
Cheers! -Sean
Sean Staats sstaats@questia.com wrote:
I'm sticking with RAID-5 to maximize storage space while having some level of protection against drive failure.
If you're maximizing for storage space, then don't expect anywhere near the best write performance. ;->