[CentOS] 3ware disk failure -> hang

Fri Jan 6 23:42:37 UTC 2006
Bryan J. Smith <thebs413 at earthlink.net>

David Finch <david at mytsoftware.com> wrote:
> Glad to know it's not just me.

It's not.  I've seen it too.

> Using ext3 on a 3ware raid 5 of four 250gb disks.

Doesn't matter what the disks are.  The problem is the small
cache size on the 3Ware Escalade 7000/8000.  They only have
1-4MB of zero-wait-state SRAM.

1MB on the original 7200/7210/7400/7410/7800/7810 and 7000-2
and 8000-2, as well as the 7006-2 and 8006-2 (xxx6 = 66MHz
PCI).

2MB on the 7450/7850, which subsequently became the
7500-4/-8 for PATA, along with the 8500-4/-8 and 8506-4/-8
(xxx6 = 66MHz PCI).

4MB on the 7506-12.

SRAM is very expensive, in both size and cost.  It's the
memory used in CPU caches and networking ASICs.  But it has
little to no wait -- unlike DRAM, which is still 40-70ns on
reads (many, many wait cycles, typically 6-10 for today's
133-266MHz clocks).  That's why 3Ware calls the Escalade
7000/8000 series a "storage switch."  It's ideal for RAID-0,
1 and 10.
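
As a rough sanity check on those wait-cycle figures, here is a
minimal sketch that converts a read latency into whole clock
cycles.  It assumes a simple model (cycles = latency divided by
the clock period, rounded up); the function name is made up for
illustration, and real memory controllers are more complicated.

```python
import math


def wait_cycles(latency_ns: float, clock_mhz: float) -> int:
    """Whole clock cycles that elapse during latency_ns at clock_mhz.

    Simplified model (an assumption of this sketch): one cycle lasts
    1000 / clock_mhz nanoseconds, and a partial cycle counts as a
    full one.
    """
    cycle_ns = 1000.0 / clock_mhz  # one clock period in nanoseconds
    return math.ceil(latency_ns / cycle_ns)


# DRAM read latencies (40-70ns) and clocks (133-266MHz) quoted above.
for clock_mhz in (133, 266):
    low = wait_cycles(40, clock_mhz)
    high = wait_cycles(70, clock_mhz)
    print(f"{clock_mhz}MHz: {low}-{high} cycles")
```

Under this model the 133MHz case comes out at 6-10 cycles, and
the same latencies cost roughly twice as many cycles at 266MHz.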

This small cache is a serious issue when it comes to Ext3's
journal logic, especially with pre-2.4.18 kernels IIRC (maybe
it was 2.4.15?).  With only 2MB typical (4MB on the 7506-12),
an Ext3 journal commit exceeds the cache size -- so the card
"stalls" on the write while merely committing the journal.

> Writes are slow and seem to halt the server until they 
> complete, but it's not a server where response time or
> write speed is critical.

You can play with the kernel buffer settings.  Tuning them is
highly recommended for many of the 3Ware Escalade cards,
including the 9000 series.
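
Before changing anything, it helps to see what the kernel is
currently using.  The sketch below is only a starting point: it
assumes that "kernel buffer settings" means the vm.dirty_*
writeback tunables exposed under /proc/sys/vm on 2.6 kernels,
which is a guess on the part of this example and not a list from
3Ware.  The function name is hypothetical, and it only reads
values; it does not recommend or set any.

```python
from pathlib import Path

# Assumption: these writeback tunables are the relevant "kernel
# buffer settings". They exist on 2.6-era and later kernels.
TUNABLES = (
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_tunables(base="/proc/sys/vm"):
    """Return {name: int or None} for each tunable found under base.

    A missing or unreadable file maps to None, so the sketch also
    runs on kernels that lack some of these knobs.
    """
    values = {}
    for name in TUNABLES:
        path = Path(base) / name
        try:
            values[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_vm_tunables().items():
        print(f"vm.{name} = {value}")
```

Record the values it prints before experimenting, so you can put
them back if a change makes the stalls worse.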

But if performance is a consideration, do _not_ use RAID-5 on
the 3Ware Escalade 7000/8000.  Use RAID-10.  You can break
over 200MBps _writes_ with RAID-10 on the 7000/8000 series.


-- 
Bryan J. Smith     Professional, Technical Annoyance
b.j.smith at ieee.org      http://thebs413.blogspot.com
----------------------------------------------------
*** Speed doesn't kill, difference in speed does ***