[CentOS] disk I/O problems and Solutions

Alan McKay alan.mckay at gmail.com
Fri Oct 9 17:45:14 UTC 2009


Hey folks,

CentOS / PostgreSQL shop over here.

I'm hitting 3 of my favorite lists with this, so here's hoping that
the BCC trick is the right way to do it :-)

We've just discovered, thanks to a new Munin plugin
http://blogs.amd.co.at/robe/2008/12/graphing-linux-disk-io-statistics-with-munin.html
that our production DB is completely maxing out on I/O for about a
3-hour stretch from 6am until 9am. This is "device utilization" as per
the last graph at the above link.
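
For anyone who wants to cross-check that number without Munin, here is
a minimal sketch. It assumes the plugin's "device utilization" is the
same quantity iostat reports as %util, i.e. the share of wall-clock time
the device had I/O in flight, taken from the "time spent doing I/Os"
counter in /proc/diskstats. The function names and the 5 second interval
are my own choices for illustration, not anything from the plugin.

```python
import time

def io_ticks_ms(diskstats_text, device):
    """Return the 'milliseconds spent doing I/O' counter for one device."""
    for line in diskstats_text.splitlines():
        parts = line.split()
        # Layout: major, minor, name, then the per-device counters.
        # Index 12 is the time (ms) during which the device had I/O in flight.
        if len(parts) >= 13 and parts[2] == device:
            return int(parts[12])
    raise ValueError("device %r not found" % device)

def utilization_pct(before_text, after_text, device, interval_s):
    """Percent of the interval during which the device was busy."""
    busy_ms = io_ticks_ms(after_text, device) - io_ticks_ms(before_text, device)
    return 100.0 * busy_ms / (interval_s * 1000.0)

def sample(device="sdb", interval_s=5.0):
    """Take two snapshots of /proc/diskstats and return utilization in %."""
    with open("/proc/diskstats") as f:
        before = f.read()
    time.sleep(interval_s)
    with open("/proc/diskstats") as f:
        after = f.read()
    return utilization_pct(before, after, device, interval_s)
```

Calling sample("sdb") on the DB box during the morning window should
land near what the graph shows, if the assumption above holds.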

Utilization went down for a while after that, but is now sustained
between 70% and 95%. We've only had this plugin running for less than a
day, so I don't have any data going back further. But we've suspected a
disk issue for some time; we just have not been able to prove it.

Our system:
IBM 3650 - quad 2GHz E5405 Xeon
8K SAS RAID Controller
6 x 300G 15K RPM SAS drives
/dev/sda - 2 drives configured as RAID1 for 300G for the OS
/dev/sdb - 3 drives configured as RAID5 for 600G for the DB
1 drive as a global hot spare

/dev/sdb is the one that is maxing out.

We need to have a very serious look at fixing this situation.   But we
don't have the money to be experimenting with solutions that won't
solve our problem.  And our budget is fairly limited.

Is there a public library somewhere of disk subsystems and their
performance figures?  Done with some semblance of a standard
benchmark?

One benchmark I am partial to is this one :
http://wiki.postgresql.org/wiki/PgCon_2009/Greg_Smith_Hardware_Benchmarking_notes#dd_test

One thing I am thinking of in the immediate term is taking the RAID5 +
hot spare and converting it to RAID10 with the same amount of storage.
 Will that perform much better?
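
To make the arithmetic behind that idea concrete, here is a rough
sketch. The capacity figures follow directly from the drive counts
above. The write penalties (4 back-end I/Os per small random write on
RAID5, 2 on RAID10) are the usual rule of thumb and an assumption on my
part, not something measured on this controller; caching on the RAID
card can change the picture.

```python
# Rule-of-thumb back-end I/Os per small random write (assumed, not measured).
WRITE_PENALTY = {"raid5": 4, "raid10": 2}

def usable_gb(level, drives, size_gb):
    """Usable capacity for an array of identical drives."""
    if level == "raid5":
        return (drives - 1) * size_gb      # one drive's worth goes to parity
    if level == "raid10":
        if drives % 2:
            raise ValueError("RAID10 needs an even number of drives")
        return (drives // 2) * size_gb     # every drive is mirrored
    raise ValueError("unknown RAID level: %r" % level)

def write_drive_equivalents(level, drives):
    """Random-write capacity, in units of 'one bare drive'."""
    return drives / float(WRITE_PENALTY[level])

current = ("raid5", 3)    # today's /dev/sdb
proposed = ("raid10", 4)  # same 3 drives plus the hot spare

for level, drives in (current, proposed):
    print("%-6s x%d: %4d GB usable, ~%.2f drives' worth of random writes"
          % (level, drives, usable_gb(level, drives, 300),
             write_drive_equivalents(level, drives)))
```

Under those assumptions both layouts give 600G, and the 4-drive RAID10
has noticeably more random-write headroom, at the cost of giving up the
hot spare.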

In general we are planning to move away from RAID5 toward RAID10.

We also have on order an external IBM array (I don't have the exact name
on hand, but the model number was 3000) with 12 drive bays. We ordered it
with just 4 x SATA2 drives, and were going to put it on a different
system as a RAID10. These are just 7200 RPM drives. The goal was
cheaper storage, because the SAS drives cost more than twice as much per
drive and hold only 300G versus 1T for the SATA2 drives. IIRC the SATA2
drives are about $200 each and the 300G SAS drives about $500 each.

So I have 2 thoughts for this 12-disk array. One is to fill it up
with 12 x cheap SATA2 drives and hope that, even though the spin rate
is a lot slower, having more drives will make it perform better. But
somehow I am doubtful about that. The other thought is to bite the
bullet and fill it up with 300G SAS drives.
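
As a back-of-envelope way to compare the two options, here is a sketch
using the prices and sizes quoted above, with both arrays as RAID10.
Average rotational latency follows from the RPM (half a revolution).
The average seek times (8.5 ms for the SATA2 drives, 3.5 ms for the SAS
drives) are assumed typical values, not figures for the actual drives,
so treat the IOPS numbers as rough estimates only.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half of one revolution, in ms."""
    return 0.5 * 60000.0 / rpm

def est_drive_iops(rpm, avg_seek_ms):
    """Rough random IOPS for one drive: 1 / (seek + rotational latency)."""
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))

# Prices and sizes are from the post; seek_ms values are ASSUMED.
OPTIONS = {
    "12 x 1T SATA2 7200 RPM": dict(drives=12, size_gb=1000, price=200,
                                   rpm=7200, seek_ms=8.5),
    "12 x 300G SAS 15K RPM": dict(drives=12, size_gb=300, price=500,
                                  rpm=15000, seek_ms=3.5),
}

def summarize(opt):
    """Cost, RAID10 usable capacity and rough random IOPS for one option."""
    per_drive = est_drive_iops(opt["rpm"], opt["seek_ms"])
    return {
        "cost": opt["drives"] * opt["price"],
        "usable_gb": (opt["drives"] // 2) * opt["size_gb"],
        "read_iops": opt["drives"] * per_drive,         # all spindles serve reads
        "write_iops": opt["drives"] * per_drive / 2.0,  # each write hits 2 drives
    }

for name, opt in OPTIONS.items():
    s = summarize(opt)
    print("%s: $%d, %d GB usable, ~%.0f read IOPS, ~%.0f write IOPS"
          % (name, s["cost"], s["usable_gb"], s["read_iops"], s["write_iops"]))
```

With those assumed seek times, 12 SATA2 drives fall well short of 12 SAS
drives on random I/O, though they cost less than half as much and hold
more than three times the data.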

Any thoughts here? Recommendations on what to do with a tight budget?
It could be that the answer is that I just have to go back to the bean
counters and tell them we have no choice but to start spending some
real money. But on what? And how do I prove that this is the only
choice?


-- 
“Don't eat anything you've ever seen advertised on TV”
         - Michael Pollan, author of "In Defense of Food"

