[CentOS] disk I/O problems and Solutions

Fri Oct 9 19:20:50 UTC 2009
Martin Suehowicz <msuehowicz at rubiconproject.com>

RAID10 should do better on your writes. Random reads and writes are
what matter most for a DB, and for random I/O the more spindles (disks)
you have the better. I would use SAS over SATA if possible. How big is
your database? If it is small, you may be able to put it on a few
solid-state drives; they're really good for random I/O. I don't know of
a website with such benchmark figures, but one would be nice.

-----Original Message-----
From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
Behalf Of Alan McKay
Sent: Friday, October 09, 2009 9:45 AM
To: Alan McKay
Subject: [CentOS] disk I/O problems and Solutions

Hey folks,

CentOS / PostgreSQL shop over here.

I'm hitting 3 of my favorite lists with this, so here's hoping that
the BCC trick is the right way to do it :-)

We've just discovered thanks to a new Munin plugin
http://blogs.amd.co.at/robe/2008/12/graphing-linux-disk-io-statistics-with-munin.html
that our production DB is completely maxing out on I/O for about a
3-hour stretch from 6am until 9am.
This is "device utilization" as per the last graph at the above link.

Load went down for a while but is now between 70% and 95% sustained.
We've only had this plugin going for less than a day so I don't really
have any more data going back further.  But we've suspected a disk
issue for some time - just have not been able to prove it.
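
For anyone who wants to cross-check the graph by hand: on Linux,
"device utilization" is usually derived from the "time spent doing
I/Os" counter in /proc/diskstats (this is how iostat computes %util; I
am assuming the Munin plugin does something similar). A minimal sketch
follows. The function names are made up for illustration, and the
field position follows the kernel's documented diskstats layout.

```python
import time

def io_ticks_ms(diskstats_text, device):
    """Return the 'time spent doing I/Os' counter (ms) for one device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major, minor, name, then the per-device counters.
        # The 10th counter (list index 12) is milliseconds spent doing I/O.
        if len(fields) >= 13 and fields[2] == device:
            return int(fields[12])
    raise KeyError(device)

def utilization_percent(before_ms, after_ms, interval_s):
    """Share of the interval during which the device was busy, in percent."""
    busy = (after_ms - before_ms) / (interval_s * 1000.0)
    return min(100.0, 100.0 * busy)

def sample(device="sdb", interval_s=5.0, path="/proc/diskstats"):
    """Read the counter twice, interval_s apart, and return % utilization."""
    with open(path) as f:
        before = io_ticks_ms(f.read(), device)
    time.sleep(interval_s)
    with open(path) as f:
        after = io_ticks_ms(f.read(), device)
    return utilization_percent(before, after, interval_s)
```

Calling sample("sdb") during the 6am to 9am window should land close
to what the graph shows, if the assumption about the plugin holds.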

Our system
IBM 3650 - quad 2GHz E5405 Xeon
8K SAS RAID Controller
6 x 300G 15K RPM SAS Drives
/dev/sda - 2 drives configured as a RAID 1 for 300G for the OS
/dev/sdb - 3 drives configured as RAID5 for 600G for the DB
1 drive as a global hot spare

/dev/sdb is the one that is maxing out.

We need to have a very serious look at fixing this situation.   But we
don't have the money to be experimenting with solutions that won't
solve our problem.  And our budget is fairly limited.

Is there a public library somewhere of disk subsystems and their
performance figures?  Done with some semblance of a standard
benchmark?

One benchmark I am partial to is this one :
http://wiki.postgresql.org/wiki/PgCon_2009/Greg_Smith_Hardware_Benchmarking_notes#dd_test
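
To give an idea of what a test like that measures, here is a rough
Python analogue of a sequential-write timing. It is not the dd test
from the linked notes, only a sketch of the same idea. The function
name and defaults are my own; the 8 kB block size is chosen to match
PostgreSQL's default page size. For a meaningful number the file has
to be much larger than RAM, otherwise you mostly measure the page
cache.

```python
import os
import tempfile
import time

def sequential_write_mb_per_s(directory, total_mb=64, block_kb=8):
    """Write total_mb of zeros in block_kb chunks, fsync, return MB/s."""
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # count the time needed to reach the disk
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the scratch file
    return total_mb / elapsed
```

Point the directory argument at a path on the array under test and
raise total_mb well past the amount of RAM in the box.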

One thing I am thinking of in the immediate term is taking the RAID5 +
hot spare and converting it to RAID10 with the same amount of storage.
 Will that perform much better?
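
A back-of-the-envelope way to look at that conversion, using the usual
rule-of-thumb write penalties (4 backend I/Os per random write on
RAID5, 2 on RAID10). The per-disk IOPS figure and the read/write mix
passed in are assumptions for illustration, not measurements from this
system, and controller cache can change the picture considerably.

```python
# Rule-of-thumb backend I/Os caused by one random host write.
WRITE_PENALTY = {"raid5": 4, "raid10": 2}

def usable_gb(level, disks, disk_gb):
    """Usable capacity for the two RAID levels discussed here."""
    if level == "raid5":
        return (disks - 1) * disk_gb   # one disk's worth of parity
    if level == "raid10":
        return (disks // 2) * disk_gb  # every disk is mirrored
    raise ValueError(level)

def random_iops(level, disks, per_disk_iops, write_fraction):
    """Estimated host-visible random IOPS for a given read/write mix."""
    raw = disks * per_disk_iops
    # A host read costs 1 backend I/O; a host write costs the penalty.
    cost = (1 - write_fraction) + write_fraction * WRITE_PENALTY[level]
    return raw / cost

# Assumed: roughly 175 random IOPS per 15K drive, half the I/O is writes.
current = random_iops("raid5", 3, 175, 0.5)    # 3-disk RAID5
proposed = random_iops("raid10", 4, 175, 0.5)  # same disks plus the spare
```

Both layouts give 600G usable (usable_gb("raid5", 3, 300) and
usable_gb("raid10", 4, 300)), so the estimate only differs in the
extra spindle and the lower write penalty.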

In general we are planning to move away from RAID5 toward RAID10.

We also have on order an external IBM array (don't have the exact name
on hand but model number was 3000) with 12 drive bays.  We ordered it
with just 4 x SATAII drives, and were going to put it on a different
system as a RAID10.  These are just 7200 RPM drives - the goal was
cheaper storage because the SAS drives are about twice as much per
drive, and it is only a 300G drive versus the 1T SATA2 drives.   IIRC
the SATA2 drives are about $200 each and the SAS 300G drives about
$500 each.

So I have two thoughts for this 12-disk array.  One is to fill it up
with 12 x cheap SATA2 drives and hope that, even though the spin rate
is a lot slower, having more drives will make it perform better.  But
somehow I am doubtful about that.  The other thought is to bite the
bullet and fill it up with 300G SAS drives.
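
The two options can be put side by side with some simple arithmetic.
The prices and capacities are the ones quoted above; the per-drive
random IOPS numbers (75 for 7200 RPM, 175 for 15K) are commonly quoted
rules of thumb that I am assuming here, not benchmark results, and the
class and function names are only for illustration.

```python
from dataclasses import dataclass

@dataclass
class DriveOption:
    name: str
    price_usd: int
    capacity_gb: int
    iops: int  # assumed rule-of-thumb random IOPS per drive

def summarize(option, bays=12):
    """Cost, RAID10 capacity and raw random IOPS for a full shelf."""
    total_cost = option.price_usd * bays
    raw_iops = option.iops * bays
    return {
        "name": option.name,
        "cost_usd": total_cost,
        "raid10_usable_gb": (bays // 2) * option.capacity_gb,
        "raw_iops": raw_iops,
        "iops_per_usd": raw_iops / total_cost,
    }

SATA = DriveOption("7200 RPM SATA2 1T", 200, 1000, 75)
SAS = DriveOption("15K SAS 300G", 500, 300, 175)

for option in (SATA, SAS):
    print(summarize(option))
```

Under these assumed figures the spindle count does partly make up for
the slower drives, but the estimate says nothing about latency per
request, which tends to matter for a database.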

Any thoughts here?  Recommendations on what to do with a tight budget?
It could be the answer is that I just have to go back to the bean
counters and tell them we have no choice but to start spending some
real money.  But on what?  And how do I prove that this is the only
choice?


-- 
"Don't eat anything you've ever seen advertised on TV"
         - Michael Pollan, author of "In Defense of Food"