[CentOS] disk I/O problems and Solutions

Pasi Kärkkäinen pasik at iki.fi
Fri Oct 9 18:19:56 UTC 2009


On Fri, Oct 09, 2009 at 12:45:14PM -0400, Alan McKay wrote:
> Hey folks,
> 
> CentOS / PostgreSQL shop over here.
> 
> I'm hitting 3 of my favorite lists with this, so here's hoping that
> the BCC trick is the right way to do it :-)
> 
> We've just discovered thanks to a new Munin plugin
> http://blogs.amd.co.at/robe/2008/12/graphing-linux-disk-io-statistics-with-munin.html
> that our production DB is completely maxing out in I/O for about a 3
> hour stretch from 6am til 9am
> This is "device utilization" as per the last graph at the above link.
> 
> Load went down for a while but is now between 70% and 95% sustained.
> We've only had this plugin going for less than a day so I don't really
>  have any more data going back further.  But we've suspected a disk
> issue for some time - just have not been able to prove it.
> 
> Our system
> IBM 3650 - quad 2Ghz e5405 Xeon
> 8K SAS RAID Controller
> 6 x 300G 15K/RPM SAS Drives
> /dev/sda - 2 drives configured as a RAID 1 for 300G for the OS
> /dev/sdb - 3 drives configured as RAID5 for 600G for the DB
> 1 drive as a global hot spare
> 
> /dev/sdb is the one that is maxing out.
> 
> We need to have a very serious look at fixing this situation.   But we
> don't have the money to be experimenting with solutions that won't
> solve our problem.  And our budget is fairly limited.
> 
> Is there a public library somewhere of disk subsystems and their
> performance figures?  Done with some semblance of a standard
> benchmark?
> 
> One benchmark I am partial to is this one :
> http://wiki.postgresql.org/wiki/PgCon_2009/Greg_Smith_Hardware_Benchmarking_notes#dd_test
> 
> One thing I am thinking of in the immediate term is taking the RAID5 +
> hot spare and converting it to RAID10 with the same amount of storage.
>  Will that perform much better?
> 

Does your RAID controller have a battery-backed write cache? That's needed
to get good write performance from RAID5.

Changing to RAID-10 will definitely help, since it avoids the parity
read-modify-write that RAID5 does on every small write.

> In general we are planning to move away from RAID5 toward RAID10.
> 

That's good. RAID5 isn't a good fit for database workloads.
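
As a rough illustration of why: by the usual rule of thumb, a small random
write costs four disk I/Os on RAID5 (read data, read parity, write data,
write parity) and two on RAID10. The sketch below applies that rule to the
current 3-disk RAID5 and to a 4-disk RAID10 built from the same drives plus
the hot spare. The 180 IOPS per-disk figure and the 50% write mix are
assumptions, not measurements of this system, and controller cache is ignored.

```python
# Rule-of-thumb write penalties: backend disk I/Os per small random write.
WRITE_PENALTY = {"raid5": 4, "raid10": 2}

def array_iops(disks, per_disk_iops, level, write_fraction):
    """Estimate host-visible random IOPS for an array, ignoring cache."""
    raw = disks * per_disk_iops
    penalty = WRITE_PENALTY[level]
    # Each read costs 1 backend I/O; each write costs `penalty` backend I/Os.
    cost_per_host_io = (1 - write_fraction) + write_fraction * penalty
    return raw / cost_per_host_io

PER_DISK = 180  # assumed random IOPS for one 15k SAS drive

for level, disks in (("raid5", 3), ("raid10", 4)):
    est = array_iops(disks, PER_DISK, level, write_fraction=0.5)
    print(f"{level} x{disks}: ~{est:.0f} IOPS at 50% writes")
```

The absolute numbers matter less than the ratio, which grows as the
workload becomes more write-heavy.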

> We also have on order an external IBM array (don't have the exact name
> on hand but model number was 3000) with 12 drive bays.  We ordered it
> with just 4 x SATAII drives, and were going to put it on a different
> system as a RAID10.  These are just 7200 RPM drives - the goal was
> cheaper storage because the SAS drives are about twice as much per
> drive, and it is only a 300G drive versus the 1T SATA2 drives.   IIRC
> the SATA2 drives are about $200 each and the SAS 300G drives about
> $500 each.
> 
> So I have 2 thoughts with this 12 disk array.   1 is to fill it up
> with 12 x cheap SATA2 drives and hope that even though the spin-rate
> is a lot slower, that the fact that it has more drives will make it
> perform better.  But somehow I am doubtful about that.   The other
> thought is to bite the bullet and fill it up with 300G SAS drives.
> 
> any thoughts here?  recommendations on what to do with a tight budget?
>   It could be the answer is that I just have to go back to the bean
> counters and tell them we have no choice but to start spending some
> real money.  But on what?  And how do I prove that this is the only
> choice?
> 

I'd use ltp disktest to measure the performance under various
workloads: sequential and random, with different numbers of threads and
different block sizes.
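
To show the shape of such a test matrix, here is a minimal Python sketch.
It is not disktest and does not use its options; the function names
(make_scratch_file, measure_iops) are made up for illustration. It reads
through the page cache, so for real numbers point it at a file on the array
under test that is much larger than RAM, or use a tool that does direct I/O.

```python
import os
import random
import tempfile
import threading
import time

def make_scratch_file(size_bytes):
    """Create a throwaway file of random bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path

def measure_iops(path, block_size, threads, random_access, seconds=1.0):
    """Return reads completed per second, summed over all worker threads."""
    blocks = max(1, os.path.getsize(path) // block_size)
    counts = [0] * threads
    deadline = time.monotonic() + seconds

    def worker(idx):
        fd = os.open(path, os.O_RDONLY)
        try:
            block = idx  # sequential readers start at staggered blocks
            while time.monotonic() < deadline:
                if random_access:
                    block = random.randrange(blocks)
                else:
                    block = (block + 1) % blocks
                os.pread(fd, block_size, block * block_size)
                counts[idx] += 1
        finally:
            os.close(fd)

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return sum(counts) / seconds

# Demo on a small scratch file (cached in RAM, so this only shows the method).
path = make_scratch_file(4 * 1024 * 1024)
try:
    for random_access in (False, True):
        for threads in (1, 4):
            for block_size in (4096, 65536):
                iops = measure_iops(path, block_size, threads,
                                    random_access, seconds=0.2)
                mode = "random" if random_access else "sequential"
                print(f"{mode:10s} threads={threads} bs={block_size}: "
                      f"{iops:.0f} reads/s")
finally:
    os.remove(path)
```

For a database the small-block random rows are the interesting ones; the
sequential rows mostly describe backups and bulk loads.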

Each disk can only do so many random IOPS. A 15k SAS drive will
be 2-3x faster than a 7200 rpm SATA disk, but 4 disks in total still
isn't THAT much.
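
The per-disk ceiling follows from mechanics: a random I/O costs roughly one
average seek plus half a rotation. A back-of-the-envelope estimate, where
the seek times (3.5 ms for 15k SAS, 8.5 ms for 7200 rpm SATA) are assumed
typical values, not specs of the drives in question:

```python
def disk_random_iops(rpm, avg_seek_ms):
    """Approximate random IOPS of one spinning disk (no cache, no queueing)."""
    half_rotation_ms = 0.5 * 60_000 / rpm  # average rotational latency
    return 1000 / (avg_seek_ms + half_rotation_ms)

sas_15k = disk_random_iops(15_000, avg_seek_ms=3.5)  # assumed seek time
sata_7k = disk_random_iops(7_200, avg_seek_ms=8.5)   # assumed seek time

print(f"15k SAS:   ~{sas_15k:.0f} IOPS per disk")
print(f"7200 SATA: ~{sata_7k:.0f} IOPS per disk")
print(f"ratio:     ~{sas_15k / sata_7k:.1f}x")
print(f"raw total: 4 x SATA ~{4 * sata_7k:.0f}, 12 x SATA ~{12 * sata_7k:.0f}")
```

These are raw spindle totals before any RAID write penalty, so they are an
upper bound on what the array can deliver for random I/O.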

-- Pasi



