-----Original Message-----
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Jim Perrin
Sent: Friday, January 05, 2007 7:24 PM
To: CentOS mailing list
Subject: Re: [CentOS] Disk Elevator
On 1/5/07, Matt lm7812@gmail.com wrote:
Can anyone explain how the disk elevator works and if there is any way to tweak it? I have an email server which likely has a large number of read and write requests and was wondering if there was any way to improve performance.
Reasonably decent writeup. Gives a good overview, but I'm not sure how much detail you'd like. http://www.redhat.com/magazine/008jun05/features/schedulers/
The disk elevators or io schedulers are there to minimize head seek by re-ordering and merging requests to read or write data from common areas of the disk.
There are some tweaks to improve performance, but the performance gains are minimal on a RAID array (the elevators do not know the stripe size, as they were implemented with single-spindle drives in mind).
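For reference, a minimal sketch of checking which elevator is active. It assumes a 2.6 kernel that exposes /sys/block/<dev>/queue/scheduler, where the active scheduler is listed in square brackets. The device name sda and the helper name active_scheduler are illustrative only, and older 2.6 kernels may accept the choice only at boot time through the elevator= kernel parameter.

```shell
#!/bin/sh
# Print the active scheduler from a line in the sysfs format,
# e.g. "noop anticipatory deadline [cfq]" -> "cfq".
# active_scheduler is a hypothetical helper name, not a system tool.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Usage on a live system (sda is an assumed device name):
#   active_scheduler "$(cat /sys/block/sda/queue/scheduler)"
# Switching at runtime, as root, on kernels that support it:
#   echo deadline > /sys/block/sda/queue/scheduler
```

Whether a different elevator helps depends on the workload, so it is worth measuring before and after any change.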
The biggest performance gain you can achieve on a RAID array is to make sure you format the volume aligned to your RAID stripe size. For example, if you have a 4-drive RAID 5 using 64K chunks, each stripe spans 256K on disk, of which 192K is data and 64K is parity. mke2fs takes the stride as the number of filesystem blocks per chunk, so with a 4K filesystem block size the stride is 16 (64/4). So when you format your volume:
mke2fs -E stride=16 (other needed options: -j for ext3, -N <# of inodes> for a larger number of inodes, -O dir_index to speed up directory searches with a large number of files) /dev/XXXX
By aligning the filesystem to the array stripe size you can minimize short-write penalties on your array, which will speed up writes. The -O dir_index option speeds up directory lookups a little, but reads will gain performance anyway once the write penalties are reduced.
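The alignment arithmetic can be sketched in shell. This assumes the definition in the mke2fs man page, where stride is the number of filesystem blocks in one RAID chunk, and it counts only data-bearing disks toward the usable stripe. The helper names stride_blocks and stripe_data_kb are hypothetical, and all sizes are in KiB.

```shell
#!/bin/sh
# stride: filesystem blocks per RAID chunk (chunk size / block size).
# Arguments are in KiB. Hypothetical helper name.
stride_blocks() {
    chunk_kb=$1
    block_kb=$2
    echo $(( chunk_kb / block_kb ))
}

# Data carried by one full stripe, in KiB: chunk size times the number
# of data-bearing disks (total disks minus parity disks).
stripe_data_kb() {
    chunk_kb=$1
    total_disks=$2
    parity_disks=$3
    echo $(( chunk_kb * (total_disks - parity_disks) ))
}

# Example: 4-drive RAID 5, 64K chunks, 4K filesystem blocks.
#   stride_blocks 64 4      -> 16
#   stripe_data_kb 64 4 1   -> 192
```

The result of stride_blocks is the value that would be passed as stride= to mke2fs. Check the chunk size your controller or md array actually reports before formatting.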
A short write penalty occurs when the data written to an array is shorter than a full stripe: the remaining blocks then have to be read back from the stripe in order to compute the new parity. If the OS knows the stripe size, each stripe can be cached beforehand by read-ahead, so when a write comes in it should already have all the data it needs to write the full stripe to disk. It can also give hints to the page cache for combining separate I/O that falls in the same stripe.
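As a rough illustration of when a write is "short", the sketch below classifies a write as full or partial from its offset, length, and the amount of data in one stripe. It is a simplification under stated assumptions: write_kind is a hypothetical helper, all values are in KiB, and real controllers and the page cache may coalesce adjacent writes before they reach the disks.

```shell
#!/bin/sh
# Classify a write as "full" (covers only whole stripes, so parity can
# be computed from the new data alone) or "partial" (the rest of a
# stripe must be read back first). Hypothetical helper name.
write_kind() {
    offset_kb=$1
    length_kb=$2
    stripe_kb=$3
    if [ "$length_kb" -gt 0 ] \
        && [ $(( offset_kb % stripe_kb )) -eq 0 ] \
        && [ $(( length_kb % stripe_kb )) -eq 0 ]; then
        echo full
    else
        echo partial
    fi
}

# Examples with 192K of data per stripe (three 64K data chunks):
#   write_kind 0 192 192    -> full
#   write_kind 0 64 192     -> partial
#   write_kind 64 192 192   -> partial (misaligned start)
```

The third example shows why alignment matters: a write of exactly one stripe's worth of data still touches two stripes if it does not start on a stripe boundary.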
-Ross