[CentOS-virt] SATA vs RAID5 vs VMware

Fri Sep 25 13:30:38 UTC 2009
Filipe Brandenburger <filbranden at gmail.com>

Hi,

On Thu, Sep 24, 2009 at 21:18, Philip Gwyn <liste at artware.qc.ca> wrote:
> The problem seems like a disk problem.  I grow to suspect that SATA isn't ready
> for the big time.  I also grow to dislike RAID5.
>
> Questions :
> - Anyone have a clue or other on how to track down my bottle neck?

You can use the command "iostat -kx 1 /dev/sd?", which will give you
more information about what is happening. In particular it shows
%util, which tells you how busy each drive is, and you can correlate
that with rkB/s and wkB/s to see how much data is being read from or
written to each specific drive. It also reports averages for the
request size (to tell whether you have many small operations or a few
big ones), queue size, service time and wait time. See "man iostat"
for more details. It's not installed by default on CentOS 5, but it's
available from the base repositories; just run "yum install sysstat"
if you don't have it yet.
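
For example, something like this (the sd? glob is just a placeholder,
adjust it to match your actual drives):

  # install the sysstat package, which provides iostat
  yum install sysstat

  # extended per-device statistics in kB, refreshed every second
  iostat -kx 1 /dev/sd?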

If you are using RAID-5, you might want to check whether the chunk
size you are using is appropriate. You can specify it when you create
a new array with the "-c" option to mdadm; I don't think you can
change it after the array has been created. The default is 64kB,
which sounds sane enough, but you might want to check whether yours
was created with that value or not.
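
For instance, something along these lines shows the chunk size of an
existing array and sets it on a new one (the device names below are
only examples, not your actual layout):

  # show the chunk size of an existing array
  mdadm --detail /dev/md0 | grep -i chunk

  # /proc/mdstat also prints the chunk size for striped levels
  cat /proc/mdstat

  # set the chunk size (in kB) when creating a new array
  mdadm --create /dev/md0 --level=5 --raid-devices=4 --chunk=128 /dev/sd[bcde]1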

The problem is basically this: if you have operations that are larger
than the chunk size, they will require reads or writes on all the
disks, which means all of them have to seek to a specific position to
complete your request, and while they are doing that they cannot work
on any other requests. If you have high usage and random access, the
disks will spend a lot of time seeking. If that is the case, you
might want to increase the chunk size so that most operations can be
fulfilled by one disk alone, leaving the others free to work on other
requests at the same time.

On the other hand, if specific areas of your filesystem are hit more
often and those areas always fall on the same disk, that disk will be
used more than the others, so your performance will effectively be
limited by that one disk instead of multiplied by the number of disks
as you would expect from striped access. In that case it might make
sense to reduce the chunk size in order to spread the access more
evenly across the disks.

I read some time ago that ext2/ext3 allocates certain blocks in a
pattern that can create exactly that kind of uneven distribution when
you stripe across certain numbers of disks. I don't know exactly how
that works, but you might want to look into it. I remember that when
you create the ext2/ext3 filesystem you can pass an option such as
"stride=..." to give it a hint about the RAID layout, so that the
filesystem can stagger those blocks enough to spread the load across
the disks. I could never figure out exactly which "stride=..." value
made sense for me, and the documentation is kind of scarce in this
area, but check the mke2fs manpage anyway if one disk is "hotter"
than the others and you think that might be the problem. You can also
experiment with other filesystems such as XFS, which is available
from the extras repository.
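
As far as I can tell, the usual recommendation is stride = chunk size
divided by the filesystem block size, so a 64kB chunk with 4kB blocks
would give 16. As an example only (the device name is a placeholder,
and older mke2fs versions spell the option "-R stride=" rather than
"-E stride="):

  # example: 64kB chunk / 4kB block size = stride of 16
  mke2fs -j -b 4096 -E stride=16 /dev/md0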

And of course, make sure "cat /proc/mdstat" shows that everything is
OK; confirm you aren't running a degraded array before you start
investigating its performance.
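
What you want to see there is all members marked as up, something
like this:

  cat /proc/mdstat
  # a healthy 4-disk RAID-5 shows all members up, e.g. [UUUU];
  # an underscore such as [UU_U] means a member is missing or failed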

I'm sure there is more performance tuning that can be done with,
e.g., hdparm, by tweaking values in the /proc and /sys filesystems,
or by changing the kernel I/O scheduler, but I'm not really
experienced with that so I can't advise you there. Others will have
that experience and will be able to give you pointers; you might
want to ask on the main list in that case, instead of the -virt one.
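
Just as an example of the kind of thing I mean, I believe the I/O
scheduler (elevator) can be inspected and changed per disk through
sysfs at runtime (sda here is only a placeholder):

  # list available schedulers; the active one is shown in brackets
  cat /sys/block/sda/queue/scheduler

  # switch this disk to the deadline scheduler, for instance
  echo deadline > /sys/block/sda/queue/scheduler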

HTH,
Filipe