[CentOS-virt] steadily increasing/high loadavg without i/o wait or cpu utilization

Fri Nov 20 09:07:52 UTC 2009
Günter Zimmermann <guenter.zimmermann at gmx.at>

Thank you for your reply. I found just one process in D state: the
delayed resync of another RAID device. It is delayed because the big
resync is still in progress, so I don't think it is the problem. Could
the running resync itself cause the high loadavg without showing up as
i/o wait or CPU utilization in top?

[root@vserver ~]# ps -eo stat,command | awk '$1 ~ /^[DRZ]/{print}'
R<   [migration/0]
R<   [watchdog/0]
R<   [events/0]
R<   [kblockd/0]
R<   [md2_raid10]
R<   [md2_resync]
R<   [kondemand/0]
D<   [md0_resync]
R+   ps -eo stat,command
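
On Linux the load average counts tasks in uninterruptible sleep (D) as
well as runnable ones (R), so it can be compared against a count of
those two states. A minimal sketch, assuming procps ps and awk; the
function name count_load_states is made up for illustration:

```shell
# count_load_states: hypothetical helper. Reads one STAT value per line on
# stdin and prints how many are runnable (R) and uninterruptible (D).
count_load_states() {
    awk '$1 ~ /^R/ { r++ } $1 ~ /^D/ { d++ } END { printf "R=%d D=%d\n", r, d }'
}

# Live usage: "stat=" suppresses the header line. The ps process itself is
# counted as one R task. Compare with the first three fields of /proc/loadavg.
# ps -eo stat= | count_load_states
# cat /proc/loadavg
```

A single sample is only a snapshot, while loadavg is a moving average,
so the two numbers will not match exactly.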

[root@vserver ~]# cat /proc/mdstat
Personalities : [raid10] [raid1]
md0 : active raid1 sda1[0] sdb1[1] sdc1[2] sdd1[3] sde1[4] sdf1[5] sdg1[6] sdh1[7]
      136448 blocks [8/8] [UUUUUUUU]
          resync=DELAYED
     
md2 : active raid10 sdh3[7] sdg3[6] sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb3[1] sda3[0]
      3898460928 blocks 4K chunks 2 near-copies [8/8] [UUUUUUUU]
      [=========>...........]  resync = 47.9% (1870223296/3898460928) finish=652.9min speed=51769K/sec
     
md1 : active raid10 sdh2[7] sdg2[6] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0]
      8031232 blocks 256K chunks 2 near-copies [8/8] [UUUUUUUU]
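
The finish estimate for md2 can be cross-checked from the other numbers
in the same output: remaining blocks divided by the current speed. A
rough sketch, assuming the block counts are in 1K units like the K/sec
speed; resync_eta_min is a hypothetical helper name:

```shell
# resync_eta_min: hypothetical helper. Arguments are blocks already synced,
# total blocks, and speed in K/sec; prints the remaining time in minutes.
resync_eta_min() {
    awk -v synced="$1" -v total="$2" -v speed="$3" \
        'BEGIN { printf "%.1f\n", (total - synced) / speed / 60 }'
}

# Values taken from the md2 resync line above.
resync_eta_min 1870223296 3898460928 51769
```

This gives about 653 minutes, in line with the finish=652.9min reported
by the kernel, so the resync is progressing at the displayed rate.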



Christopher G. Stach II wrote:
> ----- "Günter Zimmermann" <guenter.zimmermann at gmx.at> wrote:
>
>   
>> Hi all,
>>
>> I just installed the CentOS 5.4 Xen kernel on an Intel Core i5 machine
>> as dom0. After some hours of syncing a RAID10 array (8 SATA disks) I
>> noticed a steadily increasing loadavg. Without significant i/o wait or
>> CPU utilization, I think the loadavg on this system should be much
>> lower. If this loadavg is normal, I would be grateful if someone could
>> explain why. The screenshots below show that there is neither much i/o
>> wait nor much CPU utilization.
>>     
>
> Do you have any zombie or D state processes? Try:
>
> ps -eo stat,command | awk '$1 ~ /^[DRZ]/{print}'
>
> If you have any in D, you can use SysRq-T and/or the pid's wchan in /proc to figure out what they're doing or dmesg to figure out where they may have barfed.
>
>
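
The wchan lookup suggested above can be done for all D state tasks in
one pass, since ps can print the wait channel as a column. A minimal
sketch, assuming procps ps; d_state_wchan is a hypothetical helper name:

```shell
# d_state_wchan: hypothetical helper. Reads "PID STAT WCHAN COMMAND" lines
# on stdin and prints PID, wait channel and command for tasks in D state.
# The ps header line is skipped naturally because "STAT" does not start with D.
d_state_wchan() {
    awk '$2 ~ /^D/ { print $1, $3, $4 }'
}

# Live usage (wchan:32 widens the column so symbol names are not cut off):
# ps -eo pid,stat,wchan:32,comm | d_state_wchan
```

The wait channel is the kernel function the task is sleeping in, which
should show whether md0_resync is simply waiting for the md2 resync to
finish.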