On Thu, Mar 11, 2010 at 11:02:33AM -0800, Mathew S. McCarrell wrote:
> Is anyone else having major I/O peaks due to logrotate or other jobs
> running simultaneously across multiple guests.  I have one KVM server
> running Centos 5.4 with local disk that is seriously suffering as
> most of the guests rotate their syslog at the same time.
>
> Looking at the KVM server I'm seeing
>
> 11:00:01 PM   CPU   %user   %nice   %system   %iowait   %steal   %idle
> 03:40:01 AM   all    0.07    0.00      2.74      0.93     0.00   96.26
> 03:50:01 AM   all    0.07    0.00      1.17      1.18     0.00   97.58
> 04:00:01 AM   all    0.08    0.00      1.51      0.82     0.00   97.59
> 04:10:02 AM   all    0.53    0.03     15.31     51.61     0.00   32.53
> 04:20:01 AM   all    0.28    0.12      4.12     22.21     0.00   73.27
> 04:30:01 AM   all    0.07    0.00      0.80      1.21     0.00   97.92
> 04:40:01 AM   all    0.07    0.00      2.60      1.81     0.00   95.52
> 04:50:01 AM   all    0.08    0.00      0.79      1.44     0.00   97.69
>
> On one of the guests running Centos 4.6 the impact is so bad I get
> DMA timeout errors in the syslog, and occasional kernel panics.
>
> Mar 11 04:05:04 localhost kernel: hda: dma_timer_expiry: dma status == 0x21
> Mar 11 04:05:14 localhost kernel: hda: DMA timeout error
> Mar 11 04:05:14 localhost kernel: hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
> Mar 11 04:05:14 localhost kernel:
> Mar 11 04:05:14 localhost kernel: ide: failed opcode was: unknown
> Mar 11 04:05:59 localhost kernel: hda: dma_timer_expiry: dma status == 0x21
> Mar 11 04:06:14 localhost kernel: hda: DMA timeout error
> Mar 11 04:06:14 localhost kernel: hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
>
> One reference I've found is at
> * http://lonesysadmin.net/linux-virtual-machine-tuning-guide/
>
> This suggests avoiding running scheduled jobs simultaneously across
> guests, and suggests using a random sleep. I think this is a pretty good suggestion.
>
> Does anyone else have suggestions on reducing the impact of
> cron/logrotate.
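
For what it's worth, the random sleep idea mentioned above could be
sketched roughly like this.  It is only an illustration, not something
taken from the guide: random_delay is a hypothetical helper name, the
1800 second window in the usage comment is an assumed value, and the
sketch relies on /dev/urandom and od being available in the guest.

```sh
# random_delay MAX: print a pseudo-random number of seconds in [0, MAX).
# Two bytes are read from /dev/urandom, giving a value in 0..65535,
# so MAX should stay below 65536.
random_delay() {
    max=$1
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $(( $n % $max ))
}

# Assumed usage at the top of a guest's cron script, before the real work:
#   sleep "$(random_delay 1800)"
#   /usr/sbin/logrotate /etc/logrotate.conf
```

Each guest then starts its rotation at a different point inside the
window, at the cost of the job starting up to half an hour late.
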
You might also consider increasing the device timeouts on your block
devices at the guest level:

    echo 120 > /sys/block/sda/device/timeout

etc, etc.

That or increase the performance of your storage :)

> I ran into this issue as well on a box running Xen with local
> storage.
>
> My solution was to modify /etc/crontab to run /etc/cron.weekly at
> different times for each guest and for the dom0.  I modified the
> entry on each VM to be 10 minutes after the previous one and have not
> seen any load spikes since then.
>
> Matt

Ray
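
A rough sketch of the 10-minute staggering described above, in case it
helps anyone generate the entries rather than edit them by hand.
stagger_entry is a hypothetical helper name, and the 04:22 Sunday base
time is an assumption meant to match the stock CentOS 5 /etc/crontab
entry for cron.weekly, so check your own file first.

```sh
# stagger_entry INDEX: print an /etc/crontab line for cron.weekly,
# shifted INDEX * 10 minutes from an assumed base time of 04:22 on Sunday.
# INDEX 0 is the first machine (e.g. the dom0), 1 the next guest, and so on.
# No wrap-around handling: INDEX must be small enough to stay before midnight.
stagger_entry() {
    index=$1
    total=$(( 4 * 60 + 22 + $index * 10 ))   # minutes since midnight
    hour=$(( $total / 60 ))
    minute=$(( $total % 60 ))
    printf '%02d %d * * 0 root run-parts /etc/cron.weekly\n' "$minute" "$hour"
}

# Example: print the entries for a dom0 plus three guests.
for i in 0 1 2 3; do
    stagger_entry "$i"
done
```

The same offset could be applied to the cron.daily entry, which is
presumably where the logrotate run comes from on most guests.
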