Is anyone else seeing major I/O peaks caused by logrotate or other jobs running simultaneously across multiple guests? I have one KVM server running CentOS 5.4 with local disk that is seriously suffering because most of the guests rotate their syslog at the same time.
Looking at the KVM server, I'm seeing:
11:00:01 PM  CPU   %user   %nice  %system  %iowait  %steal   %idle
03:40:01 AM  all    0.07    0.00     2.74     0.93    0.00   96.26
03:50:01 AM  all    0.07    0.00     1.17     1.18    0.00   97.58
04:00:01 AM  all    0.08    0.00     1.51     0.82    0.00   97.59
04:10:02 AM  all    0.53    0.03    15.31    51.61    0.00   32.53
04:20:01 AM  all    0.28    0.12     4.12    22.21    0.00   73.27
04:30:01 AM  all    0.07    0.00     0.80     1.21    0.00   97.92
04:40:01 AM  all    0.07    0.00     2.60     1.81    0.00   95.52
04:50:01 AM  all    0.08    0.00     0.79     1.44    0.00   97.69
On one of the guests, running CentOS 4.6, the impact is so bad that I get DMA timeout errors in the syslog and occasional kernel panics.
Mar 11 04:05:04 localhost kernel: hda: dma_timer_expiry: dma status == 0x21
Mar 11 04:05:14 localhost kernel: hda: DMA timeout error
Mar 11 04:05:14 localhost kernel: hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
Mar 11 04:05:14 localhost kernel:
Mar 11 04:05:14 localhost kernel: ide: failed opcode was: unknown
Mar 11 04:05:59 localhost kernel: hda: dma_timer_expiry: dma status == 0x21
Mar 11 04:06:14 localhost kernel: hda: DMA timeout error
Mar 11 04:06:14 localhost kernel: hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
One reference I've found is at http://lonesysadmin.net/linux-virtual-machine-tuning-guide/
It recommends not running scheduled jobs simultaneously across guests, and suggests using a random sleep to spread them out.
Does anyone else have suggestions on reducing the impact of cron/logrotate?
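For reference, the random-sleep idea could look roughly like the sketch below. It assumes bash (for `$RANDOM`); the helper name `random_delay` and the 30-minute window are made up for illustration and are not taken from the tuning guide.

```shell
#!/bin/bash
# Hypothetical helper: print a random delay in the range [0, max_seconds).
# $RANDOM only goes up to 32767, so keep max_seconds at or below that.
random_delay() {
    local max_seconds=${1:-1800}
    echo $(( RANDOM % max_seconds ))
}

# Example use inside a guest's daily cron script (illustrative only):
#   sleep "$(random_delay 1800)"
#   /usr/sbin/logrotate /etc/logrotate.conf
```

Each guest then starts its rotation at a different point inside the window, so the host sees the I/O spread out rather than arriving all at once. A purely random delay can still collide now and then; a fixed per-guest offset avoids that.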
Steven Ellis wrote:
It recommends not running scheduled jobs simultaneously across guests, and suggests using a random sleep to spread them out.
Does anyone else have suggestions on reducing the impact of cron/logrotate?
Set up a syslog server and have all your machines send their logs there instead of keeping them locally on each machine.
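As a rough sketch of what that could look like with the stock sysklogd on CentOS: the hostname `loghost.example.com` is a placeholder, and the exact files may differ on your systems.

```
# /etc/syslog.conf on each guest: forward everything to the central server
# (loghost.example.com is a placeholder name)
*.*    @loghost.example.com

# /etc/sysconfig/syslog on the log server: -r lets syslogd accept remote messages
SYSLOGD_OPTIONS="-m 0 -r"
```

To actually cut the local disk writes, the guests' existing rules that log to files under /var/log would also need to be removed or commented out; otherwise each message is written locally as well as forwarded.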
On Wed, Mar 10, 2010 at 5:28 PM, Steven Ellis steven.ellis@bulletin.net wrote:
It recommends not running scheduled jobs simultaneously across guests, and suggests using a random sleep to spread them out.
Does anyone else have suggestions on reducing the impact of cron/logrotate?
I ran into this issue as well on a box running Xen with local storage.
My solution was to modify /etc/crontab to run /etc/cron.weekly at different times for each guest and for the dom0. I modified the entry on each VM to be 10 minutes after the previous one and have not seen any load spikes since then.
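A small sketch of how the staggered entries could be generated. The function name `weekly_entry` and the per-machine index are hypothetical, and the 04:22 base time is assumed to be the stock CentOS `cron.weekly` slot; check your own /etc/crontab before relying on it.

```shell
#!/bin/bash
# Print an /etc/crontab line for cron.weekly, shifted by 10 minutes per machine.
# guest_index is a made-up per-machine number: 0 for the dom0/host,
# 1 for the first guest, 2 for the second, and so on.
# Assumes few enough machines that the shifted time stays within the same day.
weekly_entry() {
    local guest_index=$1
    local base=$(( 4 * 60 + 22 ))                 # 04:22 in minutes (assumed default)
    local total=$(( base + guest_index * 10 ))    # add 10 minutes per machine
    printf '%d %d * * 0 root run-parts /etc/cron.weekly\n' \
        $(( total % 60 )) $(( total / 60 ))
}

# Example: print the line for the fourth guest
#   weekly_entry 4
```

The same shift could presumably be applied to the `cron.daily` line, which is where logrotate normally runs.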
Matt
-- Mathew S. McCarrell Clarkson University '10
mccarrms@gmail.com mccarrms@clarkson.edu 1-518-314-9214
On Thu, Mar 11, 2010 at 11:02:33AM -0800, Mathew S. McCarrell wrote:
It recommends not running scheduled jobs simultaneously across guests, and suggests using a random sleep to spread them out.
I think this is a pretty good suggestion.
Does anyone else have suggestions on reducing the impact of cron/logrotate?
You might also consider increasing the device timeouts on your block devices at the guest level:
echo 120 > /sys/block/sda/device/timeout
and so on for each disk. That, or increase the performance of your storage :)
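To cover every disk in a guest rather than one at a time, something like the sketch below could work. The function name `set_disk_timeouts` and its optional second argument are made up for illustration; the argument exists only so the loop can be tried against a scratch directory before touching the real /sys/block.

```shell
#!/bin/bash
# Set the command timeout (in seconds) on every sd* disk.
# Usage: set_disk_timeouts SECONDS [SYS_BLOCK_DIR]
set_disk_timeouts() {
    local seconds=$1
    local sys_block=${2:-/sys/block}
    local f
    for f in "$sys_block"/sd*/device/timeout; do
        # Skip when the glob matched nothing or the file is not writable
        [ -w "$f" ] || continue
        echo "$seconds" > "$f"
    done
}

# Example (as root inside the guest):
#   set_disk_timeouts 120
```

Two caveats: the value does not survive a reboot, so it would need to be reapplied at boot (for example from rc.local), and the file is only present for disks that appear as sd* devices, so a guest whose disk shows up as hda may not have it.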
Ray