[CentOS] how to optimize CentOS XEN dom0?

Fri Feb 25 09:29:01 UTC 2011
Rudi Ahlers <Rudi at SoftDux.com>

On Wed, Feb 23, 2011 at 4:33 PM, Ross Walker <rswwalker at gmail.com> wrote:
> On Feb 23, 2011, at 3:42 AM, Rudi Ahlers <Rudi at SoftDux.com> wrote:
>
>> On Wed, Feb 23, 2011 at 9:06 AM, yonatan pingle
>> <yonatan.pingle at gmail.com> wrote:
>>> You should have a look at your disk I/O status.
>>>
>>> Try iostat -dx 5 to see the disk utilization info over time.
>>> When it comes to slowdowns in a virtual environment on a desktop-grade
>>> machine, I suspect disk I/O latency and bottlenecks as the cause.
>>
>> Thanks. I don't know how to interpret the results (yet), but here's the
>> current output:
>>
>> Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>
> Knowing the columns helps here:
>
> rrqm/s and wrqm/s: read/write requests merged per second; shows how well the I/O scheduler is merging contiguous I/O operations
>
> r/s and w/s: read/write I/O operations per second
>
> rsec/s and wsec/s: sectors read/written per second; I usually use the -k option so it displays kilobytes per second instead
>
> avgrq-sz: average request size in sectors; I wish it separated reads from writes, but oh well
>
> avgqu-sz: average number of I/O operations queued for the device; smaller is better
>
> await: average time in ms an I/O operation took to complete, counting both the time it waited in the queue and the time it was being serviced; again, smaller is better
>
> svctm: average time in ms the drive took to service an I/O operation, from when it left the queue to when a result was returned
>
> %util: percentage of time the device was busy servicing I/O; values close to 100% mean the drive is saturated
>
> For lockups, though, I'd look at dmesg and the Xen log; "xm log", I think, is the command.
>
> The number one reason for lockups though is most likely memory contention between domUs and dom0.
>
> What are you running in dom0? What are your memory reservations like?
>
> -Ross
> _______________________________________________
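
For anyone who wants to check these columns programmatically, here is a minimal Python sketch. It splits one device line of "iostat -dx" output into the columns listed in the header above and cross-checks them against each other. The sample line and the function names are made up for illustration (they are not output from this host), and the relationships used (Little's law for queue length, throughput times svctm for utilization) are approximations only.

```python
# Column order as shown in the "iostat -dx" header quoted above.
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rsec/s", "wsec/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]


def parse_iostat_line(line):
    """Split one device line into (device_name, {column: value})."""
    fields = line.split()
    device = fields[0]
    values = [float(v) for v in fields[1:]]
    if len(values) != len(COLUMNS):
        raise ValueError("expected %d columns, got %d"
                         % (len(COLUMNS), len(values)))
    return device, dict(zip(COLUMNS, values))


def derived_metrics(stats):
    """Cross-check the reported columns against each other (approximate)."""
    iops = stats["r/s"] + stats["w/s"]
    return {
        "iops": iops,
        # Little's law: queue length ~ arrival rate * time in system (ms -> s).
        "est_queue": iops * stats["await"] / 1000.0,
        # Busy fraction ~ arrival rate * service time, as a percentage.
        "est_util_pct": min(100.0, iops * stats["svctm"] / 1000.0 * 100.0),
        # Part of await spent queued rather than being serviced.
        "queue_wait_ms": max(0.0, stats["await"] - stats["svctm"]),
    }


# Hypothetical sample line, NOT real output from the host in this thread.
SAMPLE = "sda 0.00 12.00 40.00 60.00 640.00 960.00 16.00 2.50 25.00 8.00 80.00"

if __name__ == "__main__":
    device, stats = parse_iostat_line(SAMPLE)
    print(device, derived_metrics(stats))
```

A large gap between await and svctm means requests spend most of their time queued rather than being serviced, which is the pattern to look for when a disk is the bottleneck.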


I see a lot of these errors in /var/log/messages shortly before it crashed:



Feb 22 15:27:14 zaxen01 kernel: HighMem: empty
Feb 22 15:27:14 zaxen01 kernel: 918 pagecache pages
Feb 22 15:27:14 zaxen01 kernel: Swap cache: add 2248198, delete 2248009, find 160685591/160898897, race 0+453
Feb 22 15:27:14 zaxen01 kernel: Free swap  = 0kB
Feb 22 15:27:14 zaxen01 kernel: Total swap = 4194296kB
Feb 22 15:27:14 zaxen01 kernel: Free swap:            0kB
Feb 22 15:27:14 zaxen01 kernel: 133120 pages of RAM
Feb 22 15:27:14 zaxen01 kernel: 22818 reserved pages
Feb 22 15:27:16 zaxen01 kernel: 105840 pages shared
Feb 22 15:27:16 zaxen01 kernel: 189 pages swap cached
Feb 22 15:27:17 zaxen01 kernel: Out of memory: Killed process 17464, UID 99, (sendmail).
Feb 23 00:35:38 zaxen01 syslogd 1.4.1: restart.
Feb 23 00:35:38 zaxen01 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Feb 23 00:35:38 zaxen01 kernel: Bootdata ok (command line is ro root=/dev/System/root rhgb quiet xencons=tty6)
Feb 23 00:35:38 zaxen01 kernel: Linux version 2.6.18-194.32.1.el5xen (mockbuild at builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Wed Jan 5 18:44:24 EST 2011
Feb 23 00:35:38 zaxen01 kernel: BIOS-provided physical RAM map:
Feb 23 00:35:38 zaxen01 kernel:  Xen: 0000000000000000 - 0000000020800000 (usable)
Feb 23 00:35:38 zaxen01 kernel: DMI 2.4 present.
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
Feb 23 00:35:38 zaxen01 kernel: ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
Feb 23 00:35:38 zaxen01 kernel: IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
Feb 23 00:35:38 zaxen01 kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
Feb 23 00:35:38 zaxen01 kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Feb 23 00:35:38 zaxen01 kernel: Setting APIC routing to xen
Feb 23 00:35:38 zaxen01 kernel: Using ACPI (MADT) for SMP configuration information
Feb 23 00:35:38 zaxen01 kernel: Allocating PCI resources starting at d4000000 (gap: d0000000:2ff00000)
Feb 23 00:35:38 zaxen01 kernel: Built 1 zonelists.  Total pages: 133120
Feb 23 00:35:38 zaxen01 kernel: Kernel command line: ro root=/dev/System/root rhgb quiet xencons=tty6
Feb 23 00:35:38 zaxen01 kernel: Initializing CPU#0
Feb 23 00:35:38 zaxen01 kernel: PID hash table entries: 4096 (order: 12, 32768 bytes)
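
To put numbers on the log above, here is a rough Python sketch that converts the page counts and swap figures into MiB and a swap usage percentage. It assumes the standard 4 KiB x86 page size, which the log itself does not state; the function names are only illustrative, and the sample lines are copied from the output above.

```python
import re

# Assumption: standard 4 KiB pages on x86; the log does not state the page size.
PAGE_SIZE_KIB = 4

# Lines copied from the /var/log/messages excerpt above.
LOG = """\
Feb 22 15:27:14 zaxen01 kernel: Free swap  = 0kB
Feb 22 15:27:14 zaxen01 kernel: Total swap = 4194296kB
Feb 22 15:27:14 zaxen01 kernel: 133120 pages of RAM
Feb 22 15:27:14 zaxen01 kernel: 22818 reserved pages
"""


def grab(pattern, text):
    """Return the first captured integer for pattern, or raise if absent."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError("pattern not found: %s" % pattern)
    return int(match.group(1))


def memory_summary(text):
    """Convert the kernel's page and swap counters into readable figures."""
    ram_pages = grab(r"(\d+) pages of RAM", text)
    reserved_pages = grab(r"(\d+) reserved pages", text)
    total_swap_kb = grab(r"Total swap\s*=\s*(\d+)kB", text)
    free_swap_kb = grab(r"Free swap\s*=\s*(\d+)kB", text)
    return {
        "ram_mib": ram_pages * PAGE_SIZE_KIB / 1024.0,
        "reserved_mib": reserved_pages * PAGE_SIZE_KIB / 1024.0,
        "swap_used_pct": 100.0 * (total_swap_kb - free_swap_kb) / total_swap_kb,
    }


if __name__ == "__main__":
    print(memory_summary(LOG))
```

Under that assumption, 133120 pages works out to 520 MiB of RAM visible to dom0, which matches the 0x20800000 upper bound in the Xen RAM map, and the 4 GB of swap is completely used. That looks consistent with the OOM kill and with memory pressure in dom0.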






@Ross,

dom0 is a Xen host for Cloudmin, so it runs Apache + Webmin, MySQL and Exim.

-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532