Hi,
I have a problematic CentOS Xen server and hope someone can point me in the right direction to optimize it a bit.
The server runs on a Core2Quad Q9300 with 8GB RAM (the max the motherboard can take; 1U chassis) on an Intel motherboard with a 1TB SATA HDD.
dom0 is set to a 512MB limit, with a few small Xen VMs running:
root@zaxen01:[~]$ xm list
Name                ID Mem(MiB) VCPUs State   Time(s)
Domain-0             0      512     4 r-----     96.5
actionco.vm          3     1519     1 -b----     14.8
byracers.vm          4      511     1 -b----     85.7
ns1                  5      511     1 -b----     22.3
picturestravel       6      255     1 -b----     13.3
rafttheworld         7      255     1 -b----     11.3
zafepres.vm          8      511     1 -b----     19.0
The server itself seems to eat up a lot of resources:
root@zaxen01:[~]$ free -m
             total       used       free     shared    buffers     cached
Mem:           512        472         39          0         13        215
-/+ buffers/cache:        244        268
Swap:         4095          0       4095
Yet it only has Xen, Webmin (since it's a CloudMin Xen server), Exim, Apache, and a few other services running:
root@zaxen01:[~]$ chkconfig --list |grep "3:on" |awk '{print $1}' |sort
acpid auditd crond csf dhcpd exim haldaemon httpd iptables iscsi iscsid
kudzu lfd lvm2-monitor mdmonitor network qemu restorecond setroubleshoot
smartd snmpd sshd syslog sysstat webmin xend xendomains
Is there anything I can optimize on such a server?
The server runs CentOS 5.5 x64:
root@zaxen01:[~]$ cat /etc/redhat-release
CentOS release 5.5 (Final)
root@zaxen01:[~]$ uname -a
Linux zaxen01.softdux.com 2.6.18-194.32.1.el5xen #1 SMP Wed Jan 5 18:44:24 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
with Xen version 3.1.2-194.32.1.el5
And here's the xm dmesg output:
Xen version 3.1.2-194.32.1.el5 (mockbuild@centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) Wed Jan 5 17:43:03 EST 2011 Latest ChangeSet: unavailable
(XEN) Command line: dom0_mem=512M
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN)  Found 1 MBR signatures
(XEN)  Found 1 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 000000000008f000 (usable)
(XEN)  000000000008f000 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 00000000cf53f000 (usable)
(XEN)  00000000cf53f000 - 00000000cf54b000 (reserved)
(XEN)  00000000cf54b000 - 00000000cf620000 (usable)
(XEN)  00000000cf620000 - 00000000cf6e8000 (ACPI NVS)
(XEN)  00000000cf6e8000 - 00000000cf6ec000 (usable)
(XEN)  00000000cf6ec000 - 00000000cf6f1000 (ACPI data)
(XEN)  00000000cf6f1000 - 00000000cf6f2000 (usable)
(XEN)  00000000cf6f2000 - 00000000cf6ff000 (ACPI data)
(XEN)  00000000cf6ff000 - 00000000cf700000 (usable)
(XEN)  00000000cf700000 - 00000000d0000000 (reserved)
(XEN)  00000000fff00000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000230000000 (usable)
(XEN) System RAM: 8181MB (8378020kB)
(XEN) Xen heap: 13MB (13720kB)
(XEN) Domain heap initialised: DMA width 32 bits
(XEN) Processor #0 7:7 APIC version 20
(XEN) Processor #2 7:7 APIC version 20
(XEN) Processor #1 7:7 APIC version 20
(XEN) Processor #3 7:7 APIC version 20
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(XEN) Enabling APIC mode: Flat. Using 1 I/O APICs
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2485.797 MHz processor.
(XEN) HVM: VMX enabled
(XEN) VMX: MSR intercept bitmap enabled
(XEN) I/O virtualisation disabled
(XEN) CPU0: Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz stepping 07
(XEN) Booting processor 1/2 eip 90000
(XEN) CPU1: Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz stepping 07
(XEN) Booting processor 2/1 eip 90000
(XEN) CPU2: Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz stepping 07
(XEN) Booting processor 3/3 eip 90000
(XEN) CPU3: Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz stepping 07
(XEN) Total of 4 processors activated.
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using new ACK method
(XEN) Platform timer overflows in 2 jiffies.
(XEN) Platform timer is 1.193MHz PIT
(XEN) Brought up 4 CPUs
(XEN) *** LOADING DOMAIN 0 ***
(XEN) elf_parse_binary: phdr: paddr=0xffffffff80200000 memsz=0x2f4d70
(XEN) elf_parse_binary: phdr: paddr=0xffffffff804f4d80 memsz=0x14c510
(XEN) elf_parse_binary: phdr: paddr=0xffffffff80642000 memsz=0xc08
(XEN) elf_parse_binary: phdr: paddr=0xffffffff80644000 memsz=0x11be8c
(XEN) elf_parse_binary: memory: 0xffffffff80200000 -> 0xffffffff8075fe8c
(XEN) elf_xen_parse_note: GUEST_OS = "linux"
(XEN) elf_xen_parse_note: GUEST_VERSION = "2.6"
(XEN) elf_xen_parse_note: XEN_VERSION = "xen-3.0"
(XEN) elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
(XEN) elf_xen_parse_note: PADDR_OFFSET = 0xffffffff80000000
(XEN) elf_xen_parse_note: ENTRY = 0xffffffff80200000
(XEN) elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff80206000
(XEN) elf_xen_parse_note: FEATURES = "writable_page_tables|writable_descriptor_tables|auto_translated_physmap|pae_pgdir_above_4gb|supervisor_mode_kernel"
(XEN) elf_xen_parse_note: LOADER = "generic"
(XEN) elf_xen_addr_calc_check: addresses:
(XEN)     virt_base        = 0xffffffff80000000
(XEN)     elf_paddr_offset = 0xffffffff80000000
(XEN)     virt_offset      = 0x0
(XEN)     virt_kstart      = 0xffffffff80200000
(XEN)     virt_kend        = 0xffffffff8075fe8c
(XEN)     virt_entry       = 0xffffffff80200000
(XEN) Xen kernel: 64-bit, lsb, compat32
(XEN) Dom0 kernel: 64-bit, lsb, paddr 0xffffffff80200000 -> 0xffffffff8075fe8c
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.: 0000000222000000->0000000224000000 (122880 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
2011/2/23 Rudi Ahlers Rudi@softdux.com:
Hi,
I have a problematic CentOS Xen server and hope someone can point me in the right direction to optimize it a bit.
(SNIP)
The server itself seems to eat up a lot of resources:
root@zaxen01:[~]$ free -m
             total       used       free     shared    buffers     cached
Mem:           512        472         39          0         13        215
-/+ buffers/cache:        244        268
Swap:         4095          0       4095
244MB RAM in use and 0MB swap...looks good to me.
Is there anything I can optimize on such a server?
It's hard to give any advice without further information about what the problem is.
Best regards
Kenni
On Wed, Feb 23, 2011 at 1:37 AM, Kenni Lund kenni@kelu.dk wrote:
2011/2/23 Rudi Ahlers Rudi@softdux.com:
Hi,
I have a problematic CentOS Xen server and hope someone can point me in the right direction to optimize it a bit.
(SNIP)
The server itself seems to eat up a lot of resources:
root@zaxen01:[~]$ free -m
             total       used       free     shared    buffers     cached
Mem:           512        472         39          0         13        215
-/+ buffers/cache:        244        268
Swap:         4095          0       4095
244MB RAM in use and 0MB swap...looks good to me.
Well, I just sent a tech over to reset the server since it was locked up. He couldn't log in on the console or via SSH, and had to reset it.
Is there anything I can optimize on such a server?
It's hard to give any advice without further information about what the problem is.
Fair enough, what other info can I give you?
On 23/02/11 12:29, Rudi Ahlers wrote:
Hi,
I have a problematic CentOS Xen server and hope someone can point me in the right direction to optimize it a bit.
The server runs on a Core2Quad Q9300 with 8GB RAM (the max the motherboard can take; 1U chassis) on an Intel motherboard with a 1TB SATA HDD.
dom0 is set to a 512MB limit, with a few small Xen VMs running:
root@zaxen01:[~]$ xm list
Name                ID Mem(MiB) VCPUs State   Time(s)
Domain-0             0      512     4 r-----     96.5
dom0 is responsible for IO, so you would normally expect it to spend more time on the CPU. You could try pinning it to its own CPU...
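For example, something along these lines (an untested sketch; assumes the standard xm toolstack, and that CPU 0 is the one you want to reserve for dom0):

  # pin dom0's first vcpu to physical cpu 0
  xm vcpu-pin Domain-0 0 0
  # verify the placement
  xm vcpu-list

Then give each guest something like cpus = "1-3" in its config file, so the domUs stay off CPU 0.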
actionco.vm          3     1519     1 -b----     14.8
byracers.vm          4      511     1 -b----     85.7
ns1                  5      511     1 -b----     22.3
picturestravel       6      255     1 -b----     13.3
rafttheworld         7      255     1 -b----     11.3
zafepres.vm          8      511     1 -b----     19.0
The server itself seems to eat up a lot of resources:
root@zaxen01:[~]$ free -m
             total       used       free     shared    buffers     cached
Mem:           512        472         39          0         13        215
-/+ buffers/cache:        244        268
Swap:         4095          0       4095
This looks normal. Remember, Linux uses a "memory full" model, so although it appears that there is only 39MB of real memory available, there is actually 268MB, with most of the rest being used to cache filesystem data.
If you want to see how loaded a system is with respect to memory pressure, try using 'vmstat' and look at how often it is swapping pages into and out of swap.
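For example (the si/so columns are swap-in/swap-out per second; sustained non-zero values there indicate real memory pressure):

  # sample every 5 seconds, 10 samples
  vmstat 5 10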
Also, have a look at the Xen users guide. It has some performance-enhancing tips that you should be aware of. In particular, realise that dom0 is a little special, particularly with regard to IO.
Is there anything I can optimize on such a server?
Not entirely sure what you need to optimise at this point. So far I see a reasonably normal-looking system (although, to be frank, I don't have a lot of experience with Xen at present).
Hope it helps, Cameron
Are they paravirt or HVM guests? qemu might have something to do with it if HVM guests are involved.
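One quick way to check, assuming the standard xm tools: dump a domain's config and look at its image section, e.g.

  xm list -l byracers.vm | grep -A1 image

A paravirt guest shows (image (linux ...)) while an HVM guest shows (image (hvm ...)).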
----- Original Message ----
From: Rudi Ahlers Rudi@SoftDux.com
To: CentOS mailing list centos@centos.org
Sent: Tue, 22 February, 2011 23:29:29
Subject: [CentOS] how to optimize CentOS XEN dom0?
(SNIP)
On Wed, Feb 23, 2011 at 1:41 AM, Ian Murray murrayie@yahoo.co.uk wrote:
Are they paravirt or HVM guests? qemu might have something to do with it if HVM guests are involved.
Uhm, I know that I should know this, but how do I tell at a quick glance? It's almost 2am here, and I'm a bit too tired to think straight right now. I've been reading up on a lot of forums and other Google search results before I posted here.
The VMs were originally created with HyperVM, but were then imported into CloudMin.
You should have a look at your disk I/O status.
Try iostat -dx 5 to see the disk utilization info over time. When it comes to slowdowns in a virtual environment on a desktop-grade machine, I suspect disk I/O latency and bottlenecks as the cause.
Check that your disk is running in its optimal state: look at some indicators, such as the I/O utilization averages and server load averages. hddtemp /dev/sda will check for overheating (under high load it might run hot).
In any case, you've still got plenty of RAM to spare.
On Wed, Feb 23, 2011 at 1:46 AM, Rudi Ahlers Rudi@softdux.com wrote:
(SNIP)
On Wed, Feb 23, 2011 at 9:06 AM, yonatan pingle yonatan.pingle@gmail.com wrote:
You should have a look at your disk I/O status.
Try iostat -dx 5 to see the disk utilization info over time. When it comes to slowdowns in a virtual environment on a desktop-grade machine, I suspect disk I/O latency and bottlenecks as the cause.
Thanx, I don't know how to interpret the results (yet), but here's the current output:
Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00   27.20   0.00   6.80    0.00  448.00     65.88      0.00   0.59   0.35   0.24
sda1       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda2       0.00   27.20   0.00   6.80    0.00  448.00     65.88      0.00   0.59   0.35   0.24
dm-0       0.00    0.00   0.00  27.80    0.00  222.40      8.00      0.01   0.35   0.09   0.24
dm-1       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-2       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-3       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-4       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-5       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-6       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-7       0.00    0.00   0.00   0.40    0.00    6.40     16.00      0.00   0.00   0.00   0.00
dm-8       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-9       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-10      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-11      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-12      0.00    0.00   0.00   2.80    0.00   97.60     34.86      0.00   0.00   0.00   0.00
dm-13      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-14      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-15      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-16      0.00    0.00   0.00   3.00    0.00  121.60     40.53      0.00   0.00   0.00   0.00
dm-17      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
Although most of those values change all the time; for example:
Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00    1.00   0.00   0.80    0.00   17.60     22.00      0.00   0.00   0.00   0.00
sda1       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda2       0.00    1.00   0.00   0.80    0.00   17.60     22.00      0.00   0.00   0.00   0.00
dm-0       0.00    0.00   0.00   1.40    0.00   11.20      8.00      0.00   0.00   0.00   0.00
dm-1       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-2       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-3       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-4       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-5       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-6       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-7       0.00    0.00   0.00   0.40    0.00    6.40     16.00      0.00   0.00   0.00   0.00
dm-8       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-9       0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-10      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-11      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-12      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-13      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-14      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-15      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-16      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
dm-17      0.00    0.00   0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
Check that your disk is running in its optimal state: look at some indicators, such as the I/O utilization averages and server load averages. hddtemp /dev/sda will check for overheating (under high load it might run hot).
In any case, you've still got plenty of RAM to spare.
The HDD temp is 52 degrees Celsius, and according to the SNMP stats it's been between 48 & 54 on average for the past 3 months now. But every other server that I just checked is between 16 & 38 degrees, so this is the hottest machine, which means I need to add some extra cooling to it. It's a 1U chassis that we "inherited" when we acquired another company a while ago, and it was (still is, to some degree) the plan to move everyone to a new server, so we keep it alive for now.
root@zaxen01:[~]$ smartctl -l scttemp /dev/sda
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SCT Status Version:                  3
SCT Version (vendor specific):       522 (0x020a)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                 52 Celsius
Power Cycle Min/Max Temperature:     50/53 Celsius
Lifetime Min/Max Temperature:        22/60 Celsius
Under/Over Temperature Limit Count:  0/594
SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:     0/ 0 Celsius
Min/Max Temperature Limit:           0/ 0 Celsius
Temperature History Size (Index):    128 (24)

Index    Estimated Time    Temperature Celsius
  25    2011-02-23 08:33     52  *********************************
 ...    ..( 7 skipped).      ..  *********************************
  33    2011-02-23 08:41     52  *********************************
  34    2011-02-23 08:42     53  **********************************
  35    2011-02-23 08:43     52  *********************************
  36    2011-02-23 08:44     52  *********************************
  37    2011-02-23 08:45     52  *********************************
  38    2011-02-23 08:46     53  **********************************
  39    2011-02-23 08:47     52  *********************************
 ...    ..( 2 skipped).      ..  *********************************
  42    2011-02-23 08:50     52  *********************************
  43    2011-02-23 08:51     53  **********************************
  44    2011-02-23 08:52     52  *********************************
 ...    ..(107 skipped).     ..  *********************************
  24    2011-02-23 10:40     52  *********************************
On Feb 23, 2011, at 3:42 AM, Rudi Ahlers Rudi@SoftDux.com wrote:
On Wed, Feb 23, 2011 at 9:06 AM, yonatan pingle yonatan.pingle@gmail.com wrote:
You should have a look at your disk I/O status.
Try iostat -dx 5 to see the disk utilization info over time. When it comes to slowdowns in a virtual environment on a desktop-grade machine, I suspect disk I/O latency and bottlenecks as the cause.
Thanx, I don't know how to interpret the results (yet), but here's the current output:
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
Knowing the columns helps here,
rrqm/s and wrqm/s: read/write requests merged per second; shows how well the scheduler is merging contiguous IO operations
r/s and w/s: read/write IO operations per second
rsec/s and wsec/s: read/write sectors per second; I usually use the -k option so it displays kilobytes per second instead (see the example after this list)
avgrq-sz: the average request size in the unit of choice, here sectors; I wish it'd separate reads from writes, but oh well
avgqu-sz: the average number of IO operations waiting for service; smaller is better
await: the average time, in ms, an IO operation waited in the queue to be serviced; again, smaller is better
svctm: the time it took to service an IO operation, i.e. how long the drive took to perform it from when it left the queue to when a result was returned
%util: the estimated drive utilization, based on svctm, await and avgqu-sz
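For example, a kilobyte-based run to watch while the guests are busy:

  # extended per-device stats in kB/s, every 5 seconds
  iostat -dxk 5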
For lockups though, I'd look at dmesg and the Xen log; 'xm log' I think is the command.
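On CentOS 5 that would be something along these lines (paths as shipped by the stock xen packages):

  xm dmesg                     # hypervisor boot/runtime messages
  xm log                       # the xend log
  less /var/log/xen/xend.log   # same log, on disk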
The number one reason for lockups though is most likely memory contention between domUs and dom0.
What are you running in dom0? What are your memory reservations like?
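If dom0 is getting ballooned down below its 512MB, that alone can starve it and trigger the kernel OOM killer. A sketch of the two places to pin it (assuming the standard CentOS 5 layout; per your xm dmesg you already have dom0_mem=512M on the Xen command line, so the xend side is the one to double-check):

  # /boot/grub/grub.conf, on the hypervisor line:
  kernel /xen.gz-2.6.18-194.32.1.el5 dom0_mem=512M

  # /etc/xen/xend-config.sxp, so xend never balloons dom0 below that:
  (dom0-min-mem 512)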
-Ross
On Wed, Feb 23, 2011 at 4:33 PM, Ross Walker rswwalker@gmail.com wrote:
(SNIP)
For lockups though, I'd look at dmesg and the Xen log; 'xm log' I think is the command.
The number one reason for lockups though is most likely memory contention between domUs and dom0.
What are you running in dom0? What are your memory reservations like?
-Ross
I see a lot of these errors in /var/log/messages shortly before it crashed:
Feb 22 15:27:14 zaxen01 kernel: HighMem: empty
Feb 22 15:27:14 zaxen01 kernel: 918 pagecache pages
Feb 22 15:27:14 zaxen01 kernel: Swap cache: add 2248198, delete 2248009, find 160685591/160898897, race 0+453
Feb 22 15:27:14 zaxen01 kernel: Free swap = 0kB
Feb 22 15:27:14 zaxen01 kernel: Total swap = 4194296kB
Feb 22 15:27:14 zaxen01 kernel: Free swap: 0kB
Feb 22 15:27:14 zaxen01 kernel: 133120 pages of RAM
Feb 22 15:27:14 zaxen01 kernel: 22818 reserved pages
Feb 22 15:27:16 zaxen01 kernel: 105840 pages shared
Feb 22 15:27:16 zaxen01 kernel: 189 pages swap cached
Feb 22 15:27:17 zaxen01 kernel: Out of memory: Killed process 17464, UID 99, (sendmail).
Feb 23 00:35:38 zaxen01 syslogd 1.4.1: restart.
Feb 23 00:35:38 zaxen01 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Feb 23 00:35:38 zaxen01 kernel: Bootdata ok (command line is ro root=/dev/System/root rhgb quiet xencons=tty6)
Feb 23 00:35:38 zaxen01 kernel: Linux version 2.6.18-194.32.1.el5xen (mockbuild@builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Wed Jan 5 18:44:24 EST 2011
Feb 23 00:35:38 zaxen01 kernel: BIOS-provided physical RAM map:
Feb 23 00:35:38 zaxen01 kernel: Xen: 0000000000000000 - 0000000020800000 (usable)
Feb 23 00:35:38 zaxen01 kernel: DMI 2.4 present.
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
Feb 23 00:35:38 zaxen01 kernel: ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
Feb 23 00:35:38 zaxen01 kernel: ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
Feb 23 00:35:38 zaxen01 kernel: IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
Feb 23 00:35:38 zaxen01 kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
Feb 23 00:35:38 zaxen01 kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Feb 23 00:35:38 zaxen01 kernel: Setting APIC routing to xen
Feb 23 00:35:38 zaxen01 kernel: Using ACPI (MADT) for SMP configuration information
Feb 23 00:35:38 zaxen01 kernel: Allocating PCI resources starting at d4000000 (gap: d0000000:2ff00000)
Feb 23 00:35:38 zaxen01 kernel: Built 1 zonelists. Total pages: 133120
Feb 23 00:35:38 zaxen01 kernel: Kernel command line: ro root=/dev/System/root rhgb quiet xencons=tty6
Feb 23 00:35:38 zaxen01 kernel: Initializing CPU#0
Feb 23 00:35:38 zaxen01 kernel: PID hash table entries: 4096 (order: 12, 32768 bytes)
@Ross,
dom0 is a Xen host for CloudMin, so it runs Apache + Webmin, MySQL & Exim.
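Given the OOM kill in the log above, a quick way to see what's actually eating dom0's 512MB (assuming GNU ps, as shipped on CentOS 5):

  # top memory consumers in dom0, by resident size
  ps aux --sort=-rss | head -15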
The server runs on a Core2Quad Q9300 with 8GB RAM (the max the motherboard can take; 1U chassis) on an Intel motherboard with a 1TB SATA HDD.
dom0 is set to a 512MB limit, with a few small Xen VMs running:
root@zaxen01:[~]$ xm list
Name                ID Mem(MiB) VCPUs State   Time(s)
Domain-0             0      512     4 r-----     96.5
actionco.vm          3     1519     1 -b----     14.8
byracers.vm          4      511     1 -b----     85.7
ns1                  5      511     1 -b----     22.3
picturestravel       6      255     1 -b----     13.3
rafttheworld         7      255     1 -b----     11.3
zafepres.vm          8      511     1 -b----     19.0
...
What are the actual symptoms you are seeing?
In general I have found tuning the disk scheduler and the Xen guest scheduler to be helpful:
http://wiki.xensource.com/xenwiki/CreditScheduler http://www.cyberciti.biz/faq/linux-change-io-scheduler-for-harddisk/
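For instance (a sketch; sda and the weight value are just examples): the IO scheduler can be switched at runtime, and the credit scheduler weight can be raised for dom0 so it wins CPU time when the guests are busy:

  # switch sda to the deadline elevator
  echo deadline > /sys/block/sda/queue/scheduler
  # give dom0 twice the default credit-scheduler weight (default is 256)
  xm sched-credit -d Domain-0 -w 512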
Also, I always recommend building Xen servers with SAS drives rather than SATA, because SATA is half duplex while SAS is full duplex, meaning that under heavier or more random IO you will see better throughput. In my experience I see almost double the performance when using SAS over SATA, but our environments are IO-heavy and may not reflect the realities of yours.
I would also suggest running disk IO stats in the VMs simultaneously while running iostat or vmstat in Dom0, to get a good read on where the bottlenecks really are. I actually prefer the postmark utility, as it is relatively simple and avoids disk caching issues which would otherwise skew your results.