Hi folks
I deployed two Dell PowerEdge T300 servers to test virtualization with KVM + DRBD + Heartbeat.
KVM, DRBD, and Heartbeat all work properly.
However, I have a doubt.
When the primary node goes down, the secondary node starts the VM that was originally running on the primary node. So this requires a full stop of the whole system, which is not what we want here.
Is there some way to live-migrate a VM away from a primary node that has shut down?
I have no idea how to make this work.
Thanks for any help
I have been scratching my head on this for days. The xendomains service just doesn't want to start at boot, it seems, so I don't get my auto domUs up without running "service xendomains start", after which they all start.
chkconfig looks correct; I have checked xm dmesg and dmesg, and turned off selinux. The only "clue" I have is that the xend.log startup looks different from a fairly similar machine's, and I don't quite understand what it might be saying. Is dom0 crashing and restarting at machine bootup?
I have only one domU in ../auto to keep this simpler; its name is "v22c54". I have one other anomaly: smartd is also not starting at boot, but apparently runs fine when started manually.
=== xend.log boot up ===
[2009-11-30 08:40:53 xend 3466] INFO (SrvDaemon:283) Xend Daemon started
[2009-11-30 08:40:53 xend 3466] INFO (SrvDaemon:287) Xend changeset: unavailable.
[2009-11-30 08:40:53 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:228) XendDomainInfo.recreate({'paused': 0, 'cpu_time': 19493383087L, 'ssidref': 0, 'hvm': 0, 'shutdown_reason': 0, 'dying': 0, 'mem_kb': 1048576L, 'domid': 0, 'max_vcpu_id': 3, 'crashed': 0, 'running': 1, 'maxmem_kb': 17179869180L, 'shutdown': 0, 'online_vcpus': 4, 'handle': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'blocked': 0})
[2009-11-30 08:40:53 xend.XendDomainInfo 3466] INFO (XendDomainInfo:240) Recreating domain 0, UUID 00000000-0000-0000-0000-000000000000.
[2009-11-30 08:40:53 xend.XendDomainInfo 3466] WARNING (XendDomainInfo:262) No vm path in store for existing domain 0
[2009-11-30 08:40:53 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:992) Storing VM details: {'shadow_memory': '0', 'uuid': '00000000-0000-0000-0000-000000000000', 'on_reboot': 'restart', 'on_poweroff': 'destroy', 'name': 'Domain-0', 'xend/restart_count': '0', 'vcpus': '4', 'vcpu_avail': '15', 'memory': '1024', 'on_crash': 'restart', 'maxmem': '1024'}
[2009-11-30 08:40:53 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:1027) Storing domain details: {'cpu/1/availability': 'online', 'cpu/3/availability': 'online', 'name': 'Domain-0', 'console/limit': '4194304', 'cpu/2/availability': 'online', 'vm': '/vm/00000000-0000-0000-0000-000000000000', 'domid': '0', 'cpu/0/availability': 'online', 'memory/target': '1048576'}
[2009-11-30 08:40:53 xend 3466] DEBUG (XendDomain:163) number of vcpus to use is 2
[2009-11-30 08:40:53 xend 3466] INFO (SrvServer:116) unix path=/var/lib/xend/xend-socket
[2009-11-30 08:40:53 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:1249) XendDomainInfo.handleShutdownWatch
== after manual xendomains start ==
[root@localhost ~]# service xendomains start
Restoring Xen domains: v22c54.
Starting auto Xen domains: v22c54(skip)[done]   [ OK ]
~~ xend.log cont'd from above point ~~
[2009-11-30 11:17:15 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:287) XendDomainInfo.restore(['domain', ['domid', '3'], ['uuid', 'a3199faf-edb4-42e5-bea1-01f2df77a47f'], ['vcpus', '1'], ['vcpu_avail', '1'], ['cpu_cap', '0'], ['cpu_weight', '256.0'], ['memory', '512'], ['shadow_memory', '0'], ['maxmem', '512'], ['bootloader', '/usr/bin/pygrub'], ['features'], ['name', 'v22c54'], ['on_poweroff', 'destroy'], ['on_reboot', 'restart'], ['on_crash', 'restart'], ['image', ['linux', ['ramdisk', '/var/lib/xen/boot_ramdisk.yFE7zn'], ['kernel', '/var/lib/xen/boot_kernel.bnNF6O'], ['args', 'ro root=/dev/vgcentos00/root']]], ['cpus', []], ['device', ['vif', ['backend', '0'], ['script', 'vif-bridge'], ['bridge', 'xenbr1'], ['mac', '00:16:36:41:76:ae']]], ['device', ['tap', ['backend', '0'], ['dev', 'xvda:disk'], ['uname', 'tap:aio:/var/lib/xen/images/vms/v22c54'], ['mode', 'w']]], ['device', ['vkbd', ['backend', '0']]], ['device', ['vfb', ['backend', '0'], ['type', 'vnc'], ['vncunused', '1'], ['xauthority', '/root/.Xauthority'], ['keymap', 'en-us']]], ['state', '-b----'], ['shutdown_reason', 'poweroff'], ['cpu_time', '0.008262668'], ['online_vcpus', '1'], ['up_time', '305.694555044'], ['start_time', '1259414461.79'], ['store_mfn', '1875035'], ['console_mfn', '2193022']])
[2009-11-30 11:17:15 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:328) parseConfig: config is ['domain', ['domid', '3'], ['uuid', 'a3199faf-edb4-42e5-bea1-01f2df77a47f'], ['vcpus', '1'], ['vcpu_avail', '1'], ['cpu_cap', '0'], ['cpu_weight', '256.0'], ['memory', '512'], ['shadow_memory', '0'], ['maxmem', '512'], ['bootloader', '/usr/bin/pygrub'], ['features'], ['name', 'v22c54'], ['on_poweroff', 'destroy'], ['on_reboot', 'restart'], ['on_crash', 'restart'], ['image', ['linux', ['ramdisk', '/var/lib/xen/boot_ramdisk.yFE7zn'], ['kernel', '/var/lib/xen/boot_kernel.bnNF6O'], ['args', 'ro root=/dev/vgcentos00/root']]], ['cpus', []], ['device', ['vif', ['backend', '0'], ['script', 'vif-bridge'], ['bridge', 'xenbr1'], ['mac', '00:16:36:41:76:ae']]], ['device', ['tap', ['backend', '0'], ['dev', 'xvda:disk'], ['uname', 'tap:aio:/var/lib/xen/images/vms/v22c54'], ['mode', 'w']]], ['device', ['vkbd', ['backend', '0']]], ['device', ['vfb', ['backend', '0'], ['type', 'vnc'], ['vncunused', '1'], ['xauthority', '/root/.Xauthority'], ['keymap', 'en-us']]], ['state', '-b----'], ['shutdown_reason', 'poweroff'], ['cpu_time', '0.008262668'], ['online_vcpus', '1'], ['up_time', '305.694555044'], ['start_time', '1259414461.79'], ['store_mfn', '1875035'], ['console_mfn', '2193022']]
[2009-11-30 11:17:15 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:445) parseConfig: result is {'features': None, 'image': ['linux', ['ramdisk', '/var/lib/xen/boot_ramdisk.yFE7zn'], ['kernel', '/var/lib/xen/boot_kernel.bnNF6O'], ['args', 'ro root=/dev/vgcentos00/root']], 'cpus': [], 'vcpu_avail': 1, 'backend': [], 'uuid': 'a3199faf-edb4-42e5-bea1-01f2df77a47f', 'on_reboot': 'restart', 'cpu_weight': 256.0, 'memory': 512, 'cpu_cap': 0, 'localtime': None, 'timer_mode': None, 'start_time': 1259414461.79, 'on_poweroff': 'destroy', 'on_crash': 'restart', 'device': [('vif', ['vif', ['backend', '0'], ['script', 'vif-bridge'], ['bridge', 'xenbr1'], ['mac', '00:16:36:41:76:ae']]), ('tap', ['tap', ['backend', '0'], ['dev', 'xvda:disk'], ['uname', 'tap:aio:/var/lib/xen/images/vms/v22c54'], ['mode', 'w']]), ('vkbd', ['vkbd', ['backend', '0']]), ('vfb', ['vfb', ['backend', '0'], ['type', 'vnc'], ['vncunused', '1'], ['xauthority', '/root/.Xauthority'], ['keymap', 'en-us']])], 'bootloader': '/usr/bin/pygrub', 'maxmem': 512, 'shadow_memory': 0, 'name': 'v22c54', 'bootloader_args': None, 'vcpus': 1, 'cpu': None}
[2009-11-30 11:17:15 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:1774) XendDomainInfo.construct: None
[2009-11-30 11:17:15 xend 3466] DEBUG (balloon:145) Balloon: 7021072 KiB free; need 4096; done.
[2009-11-30 11:17:15 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:992) Storing VM details: {'shadow_memory': '0', 'uuid': 'a3199faf-edb4-42e5-bea1-01f2df77a47f', 'on_crash': 'restart', 'on_reboot': 'restart', 'start_time': '1259414461.79', 'on_poweroff': 'destroy', 'name': 'v22c54', 'xend/restart_count': '0', 'vcpus': '1', 'vcpu_avail': '1', 'memory': '512', 'bootloader': '/usr/bin/pygrub', 'image': "(linux (ramdisk /var/lib/xen/boot_ramdisk.yFE7zn) (kernel /var/lib/xen/boot_kernel.bnNF6O) (args 'ro root=/dev/vgcentos00/root'))", 'maxmem': '512'}
[2009-11-30 11:17:15 xend 3466] DEBUG (DevController:114) DevController: writing {'state': '1', 'backend-id': '0', 'backend': '/local/domain/0/backend/vkbd/1/0'} to /local/domain/1/device/vkbd/0.
[2009-11-30 11:17:15 xend 3466] DEBUG (DevController:116) DevController: writing {'frontend-id': '1', 'domain': 'v22c54', 'frontend': '/local/domain/1/device/vkbd/0', 'state': '1', 'online': '1'} to /local/domain/0/backend/vkbd/1/0.
[2009-11-30 11:17:15 xend 3466] DEBUG (DevController:114) DevController: writing {'backend-id': '0', 'mac': '00:16:36:41:76:ae', 'handle': '0', 'state': '1', 'backend': '/local/domain/0/backend/vif/1/0'} to /local/domain/1/device/vif/0.
[2009-11-30 11:17:15 xend 3466] DEBUG (DevController:116) DevController: writing {'bridge': 'xenbr1', 'domain': 'v22c54', 'handle': '0', 'script': '/etc/xen/scripts/vif-bridge', 'state': '1', 'frontend': '/local/domain/1/device/vif/0', 'mac': '00:16:36:41:76:ae', 'online': '1', 'frontend-id': '1'} to /local/domain/0/backend/vif/1/0.
[2009-11-30 11:17:15 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:633) Checking for duplicate for uname: /var/lib/xen/images/vms/v22c54 [tap:aio:/var/lib/xen/images/vms/v22c54], dev: xvda:disk, mode: w
[2009-11-30 11:17:15 xend 3466] DEBUG (blkif:27) exception looking up device number for xvda:disk: [Errno 2] No such file or directory: '/dev/xvda:disk'
[2009-11-30 11:17:15 xend 3466] DEBUG (blkif:27) exception looking up device number for xvda: [Errno 2] No such file or directory: '/dev/xvda'
[2009-11-30 11:17:15 xend 3466] DEBUG (blkif:27) exception looking up device number for xvda: [Errno 2] No such file or directory: '/dev/xvda'
[2009-11-30 11:17:15 xend 3466] DEBUG (DevController:114) DevController: writing {'backend-id': '0', 'virtual-device': '51712', 'device-type': 'disk', 'state': '1', 'backend': '/local/domain/0/backend/tap/1/51712'} to /local/domain/1/device/vbd/51712.
[2009-11-30 11:17:15 xend 3466] DEBUG (DevController:116) DevController: writing {'domain': 'v22c54', 'frontend': '/local/domain/1/device/vbd/51712', 'format': 'raw', 'dev': 'xvda', 'state': '1', 'params': 'aio:/var/lib/xen/images/vms/v22c54', 'mode': 'w', 'online': '1', 'frontend-id': '1', 'type': 'tap'} to /local/domain/0/backend/tap/1/51712.
[2009-11-30 11:17:15 xend 3466] DEBUG (DevController:114) DevController: writing {'state': '1', 'backend-id': '0', 'backend': '/local/domain/0/backend/vfb/1/0'} to /local/domain/1/device/vfb/0.
[2009-11-30 11:17:15 xend 3466] DEBUG (DevController:116) DevController: writing {'vncunused': '1', 'domain': 'v22c54', 'frontend': '/local/domain/1/device/vfb/0', 'xauthority': '/root/.Xauthority', 'state': '1', 'keymap': 'en-us', 'online': '1', 'frontend-id': '1', 'type': 'vnc'} to /local/domain/0/backend/vfb/1/0.
[2009-11-30 11:17:15 xend 3466] DEBUG (vfbif:70) No VNC passwd configured for vfb access
[2009-11-30 11:17:15 xend 3466] DEBUG (vfbif:11) Spawn: ['/usr/lib64/xen/bin/qemu-dm', '-M', 'xenpv', '-d', '1', '-domain-name', 'v22c54', '-vnc', '127.0.0.1:0', '-vncunused', '-k', 'en-us']
[2009-11-30 11:17:15 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:1027) Storing domain details: {'console/port': '2', 'name': 'v22c54', 'console/limit': '4194304', 'vm': '/vm/a3199faf-edb4-42e5-bea1-01f2df77a47f', 'domid': '1', 'cpu/0/availability': 'online', 'memory/target': '524288', 'store/port': '1'}
[2009-11-30 11:17:15 xend 3466] DEBUG (XendCheckpoint:200) restore:shadow=0x0, _static_max=0x200, _static_min=0x200,
[2009-11-30 11:17:15 xend 3466] DEBUG (balloon:145) Balloon: 7021064 KiB free; need 524288; done.
[2009-11-30 11:17:15 xend 3466] DEBUG (XendCheckpoint:217) [xc_restore]: /usr/lib64/xen/bin/xc_restore 20 1 1 2 0 0 0
[2009-11-30 11:17:15 xend 3466] INFO (XendCheckpoint:353) xc_domain_restore start: p2m_size = 20800
[2009-11-30 11:17:15 xend 3466] INFO (XendCheckpoint:353) Reloading memory pages: 0%
[2009-11-30 11:17:21 xend 3466] INFO (XendCheckpoint:353) Received all pages (0 races)
[2009-11-30 11:17:21 xend 3466] INFO (XendCheckpoint:3100%
[2009-11-30 11:17:21 xend 3466] INFO (XendCheckpoint:353) Memory reloaded (35912 pages)
[2009-11-30 11:17:21 xend 3466] INFO (XendCheckpoint:353) Domain ready to be built.
[2009-11-30 11:17:21 xend 3466] DEBUG (XendCheckpoint:324) store-mfn 1842329
[2009-11-30 11:17:21 xend 3466] DEBUG (XendCheckpoint:324) console-mfn 2117145
[2009-11-30 11:17:21 xend 3466] INFO (XendCheckpoint:353) Restore exit with rc=0
[2009-11-30 11:17:21 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:927) XendDomainInfo.completeRestore
[2009-11-30 11:17:21 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:1027) Storing domain details: {'console/ring-ref': '2117145', 'console/port': '2', 'name': 'v22c54', 'console/limit': '4194304', 'vm': '/vm/a3199faf-edb4-42e5-bea1-01f2df77a47f', 'domid': '1', 'cpu/0/availability': 'online', 'memory/target': '524288', 'store/ring-ref': '1842329', 'store/port': '1'}
[2009-11-30 11:17:21 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:943) XendDomainInfo.completeRestore done
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:158) Waiting for devices vif.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:164) Waiting for 0.
[2009-11-30 11:17:21 xend.XendDomainInfo 3466] DEBUG (XendDomainInfo:1249) XendDomainInfo.handleShutdownWatch
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:509) hotplugStatusCallback /local/domain/0/backend/vif/1/0/hotplug-status.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:523) hotplugStatusCallback 1.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:158) Waiting for devices usb.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:158) Waiting for devices vbd.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:158) Waiting for devices irq.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:158) Waiting for devices vkbd.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:164) Waiting for 0.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:509) hotplugStatusCallback /local/domain/0/backend/vkbd/1/0/hotplug-status.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:523) hotplugStatusCallback 1.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:158) Waiting for devices vfb.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:164) Waiting for 0.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:509) hotplugStatusCallback /local/domain/0/backend/vfb/1/0/hotplug-status.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:523) hotplugStatusCallback 1.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:158) Waiting for devices pci.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:158) Waiting for devices ioports.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:158) Waiting for devices tap.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:164) Waiting for 51712.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:509) hotplugStatusCallback /local/domain/0/backend/tap/1/51712/hotplug-status.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:523) hotplugStatusCallback 1.
[2009-11-30 11:17:21 xend 3466] DEBUG (DevController:158) Waiting for devices vtpm.
On Nov 30, 2009, at 8:26 AM, Ben M. wrote:
I'm guessing you covered this ("chkconfig looks correct"), but you didn't change to a different runlevel, like 2, did you?
[root@xen1 ~]# chkconfig --list xendomains
xendomains      0:off  1:off  2:off  3:on  4:on  5:on  6:off
[root@xen1 ~]# grep :initdefault /etc/inittab
id:3:initdefault:
[root@xen1 ~]# runlevel
N 3
Also, I'm not too familiar with it, but if you're not shutting your domains off before reboot there may be something awry with the save/restore functionality. Personally I have this disabled so I can't speak to whether it would create the symptom you have, but it might be something to try. I have:
[root@xen1 ~]# grep "^[^#]" /etc/sysconfig/xendomains
XENDOMAINS_SYSRQ=""
XENDOMAINS_USLEEP=100000
XENDOMAINS_CREATE_USLEEP=5000000
XENDOMAINS_MIGRATE=""
XENDOMAINS_SAVE=""
XENDOMAINS_SHUTDOWN="--halt --wait"
XENDOMAINS_SHUTDOWN_ALL="--all --halt --wait"
XENDOMAINS_RESTORE=false
XENDOMAINS_AUTO=/etc/xen/auto
XENDOMAINS_AUTO_ONLY=false
XENDOMAINS_STOP_MAXWAIT=300
Eric
Thanks, you gave me some solid points to check that I hadn't fully covered, and I think I know a little more.
My chkconfig runlevel 2 was on, but runlevel was at "N 3". I toggled it off, rebooted: no difference. Toggled it on and off for the rest of the checklist.
Everything checked out except for XENDOMAINS_RESTORE=true, which is the default. I set it to false and toggled runlevel 2 for a couple of reboot checks. No joy, but ...
Oddly, I am getting saves, even though "destroy" is explicitly set in the VM's conf for all circumstances:
name = 'v22c54'
uuid = 'a3199faf-edb4-42e5-bea1-01f2df77a47f'
maxmem = 512
memory = 512
vcpus = 1
bootloader = '/usr/bin/pygrub'
on_poweroff = 'destroy'
on_reboot = 'destroy'
on_crash = 'destroy'
vfb = [ 'type=vnc,vncunused=1,keymap=en-us' ]
# note selinux is off now, but the privileges are set correctly
disk = [ 'tap:aio:/var/lib/xen/images/vms/v22c54,xvda,w' ]
vif = [ 'mac=00:16:36:41:76:ae,bridge=xenbr1' ]
I then slapped it around a bit and another quirk appeared.
From a fresh boot, I then manually started the xendomains service. v22c54 comes up. I did an xm shutdown and it reported it shut down, with nothing in the save folder. However, check this out:
[root@river22 ~]# service xendomains start
Starting auto Xen domains: v22c54[done]   [ OK ]
[root@river22 ~]# xm list
Name                ID Mem(MiB) VCPUs State   Time(s)
Domain-0             0     1024     2 r-----     24.7
v22c54               1      511     1 r-----      9.0
[root@river22 ~]# xm shutdown v22c54
(no echo)
(I then tried to bring it back up; it balks, it's not there, and I see a boot_kernel.random and a boot_ramdisk.random come up in /var/lib/xen)
[root@river22 ~]# xm create v22c54
Using config file "/etc/xen/v22c54".
Error: VM name 'v22c54' already in use by domain 1
(it isn't there)
[root@river22 ~]# xm list
Name                ID Mem(MiB) VCPUs State   Time(s)
Domain-0             0     1024     2 r-----     29.7
[root@river22 ~]# xm shutdown v22c54
Error: Domain 'v22c54' does not exist.
Usage: xm shutdown <Domain> [-waRH]

Shutdown a domain.
[root@river22 ~]# xm list
Name                ID Mem(MiB) VCPUs State   Time(s)
Domain-0             0     1024     2 r-----     29.9
I certainly would like to know why things glitch, and I don't mind seeing this through a little further, but I am beginning to wonder if I should just back up the domUs and try a fresh installation.
Is it possible I am running into a naming convention on these domUs?
My first 3 chars help me determine on which host the virtual machine was originally created.
----- "Ben M." centos@rivint.com wrote:
Oddly, I am getting saves, even though "destroy" is explicitly set in the VM's conf for all circumstances.
Do you have anything lingering in /var/lib/xen/save?
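A quick check, for example:

  ls -l /var/lib/xen/save/

If the xendomains script saved the domain at shutdown, there should be a state file named after the domU in there.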
On Mon, Nov 30, 2009 at 11:18 AM, Ben M. centos@rivint.com wrote:
Thanks, you gave me some solid points to check that I hadn't fully covered, and I think I know a little more.
My chkconfig runlevel 2 was on, but runlevel was at "N 3". I toggled it off, rebooted: no difference. Toggled it on and off for the rest of the checklist.
(p.s. on next most recent post: it's not about the "list", it's that your email client contains a reference to the thread which other mail clients use to present the thread in some manner--a tree or such. "All other lists you have been on" would have acted the same way, you just may not have realized as we all have different email clients.)
I didn't go into detail about what I was trying to point out about the runlevels, which I think may have led you astray a bit. Being in runlevel 3 means it wouldn't matter whether xendomains is set to start when in 2. I only brought it up because by default xendomains doesn't start in 2, so *if* you were starting in 2 it wouldn't start then. As you're apparently running in 3 (the default), "toggling" the setting for 2 was a bit of a red herring.
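For reference, pinning it explicitly to the runlevels that matter would look something like this (harmless to re-run if it's already on):

  chkconfig --level 345 xendomains on
  chkconfig --list xendomains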
Everything checked out except for XENDOMAINS_RESTORE=true, which is the default. I set it to false and toggled runlevel 2 for a couple of reboot checks. No joy, but ...
Oddly, I am getting saves, even though "destroy" is explicitly set in the VM's conf for all circumstances.
"destroy" in this context is your setting for what happens when the domain stops *on its own accord*. You still get saves if you shut down the dom0 and the xendomains script goes around and saves all the running domains (assuming it is configured to do that).
As for the xm create error: likely just a timing issue, if that was the order you ran the commands in. xm shutdown tells the guest to shut down; it doesn't instantly destroy it. This can take a while, depending on what your guest needs to do. If xm create told you the name was still in use, the domain probably was still shutting down. It then probably finished shutting down and was gone by the time you ran xm list. The only thing that would be alarming is if you ran xm list *first* and didn't see the domain, and then ran xm create and it told you the name was in use.
Typically, if I need to hard-cycle a guest (after config file changes), I shut it down from the guest OS, watch xm list until it goes away, and then run xm create.
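The usage output you pasted hints at a shortcut too: xm shutdown takes a -w flag to wait until the shutdown has completed. Something like this (from memory, untested) avoids the race:

  xm shutdown -w v22c54    # blocks until the domain is actually gone
  xm create v22c54

or you can poll it yourself:

  while xm list | grep -qw v22c54; do sleep 1; done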
The other thing I meant to suggest in my first email: try modifying (make a backup first) the xendomains start() function to append `date` to some file in /tmp or similar. That would help pin down whether the problem is that the xendomains script isn't running at boot (an OS configuration issue) or that it is running but failing to start the domain (something more to do with the Xen configuration).
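A minimal sketch of what I mean, assuming the stock SysV layout of /etc/init.d/xendomains (the exact function body varies by version, and the trace file name is arbitrary):

  start() {
      # temporary debug trace: prove this function ran at boot
      echo "xendomains start() entered at $(date)" >> /tmp/xendomains.trace
      # ... rest of the original function unchanged ...
  }

If /tmp/xendomains.trace shows up after a reboot but the domU doesn't, the init plumbing is fine and the problem is further down in the script or in xend.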
Anyway, these are just some general "what steps would I take to investigate this error" suggestions; I disclaim any warranty as to whether they might lead you on a wild goose chase if you run too far with them ;-)
I apologize for that list etiquette breach. I was completely unaware a thread reference was attached somewhere; never knew that. I will observe that courtesy.
I will go through your and Christopher's points after I get some other needed work done. After that, I may just back up the domUs I developed and do a new install. I must have hosed something.
Ben M. wrote on Mon, 30 Nov 2009 14:18:29 -0500:
Oddly I am getting Saves,
Is it possible that you once used xm start for this domain? Your xm list at the end suggests you didn't, but, well ... AFAIK you get saves only if you added the VM to the xen store with xm start. If you didn't, it will shut down self-contained, without a save. But if you did, xend will automatically start all VMs that were running (and saved) at shutdown, and that functionality will clash with the auto symlink.
Kai
Is it possible that you once used xm start for this domain? Your xm list at the end suggests you didn't, but, well ...
Entirely possible. I think I may have issued that command by mistake while in a rush.
For about a week I got an error, something akin to "Cannot find xen storage." That may coincide with when I noticed this issue.
Is there a queue or a file I can purge?
Ben M. wrote on Mon, 30 Nov 2009 22:39:25 -0500:
Is there a queue or a file I can purge?
You can remove it from the xen store with xm delete. However, an xm list will show all domains that are in the store, running or not. As your last xm list didn't show it, I would assume it's not in the store anymore (if it ever was). If you want to see whether the xen store works OK, just add it with "xm start". After this, xm create will give you an error until you remove it with xm delete. When you stop xendomains (and this domain is running) it should stop the domain; when you start it, it should start the domain. You have to remove the symlink. If you go back to non-store-managed domains, I would first test whether xendomains starts up fine without any auto symlinks.
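From memory, the store-managed lifecycle goes roughly like this (check xm help on your version; on some 3.x releases it is xm new rather than xm start that does the registering):

  xm new /etc/xen/v22c54    # register the domain in the xen store
  xm start v22c54           # start a store-managed domain
  xm delete v22c54          # remove it from the store again

whereas a plain "xm create v22c54" boots it one-off without registering anything.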
BTW, ">" have to be at the *start* of the line to be recognized as quote markers.
Kai
The xen store appears to be broken too. I'm hosed and lost. Other services are acting up as well, including smartd and hotplug. I'm going to back up the dev'd domUs, reformat the drives, and reinstall the base CentOS Xen virtualization.
[root@dom0 ~]# xenstored
[root@dom0 ~]# FATAL: Failed to initialize dom0 state: Invalid argument
full talloc report on 'null_context' (total 96 bytes in 3 blocks)
struct domain contains 96 bytes in 2 blocks (ref 0)
/local/domain/0 contains 16 bytes in 1 blocks (ref 0)
[root@dom0 ~]# xenstored
[root@dom0 ~]# FATAL: Failed to initialize dom0 state: Invalid argument
It should already be running; did you check with ps? I get this error as well when I try to run it while it's already running.
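E.g.:

  ps ax | grep [x]enstored

(the [x] keeps the grep itself out of the match list).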
Kai
Thanks for everyone's help here. I need to put a RAID on this box anyhow, and I'm just losing way too much time on this and can't resolve it. I am pretty sure I inadvertently hosed something by removing a service (or a subsequent dependency) from dom0, or by "playing" with xm and virsh commands too much. One of the great things about a Xen environment is its ability to recover from a disaster in minimal time.
My apologies, I thought that clearing the subject line and body of all data would do that.
All other lists I have been on for the past 25 years performed that way just fine. I will check the FAQ.
Kai Schaetzl wrote:
With or without scratching, please do not hit reply when you want to send a *new* message to the list! Use "new message"! Thanks,
Kai
On Mon, 2009-11-30 at 13:53 -0200, Gilberto Nunes wrote:
Currently there is work being done on a project for Xen called Remus. I am not sure about KVM, but Remus is still in development; although it has been merged into the xen-unstable repository, it isn't completely ready yet (the developers are working very hard on it).
Basically it performs the first part of a live migration, and if the connection is lost it fails the virtual machine over to the secondary host.
It appears that Red Hat is including "high availability" for KVM in their Red Hat Enterprise Virtualization Manager for Servers.
Not sure if this is going to make it to CentOS, can someone confirm/deny?
On Mon, 2009-11-30 at 11:43 -0500, Tait Clarridge wrote:
There are three different things being discussed in this thread so far:
1. Live Migration
2. Virtual Machine HA
3. Continuous Mirroring / Replication
(1) First of all, live migration is not an HA solution. If your primary machine dies, there is no way to initiate a live migration anymore, since live migration requires the home node to still be active. It can be used to migrate workloads away during maintenance slots, or to spread load, but not for HA.
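For the maintenance case (both nodes still up), a KVM/libvirt live migration is a one-liner; a sketch, assuming a guest named vm01 and a target host node2:

  virsh migrate --live vm01 qemu+ssh://node2/system

Again, this only works while the source host is alive.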
(2) So what people typically configure with Heartbeat is indeed the restart of the virtual machine from the same "shared" storage device.
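As a sketch of the idea in a Heartbeat v1-style setup (the names here are illustrative; "vmguest" would be a small init script wrapping virsh start/shutdown for the guest):

  # /etc/ha.d/haresources -- one line: node, DRBD resource, filesystem, then the VM
  node1 drbddisk::r0 Filesystem::/dev/drbd0::/vmstore::ext3 vmguest

On failover the standby promotes the DRBD resource, mounts the storage, and (re)starts the guest -- a reboot of the VM, not a live handover.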
(3) Remus and Kemari are the new kids in town: they are going for real-time mirroring and therefore will implement real virtual machine HA. Remus is headed for Xen inclusion, while the Kemari team has just announced that they are also starting to work on a KVM port; their current version targets only Xen.
http://virtualization.com/guest-posts/2009/11/15/remus-and-kemari-still-goin...
Hope that clarifies some stuff.