On 05.03.2013 13:49, Dan Kenigsberg wrote:
> On Tue, Mar 05, 2013 at 12:32:31PM +0100, Patrick Hurrelmann wrote:
>> On 05.03.2013 11:14, Dan Kenigsberg wrote:
>> <snip>
>>>>>>
>>>>>> My version of vdsm as stated by Dreyou:
>>>>>> v 4.10.0-0.46 (.15), builded from
>>>>>> b59c8430b2a511bcea3bc1a954eee4ca1c0f4861 (branch ovirt-3.1)
>>>>>>
>>>>>> I can't see that Ia241b09c96fa16441ba9421f61a2f9a417f0d978 was merged to
>>>>>> 3.1 Branch?
>>>>>>
>>>>>> I applied that patch locally and restarted vdsmd but this does not
>>>>>> change anything. Supported cpu is still as low as Conroe instead of
>>>>>> Nehalem. Or is there more to do than patching libvirtvm.py?
>>>>>
>>>>> What is libvirt's opinion about your cpu compatibility?
>>>>>
>>>>> virsh -r cpu-compare <(echo '<cpu match="minimum"><model>Nehalem</model><vendor>Intel</vendor></cpu>')
>>>>>
>>>>> If you do not get "Host CPU is a superset of CPU described in bla", then
>>>>> the problem is within libvirt.
>>>>>
>>>>> Dan.
>>>>
>>>> Hi Dan,
>>>>
>>>> virsh -r cpu-compare <(echo '<cpu
>>>> match="minimum"><model>Nehalem</model><vendor>Intel</vendor></cpu>')
>>>> Host CPU is a superset of CPU described in /dev/fd/63
>>>>
>>>> So libvirt obviously is fine. Something different would have surprised
>>>> my as virsh capabilities seemed correct anyway.
>>>
>>> So maybe, just maybe, libvirt has changed their cpu_map, a map that
>>> ovirt-3.1 had a bug reading.
>>>
>>> Would you care to apply http://gerrit.ovirt.org/5035 to see if this is
>>> it?
>>>
>>> Dan.
>>
>> Hi Dan,
>>
>> success! Applying that patch made the cpu recognition work again. The
>> cpu type in admin portal shows again as Nehalem.
>> Output from getVdsCaps:
>>
>> cpuCores = 4
>> cpuFlags = fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,
>> mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,
>> ss,ht,tm,pbe,syscall,nx,rdtscp,lm,constant_tsc,
>> arch_perfmon,pebs,bts,rep_good,xtopology,nonstop_tsc,
>> aperfmperf,pni,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,
>> ssse3,cx16,xtpr,pdcm,sse4_1,sse4_2,popcnt,lahf_lm,ida,
>> dts,tpr_shadow,vnmi,flexpriority,ept,vpid,model_Nehalem,
>> model_Conroe,model_coreduo,model_core2duo,model_Penryn,
>> model_n270
>> cpuModel = Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
>> cpuSockets = 1
>> cpuSpeed = 2393.769
>>
>>
>> I compared libvirt's cpu_map.xml on both Centos 6.3 and CentOS 6.4 and
>> indeed they do differ in large portions. So this patch should probably
>> be merged to 3.1 branch? I will contact Dreyou and request that this
>> patch will also be included in his builds. I guess otherwise there will
>> be quite some fallout after people start picking CentOS 6.4 for oVirt 3.1.
>>
>> Thanks again and best regards
>
> Thank you for reporting this issue and verifying its fix.
>
> I'm not completely sure that we should keep maintaining the ovirt-3.1
> branch upstream - but a build destined for el6.4 must have it.
>
> If you believe we should release a fix version for 3.1, please verify
> that http://gerrit.ovirt.org/12723 has no ill effects.
>
> Dan.

I did some additional tests, and the new CentOS 6.4 host failed to start or
migrate any VM.
It always boils down to:

Thread-43::ERROR::2013-03-07 15:02:51,950::task::853::TaskManager.Task::(_setError) Task=`52a9f96f-3dfd-4bcf-8d7a-db14e650b4c1`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2551, in getVolumeSize
    apparentsize = str(volume.Volume.getVSize(sdUUID, imgUUID, volUUID, bs=1))
  File "/usr/share/vdsm/storage/volume.py", line 283, in getVSize
    return mysd.getVolumeClass().getVSize(mysd, imgUUID, volUUID, bs)
  File "/usr/share/vdsm/storage/blockVolume.py", line 101, in getVSize
    return int(int(lvm.getLV(sdobj.sdUUID, volUUID).size) / bs)
  File "/usr/share/vdsm/storage/lvm.py", line 772, in getLV
    lv = _lvminfo.getLv(vgName, lvName)
  File "/usr/share/vdsm/storage/lvm.py", line 567, in getLv
    lvs = self._reloadlvs(vgName)
  File "/usr/share/vdsm/storage/lvm.py", line 419, in _reloadlvs
    self._lvs.pop((vgName, lvName), None)
  File "/usr/lib64/python2.6/contextlib.py", line 34, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/share/vdsm/storage/misc.py", line 1219, in acquireContext
    yield self
  File "/usr/share/vdsm/storage/lvm.py", line 404, in _reloadlvs
    lv = makeLV(*fields)
  File "/usr/share/vdsm/storage/lvm.py", line 218, in makeLV
    attrs = _attr2NamedTuple(args[LV._fields.index("attr")], LV_ATTR_BITS, "LV_ATTR")
  File "/usr/share/vdsm/storage/lvm.py", line 188, in _attr2NamedTuple
    attrs = Attrs(*values)
TypeError: __new__() takes exactly 9 arguments (10 given)

and followed by:

Thread-43::ERROR::2013-03-07 15:02:51,987::dispatcher::69::Storage.Dispatcher.Protect::(run) __new__() takes exactly 9 arguments (10 given)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 61, in run
    result = ctask.prepare(self.func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 1164, in prepare
    raise self.error
TypeError: __new__() takes exactly 9 arguments (10 given)
Thread-43::DEBUG::2013-03-07 15:02:51,987::vm::580::vm.Vm::(_startUnderlyingVm) vmId=`7db86f12-8c57-4d2b-a853-a6fd6f7ee82d`::_ongoingCreations released
Thread-43::ERROR::2013-03-07 15:02:51,987::vm::604::vm.Vm::(_startUnderlyingVm) vmId=`7db86f12-8c57-4d2b-a853-a6fd6f7ee82d`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 570, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1289, in _run
    devices = self.buildConfDevices()
  File "/usr/share/vdsm/vm.py", line 431, in buildConfDevices
    self._normalizeVdsmImg(drv)
  File "/usr/share/vdsm/vm.py", line 358, in _normalizeVdsmImg
    drv['truesize'] = res['truesize']
KeyError: 'truesize'

In webadmin the start and migrate operations fail with 'truesize'.

I found BZ#876958, which shows the very same error, so I tried to apply the
patch http://gerrit.ovirt.org/9317. I had to apply it manually (I guess the
patch would need a rebase for 3.1), but it works. I can now start new virtual
machines successfully on a CentOS 6.4 / oVirt 3.1 host. Migrating VMs from
CentOS 6.3 hosts works, but not the other way around.
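
A side note on the TypeError in the storage traceback: in Python 2.6 the argument count in that message includes the class itself, so "takes exactly 9 arguments (10 given)" means a namedtuple with 8 fields was handed 9 values. The traceback shows _attr2NamedTuple building that tuple from the LV attr string, so presumably the LVM tools on the 6.4 host report one more attribute character than vdsm 3.1 expects. That is an assumption; the log does not show the attr string itself. A minimal sketch of the failure, and of one tolerant way to parse such a string, could look like this. The field names and both helper functions are made up for illustration and are not taken from vdsm's lvm.py or from the gerrit patch:

```python
from collections import namedtuple

# Hypothetical field names for illustration only; not copied from vdsm.
ATTR_BITS = ("voltype", "permission", "allocations", "fixedminor",
             "state", "devopen", "target", "zero")
Attrs = namedtuple("LV_ATTR", ATTR_BITS)


def parse_attr_strict(attr):
    """Map one character to one field; fails if the counts differ."""
    return Attrs(*attr)


def parse_attr_tolerant(attr):
    """Use only the leading characters that have a known field."""
    if len(attr) < len(ATTR_BITS):
        raise ValueError("attribute string too short: %r" % (attr,))
    return Attrs(*attr[:len(ATTR_BITS)])


old_style = "-wi-ao--"     # 8 characters, one per field
new_style = "-wi-ao---"    # 9 characters, one more than the tuple has fields

print(parse_attr_strict(old_style).state)
try:
    parse_attr_strict(new_style)
except TypeError as exc:
    print("strict parser failed: %s" % exc)
print(parse_attr_tolerant(new_style).state)
```

Truncating is only one option. Whether ignoring the extra character is safe depends on what it means, which this sketch does not address.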
Migration from 6.4 to 6.3 fails:

Thread-1296::ERROR::2013-03-07 15:55:24,845::vm::176::vm.Vm::(_recover) vmId=`c978cbf8-6b4d-4d6f-9435-480d9fed31c4`::internal error Process exited while reading console log output: Supported machines are:
pc RHEL 6.3.0 PC (alias of rhel6.3.0)
rhel6.3.0 RHEL 6.3.0 PC (default)
rhel6.2.0 RHEL 6.2.0 PC
rhel6.1.0 RHEL 6.1.0 PC
rhel6.0.0 RHEL 6.0.0 PC
rhel5.5.0 RHEL 5.5.0 PC
rhel5.4.4 RHEL 5.4.4 PC
rhel5.4.0 RHEL 5.4.0 PC

Thread-1296::ERROR::2013-03-07 15:55:24,988::vm::240::vm.Vm::(run) vmId=`c978cbf8-6b4d-4d6f-9435-480d9fed31c4`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 223, in run
    self._startUnderlyingMigration()
  File "/usr/share/vdsm/libvirtvm.py", line 451, in _startUnderlyingMigration
    None, maxBandwidth)
  File "/usr/share/vdsm/libvirtvm.py", line 491, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/vdsm/libvirtconnection.py", line 82, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1178, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: internal error Process exited while reading console log output: Supported machines are:
pc RHEL 6.3.0 PC (alias of rhel6.3.0)
rhel6.3.0 RHEL 6.3.0 PC (default)
rhel6.2.0 RHEL 6.2.0 PC
rhel6.1.0 RHEL 6.1.0 PC
rhel6.0.0 RHEL 6.0.0 PC
rhel5.5.0 RHEL 5.5.0 PC
rhel5.4.4 RHEL 5.4.4 PC
rhel5.4.0 RHEL 5.4.0 PC

But I guess this is fine, and migration from a higher host version to a lower
version is probably not supported, right?

Regards
Patrick

--
Lobster LOGsuite GmbH, Münchner Straße 15a, D-82319 Starnberg
HRB 178831, Amtsgericht München
Geschäftsführer: Dr. Martin Fischer, Rolf Henrich