On 11/07/2017 03:12 PM, Nathan March wrote:
Since moving from Xen 4.4 to 4.6, I've been seeing an increasing number of stability issues on our hypervisors. I'm not sure whether there's a single root cause here or whether I'm dealing with multiple bugs.
One of the more common ones I've seen is that a VM, on shutdown, remains in the (null) state and a kernel BUG is thrown:
xen001 log # xl list
Name        ID   Mem VCPUs  State   Time(s)
Domain-0     0  6144    24  r-----   6639.7
(null)       3     0     1  --pscd     36.3
[89920.839074] BUG: unable to handle kernel paging request at ffff88020ee9a000
<snip>
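To spot these leftover domains across many hosts, something like the following might help. It is only a minimal sketch: it assumes the `xl list` column layout shown above (name, then five fixed columns), and the function name `find_null_domains` is hypothetical, not part of any Xen tooling.

```python
def find_null_domains(xl_list_output):
    """Return (domid, state) pairs for rows whose name is '(null)'.

    Assumes each data row ends with five whitespace-separated columns:
    ID, Mem, VCPUs, State, Time(s). Everything before them is the name.
    """
    stuck = []
    for line in xl_list_output.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # blank or malformed line
        domid = fields[-5]
        if not domid.isdigit():
            continue  # header row or shell prompt, not a domain
        name = " ".join(fields[:-5])
        state = fields[-2]
        if name == "(null)":
            stuck.append((int(domid), state))
    return stuck


# Sample input copied from the report above.
sample = """\
Name ID Mem VCPUs State Time(s)
Domain-0 0 6144 24 r----- 6639.7
(null) 3 0 1 --pscd 36.3
"""

if __name__ == "__main__":
    print(find_null_domains(sample))
```

The text passed in would come from running `xl list` on each hypervisor. The sketch only reports the stuck domains and does nothing to clean them up.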
This is on Xen 4.6.6-4.el6 with kernel 4.9.58-29.el6.x86_64. I see these issues across a wide range of systems from both Dell and Supermicro, although we run the same Intel X540 10Gb NICs in each system with the same NetApp NFS backend storage.
We don't use NFS and have not seen the exact same issue.
--Sarah