Hello,

I manage a small private virtualization infrastructure, currently based on CentOS 7.x. In the past we ran mostly on Debian 7 (kernel 3.2.x) and CentOS 6.x without problems.

After we upgraded to CentOS 7.x, I started to see occasional crashes of the physical host, e.g. when suspending or resuming several virtual machines. In other cases a random virtual machine checkpoint turned out to be invalid and the VM could not be resumed.

I ran a few intensive tests on the same hardware with:
- CentOS 6.6 ... worked fine
- CentOS 7.1 with
 1. CentOS distribution kernel ... failed
 2. binary RHEL 7.1 distribution kernel ... failed
 3. vanilla 3.10.80 kernel ... failed
(plus various firmware releases and BIOS configurations)

So far I have been able to run CentOS 7.x reliably only with the latest 4.0.5 kernel from ELRepo.

The CentOS 7.x kernel is based on 3.10.x, and the vanilla 3.10.x kernel failed for me as well. So I suspect a bug in KVM in that kernel series which leads to memory corruption. The result was either an immediate kernel oops, or a broken checkpoint followed by a kernel oops later.

I have opened a bug with Red Hat:
https://bugzilla.redhat.com/show_bug.cgi?id=1231964
but since it is a private bug, I have also created a duplicate in the CentOS bug tracker:
http://bugs.centos.org/view.php?id=8949

The bug report describes how to reproduce the problem and includes a stress test script.
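
For a rough idea of what such a test can look like, here is a minimal sketch. It is NOT the script attached to the bug reports. It assumes the guests are managed by libvirt and uses the standard `virsh save` / `virsh restore` commands; the names `cycle_commands` and `stress`, the `.save` file naming and the state directory are made up for illustration.

```python
"""Hypothetical save/restore stress loop for libvirt guests.

This is an illustrative sketch, not the script from the bug reports.
"""
import subprocess
from pathlib import Path


def cycle_commands(domain, state_dir):
    """Return the two virsh commands for one save/restore cycle of a guest."""
    state_file = str(Path(state_dir) / f"{domain}.save")
    return [
        ["virsh", "save", domain, state_file],  # write guest state to disk and stop it
        ["virsh", "restore", state_file],       # start the guest again from that file
    ]


def stress(domains, state_dir, rounds, run=subprocess.run):
    """Save all guests, then restore all of them, `rounds` times.

    Commands run one after another (no parallelism). Returns the first
    command that exits with a non-zero status, or None if all succeed.
    """
    for _ in range(rounds):
        saves, restores = [], []
        for domain in domains:
            save_cmd, restore_cmd = cycle_commands(domain, state_dir)
            saves.append(save_cmd)
            restores.append(restore_cmd)
        # All guests are in the saved state before the first restore starts.
        for cmd in saves + restores:
            if run(cmd).returncode != 0:
                return cmd
    return None
```

For example, `stress(["vm1", "vm2"], "/var/tmp", 100)` would cycle two guests (names assumed) a hundred times and return the first failing command, e.g. a restore that fails because the checkpoint is invalid. A host crash would of course simply end the run.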

I would appreciate it if anybody could confirm whether they see the same problem.

Best regards,
Vlastimil Holer