[CentOS] kvm: vm root fs becomes ro

Tue Dec 3 00:25:27 UTC 2013
Nux! <nux at li.nux.ro>

On 02.12.2013 23:29, Paul Heinlein wrote:
> I've had the following happen a couple of times now, and my internet
> searches are failing to locate an answer to the problem.
> 
> We've got a few servers that primarily house VMs using KVM. They've
> got E-3 cpus and 32 GB RAM, and they run stock CentOS 6.4, fully
> patched (not yet migrated to 6.5). The VM disk images are housed on an
> NFS server. None of the VMs is particularly resource-hungry. They run
> a variety of Linux distros: CentOS 5/6, Debian 6/7.
> 
> I'll start to see the VMs fail to write files to their local
> filesystems. No machine in the chain has rebooted or been updated in
> any significant way, but the root filesystem is off-limits. (This will
> happen on just one of our servers; the other VM platforms run without
> issue.)
> 
> In /var/log/messages, I'll see the following entry for each impacted 
> VM:
> 
> <date> <host> kernel: kvm: <pid>: cpu0 disabled perfctr wrmsr: 0xc1 data 0xabcd
> 
> In /var/log/libvirt/qemu/<vm-name>.log, I'll see
> 
> block I/O error in device 'drive-virtio-disk0': Stale file handle (116)
> 
> Oddly, the underlying host might be running, say, five VMs, but only
> four of them will get the log messages, and show the read-only
> symptoms, while the fifth just keeps chugging along.

I think CentOS ext4 filesystems do remount read-only when the 
underlying device has problems; if your network has any timeouts or is 
maxed out, that could explain the problem. Ignoring this would probably 
be unwise, but it can be done by specifying errors=continue in the 
mount options in the guest's /etc/fstab.
I would run some network/throughput tests between your hosts, though, 
and check that all drives are fine, that they have available space, 
etc. Also check the logs, dmesg and so on.
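
To spot the read-only remount quickly inside a guest, something along 
these lines might help. It is only a minimal sketch: it assumes Python 3 
is available in the VM and that mounts are listed in the usual 
/proc/mounts format; the function names read_only_mounts and report are 
made up for illustration.

```python
def read_only_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs whose options include 'ro'.

    mounts_text is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        # Options are comma-separated; match 'ro' as a whole option only.
        if "ro" in options.split(","):
            found.append((mount_point, fs_type))
    return found


def report(path="/proc/mounts"):
    """Print every read-only mount found in the given mounts file."""
    with open(path) as handle:
        for mount_point, fs_type in read_only_mounts(handle.read()):
            print("%s (%s) is mounted read-only" % (mount_point, fs_type))
```

Calling report() in an affected VM should list / if the root filesystem 
has flipped to read-only. Other read-only entries may be intentional, so 
the output still needs a human look.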

-- 
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro