[CentOS-virt] Guests pausing suddenly

Wed Feb 29 07:53:09 UTC 2012
Peter Hopfgartner <peter.hopfgartner at r3-gis.com>

We have a CentOS 6.2 server running KVM. It hosts 2 virtual 
machines, both also running CentOS 6.2.

Regularly, one or both of the virtual machines go into the "paused" 
state for no apparent reason.
After resuming, I get messages like the following in /var/log/messages 
on the guest:

Feb 28 21:50:45 achernar fcoemon: Failed to connect to lldpad
Feb 29 08:23:56 achernar kernel: sd 0:0:0:0: [sda] Unhandled error code
Feb 29 08:23:56 achernar kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
Feb 29 08:23:56 achernar kernel: sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 06 db 70 78 00 00 38 00
Feb 29 08:23:56 achernar kernel: end_request: I/O error, dev sda, sector 115044472
Feb 29 08:23:56 achernar kernel: Buffer I/O error on device dm-0, logical block 14252047
Feb 29 08:23:56 achernar kernel: lost page write due to I/O error on dm-0
Feb 29 08:23:56 achernar kernel: Buffer I/O error on device dm-0, logical block 14252048
Feb 29 08:23:56 achernar kernel: lost page write due to I/O error on dm-0
Feb 29 08:23:56 achernar kernel: Buffer I/O error on device dm-0, logical block 14252049
Feb 29 08:23:56 achernar kernel: lost page write due to I/O error on dm-0
Feb 29 08:23:56 achernar kernel: Buffer I/O error on device dm-0, logical block 14252050
Feb 29 08:23:56 achernar kernel: lost page write due to I/O error on dm-0
Feb 29 08:23:56 achernar kernel: Buffer I/O error on device dm-0, logical block 14252051
Feb 29 08:23:56 achernar kernel: lost page write due to I/O error on dm-0
Feb 29 08:23:56 achernar kernel: Buffer I/O error on device dm-0, logical block 14252052
Feb 29 08:23:56 achernar kernel: lost page write due to I/O error on dm-0
Feb 29 08:23:56 achernar kernel: Buffer I/O error on device dm-0, logical block 14252053
Feb 29 08:23:56 achernar kernel: lost page write due to I/O error on dm-0
Feb 29 08:23:57 achernar fcoemon: error 111 Connection refused
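
As a cross-check, the CDB in the log above can be decoded by hand. The 
following is only a minimal Python sketch, assuming the standard 
WRITE(10) layout (opcode in byte 0, big-endian logical block address in 
bytes 2-5, transfer length in bytes 7-8). The variable names are 
illustrative and not taken from any tool.

```python
# Decode the SCSI WRITE(10) CDB bytes printed in the kernel log above.
# Assumption: standard 10-byte WRITE(10) layout, all fields big-endian.
cdb = bytes.fromhex("2a 00 06 db 70 78 00 00 38 00")

opcode = cdb[0]                            # 0x2a is the WRITE(10) opcode
lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
nblocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks

print(hex(opcode), lba, nblocks)           # prints: 0x2a 115044472 56
```

The decoded address matches the sector in the end_request line. If the 
disk uses 512-byte sectors and the guest uses 4 KiB pages (both 
assumptions), 56 blocks are 28672 bytes, i.e. seven pages, which would 
be consistent with the seven "Buffer I/O error" lines. So the log 
appears to describe a single write that timed out.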


I could not find any relevant messages on the physical host, neither in 
/var/log/messages nor in /var/log/libvirt.

We have an almost identical server (same hardware, same software) 
that does not show this problem.

How can I proceed to diagnose the cause of the problem?

Regards,

-- 
Peter Hopfgartner
web  : www.r3-gis.com