[CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

Wed Feb 18 02:34:37 UTC 2015
Jason Pyeron <jpyeron at pdinc.us>

> -----Original Message-----
> From: Chris Murphy
> Sent: Tuesday, February 17, 2015 20:48
> 
> On Tue, Feb 17, 2015 at 7:54 AM, Jason Pyeron wrote:
> >> I'd post the entire dmesg somewhere
> >
> > http://client.pdinc.us/panic-341e97c30b5a4cb774942bae32d3f163.log
> 
> At least part of the problem happens before this log starts.

Feb 15 23:41:19 thirteen-230 dhclient[1272]: DHCPREQUEST on br0 to 192.168.5.58 port 67 (xid=0x48d081b6)
Feb 15 23:41:19 thirteen-230 dhclient[1272]: DHCPACK from 192.168.5.58 (xid=0x48d081b6)
Feb 15 23:41:21 thirteen-230 dhclient[1272]: bound to 192.168.13.230 -- renewal in 8613 seconds.
Feb 16 02:04:54 thirteen-230 dhclient[1272]: DHCPREQUEST on br0 to 192.168.5.58 port 67 (xid=0x48d081b6)
Feb 16 02:04:54 thirteen-230 dhclient[1272]: DHCPACK from 192.168.5.58 (xid=0x48d081b6)
Feb 16 02:04:55 thirteen-230 dhclient[1272]: bound to 192.168.13.230 -- renewal in 8735 seconds.
Feb 16 02:46:09 thirteen-230 kernel: kvm: 1994: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0xffffffffffffd8f0
Feb 16 02:46:09 thirteen-230 kernel: kvm: 1994: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
Feb 16 03:53:39 thirteen-230 kernel: kvm: 2161: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0xffffffffffffd8f0
Feb 16 03:53:39 thirteen-230 kernel: kvm: 2161: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
Feb 16 04:30:30 thirteen-230 dhclient[1272]: DHCPREQUEST on br0 to 192.168.5.58 port 67 (xid=0x48d081b6)
Feb 16 04:30:30 thirteen-230 dhclient[1272]: DHCPACK from 192.168.5.58 (xid=0x48d081b6)
Feb 16 04:30:31 thirteen-230 dhclient[1272]: bound to 192.168.13.230 -- renewal in 9224 seconds.
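
To check whether anything other than the kvm notices shows up in the hours before a panic, the syslog can be filtered down to kernel lines only. A minimal Python sketch, assuming the classic syslog layout shown in the excerpt above; parse_line and kernel_messages are hypothetical helper names, not part of any existing tool, and the default of skipping "kvm:" lines is just an assumption for this case:

```python
import re

# Matches classic syslog lines such as:
#   Feb 16 02:46:09 thirteen-230 kernel: kvm: 1994: cpu0 ...
#   Feb 16 02:04:54 thirteen-230 dhclient[1272]: DHCPACK from ...
SYSLOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+"   # month, day, time
    r"(?P<host>\S+)\s+"                              # hostname
    r"(?P<tag>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?:\s*"   # program name, optional [pid]
    r"(?P<msg>.*)$"                                  # rest of the line
)


def parse_line(line):
    """Return a dict of syslog fields, or None if the line does not match."""
    m = SYSLOG_RE.match(line.rstrip("\n"))
    return m.groupdict() if m else None


def kernel_messages(lines, ignore_prefixes=("kvm:",)):
    """Yield (stamp, msg) for kernel lines whose message does not start
    with one of ignore_prefixes (hypothetical default: the kvm notices)."""
    for line in lines:
        rec = parse_line(line)
        if rec is None or rec["tag"] != "kernel":
            continue
        if rec["msg"].startswith(tuple(ignore_prefixes)):
            continue
        yield rec["stamp"], rec["msg"]
```

Passing it the lines of the system log (for example an open handle on /var/log/messages, which is an assumed path) would leave only the non-kvm kernel messages, such as the mptscsih task aborts from the subject line, with their timestamps.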

> 
> >> What do you get for
> >> smartctl -x <dev>
> >
> > http://client.pdinc.us/smartctl-2000e86b62db27169cc9307358ebf10e.log
> 
> OK no smart extended test has been done, but also no pending bad or
> relocated sectors, and no phy event errors either. So the write (10)
> error seems isolated but it's still really suspicious, so I'd start
> replacing hardware.

A Dell tech is en route with a new system board and disk controller.

> 
> 
> > I have replaced the drive (and reinstalled) already; the panics still
> > happen once every 30-40 hours.
> 
> The only thing that suggests it might not be hardware are all the kvm
> related messages in the kp.

How so? Every result I find says these messages can be ignored.

> So if you've changed kernels, or VM
> configuration recently, then I'd revert. That's the limit of the most

No changes since the out-of-the-box install.

> likely software explanation. If there's no recent software changes,
> then it must be hardware.
> 

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-                                                               -
- Jason Pyeron                      PD Inc. http://www.pdinc.us -
- Principal Consultant              10 West 24th Street #100    -
- +1 (443) 269-1555 x333            Baltimore, Maryland 21218   -
-                                                               -
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
This message is copyright PD Inc, subject to license 20080407P00.