On 01/24/2017 11:16 AM, Kevin Stange wrote:
> On 01/24/2017 09:10 AM, Konrad Rzeszutek Wilk wrote:
>> On Tue, Jan 24, 2017 at 09:29:39PM +0800, -=X.L.O.R.D=- wrote:
>>> Kevin Stange,
>>> It can be either kernel or update the NIC driver or firmware of the NIC
>>> card. Hope that helps!
>>>
>>> Xlord
>>> -----Original Message-----
>>> From: CentOS-virt [mailto:centos-virt-bounces at centos.org] On Behalf Of Kevin
>>> Stange
>>> Sent: Tuesday, January 24, 2017 1:04 AM
>>> To: centos-virt at centos.org
>>> Subject: [CentOS-virt] NIC Stability Problems Under Xen 4.4 / CentOS 6 /
>>> Linux 3.18
>>>
> <snip>
>>>
>>> Has anyone experienced similar issues with this configuration, and if so,
>>> does anyone have tips on how to resolve the issues?
>>
>> Honestly I would email Intel and see if they can help. This looks like
>> the NIC decides something is wrong, throws off a PCIe error and
>> then resets itself.
>
> This happens for several different NICs. Is there a good contact at
> Intel for this kind of thing, or should I just try to reach them through
> their web site?
>
>> It could also be an error in the Linux stack which would "eat" an
>> interrupt when migrating interrupts (which was fixed
>> upstream, see below). Are you running irqbalance? Could you try
>> turning it off?
>
> irqbalance is enabled on these servers. I'll try disabling it.

I stopped irqbalance yesterday afternoon, but one hypervisor's NICs
failed anyway early this morning, so I'm pretty sure this is not the
right tree to bark up.

--
Kevin Stange
Chief Technology Officer
Steadfast | Managed Infrastructure, Datacenter and Cloud Services
800 S Wells, Suite 190 | Chicago, IL 60607
312.602.2689 X203 | Fax: 312.602.2688
kevin at steadfast.net | www.steadfast.net