[CentOS-virt] NIC Stability Problems Under Xen 4.4 / CentOS 6 / Linux 3.18

Mon Jan 30 09:18:32 UTC 2017
Jinesh Choksi <jinesh.choksi at algomi.com>

>Are there other kernel options that might be useful to try?

pci=nomsi

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1521173/comments/13
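
Before and after rebooting with a new option, it can help to confirm which options the running kernel actually received. The following is a minimal sketch, not a tested tool: it assumes it is run in dom0, where /proc/cmdline reflects the dom0 kernel's parameters, and the helper names parse_cmdline and report are made up for illustration. It splits on whitespace only, so quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Bare flags such as "quiet" map to None. If a name appears more
    than once, the last occurrence wins (a simplification).
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def report(cmdline, wanted=(("pci", "nomsi"), ("pcie_aspm", "off"))):
    """Return {"name=value": bool} for each option we care about.

    Values are compared against comma-separated lists, so
    "pci=nomsi,noaer" still counts as having pci=nomsi set.
    """
    params = parse_cmdline(cmdline)
    result = {}
    for name, value in wanted:
        current = params.get(name) or ""
        result["%s=%s" % (name, value)] = value in current.split(",")
    return result


if __name__ == "__main__":
    # /proc/cmdline holds the command line of the running kernel.
    with open("/proc/cmdline") as fh:
        for option, present in sorted(report(fh.read()).items()):
            print("%s: %s" % (option, "present" if present else "missing"))
```

The default list only covers pci=nomsi and the pcie_aspm=off setting mentioned earlier in the thread; other options can be passed in through the wanted argument.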



On 27 January 2017 at 18:21, Kevin Stange <kevin at steadfast.net> wrote:

> On 01/27/2017 06:08 AM, Karel Hendrych wrote:
> > Have you tried to eliminate all power management features all over?
>
> I've been trying to find and disable all power management features but
> having relatively little luck with that solving the problems.  Stabbing
> in the dark, I've tried different ACPI settings, including completely
> disabling it, disabling CPU frequency scaling, and setting pcie_aspm=off
> on the kernel command line.  Are there other kernel options that might
> be useful to try?
>
> > Are the devices connected to the same network infrastructure?
>
> There are two onboard NICs and two NICs on a dual-port card in each
> server.  All devices connect to a Cisco switch pair in VSS and the links
> are paired in LACP.
>
> > There has to be something common.
>
> The NICs having issues are running a native VLAN, a tagged VLAN, iSCSI
> and NFS traffic, as well as some basic management stuff over SSH, and
> they are configured with an MTU of 9000 on the native VLAN.  It's a lot
> of features, but I can't really turn them off and then actually have
> enough load on the NICs to reproduce the issue.  Several of these
> servers were installed and being burned in for 3 months without ever
> having an issue, but suddenly collapsed when I tried to bring 20 or so
> real-world VMs up on them.
>
> The other NICs in the system that are connected don't exhibit issues and
> run only VM network interfaces.  They are also in LACP and running VLAN
> tags, but normal 1500 MTU.
>
> So far it seems to correlate with NICs on the expansion cards, but it's
> a coincidence that these cards are the ones with the storage and
> management traffic.  I'm trying to swap some of this load to the onboard
> NICs to see if the issues migrate over with it, or if they stay with the
> expansion cards.
>
> If the issue exists on both NIC types, then it rules out the specific
> NIC chipset as the culprit.  It could point to the driver, but upgrading
> it to a newer version did not help and actually appeared to make
> everything worse.  This issue might actually be more to do with the PCIe
> bridge than the NICs, but these are still different motherboards with
> different PCIe bridges (5520 vs C600) experiencing the same issues.
>
> > I've been using Intel NICs with Xen/CentOS for ages with no issues.
>
> I figured that must be so.  Everyone uses Intel NICs.  If this was a
> common issue, it would probably be causing a lot of people a lot of
> trouble.
>
> --
> Kevin Stange
> Chief Technology Officer
> Steadfast | Managed Infrastructure, Datacenter and Cloud Services
> 800 S Wells, Suite 190 | Chicago, IL 60607
> 312.602.2689 X203 | Fax: 312.602.2688
> kevin at steadfast.net | www.steadfast.net