On 29-03-2016 03:46, Götz Reinicke - IT Koordinator wrote:
> On 28.03.16 at 16:23, Marcelo Ricardo Leitner wrote:
>> On 28-03-2016 06:27, Götz Reinicke wrote:
>>> Hi,
>>>
>>> maybe someone has an idea:
>>>
>>> We have three Supermicro servers with two 10Gb ports each, connected
>>> to a Cisco switch stack 1Gb ports. All are on auto speed.
>>>
>>> I configured a LACP bond on both sides on all servers, first with
>>> Citrix XenServer.
>>>
>>> On one server eth0 goes down from time to time … maybe within minutes,
>>> some days it is up for some hours.
>>>
>>> Two servers are fine; the bond is up for 24 days(!) now without any
>>> problem.
>>>
>>> Recently I installed CentOS 7.2 on the server in question and - bam -
>>> eth0 is going down from time to time …
>>>
>>> I checked patch cables, tried another switch port channel,
>>> reconfigured the ports, reinstalled the OS. Same behavior.
>>>
>>> And: We got a replacement server. Same behavior …. :)
>>>
>>> Currently the Cisco tech guys don’t see a problem on the switch (which
>>> is up for 3 years now with 10+ servers connected … no problem so far),
>>> from the Citrix side I don’t get many more hints.
>>>
>>> In the logs I just have a Nic Link is Down … Nic Link is Up. It is
>>> always eth0.
>>>
>>> Question:
>>>
>>> Any idea? One suggestion was to disable all power saving features in the
>>> server BIOS. Did not do that yet.
>>>
>>> Is there any chance to set some sort of higher debug level for that
>>> nic/kernel/whatever to get some server OS side feedback why the port
>>> goes down?
>>>
>>> Regards and thanks for any hint! . Götz
>>
>> If you are seeing NIC Link is Down as in:
>> [710442.668059] e1000e: enp0s25 NIC Link is Down
>> then the NIC lost its link and bond is just protecting you as you
>> probably didn't have any downtime due to that. IOW bonding is not the
>> issue.
>>
>> Which NIC do you have on those servers?
>
>
> The mainboard is a Supermicro X10DRI-T with Intel X540 dual port 10GBase-T.

Okay, it's probably using the ixgbe driver then. You may consider testing a
newer kernel and seeing how that goes before doing too much debugging. You
can install v4.5 using one of ELRepo's kernels at
http://elrepo.org/linux/kernel/el7/x86_64/RPMS/
http://elrepo.org/tiki/tiki-index.php
There are some changes between 7.2 and that kernel that would be worth
testing.

Or... enable ixgbe debug, module param debug=16, and send the dmesg log,
especially the lines around the event.
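
If it helps with trimming the log down to the lines around the event, something like the following could work. This is only a rough sketch: the function name and the default of 5 context lines are made up for illustration, and it assumes the link messages contain the text "NIC Link is Down" or "NIC Link is Up" as in the e1000e example quoted above.

```python
import re

# Matches link-state messages such as "e1000e: enp0s25 NIC Link is Down".
# Assumption: the driver in use logs the same wording.
LINK_EVENT = re.compile(r"NIC Link is (Down|Up)", re.IGNORECASE)


def lines_around_events(lines, context=5):
    """Return the log lines within `context` lines of any link up/down message.

    `lines` is a list of strings (one per dmesg line). Overlapping windows
    are merged, and the original order is preserved.
    """
    keep = set()
    for i, line in enumerate(lines):
        if LINK_EVENT.search(line):
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            keep.update(range(start, end))
    return [lines[i] for i in sorted(keep)]
```

For example, after saving the dmesg output to a file (the name dmesg.txt is just a placeholder), `lines_around_events(open("dmesg.txt").read().splitlines(), context=10)` would return each link event together with the 10 lines before and after it.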