On 29.03.16 at 13:57, Marcelo Ricardo Leitner wrote:
> On 29-03-2016 03:46, Götz Reinicke - IT Koordinator wrote:
>> On 28.03.16 at 16:23, Marcelo Ricardo Leitner wrote:
>>> On 28-03-2016 06:27, Götz Reinicke wrote:
>>>> Hi,
>>>>
>>>> maybe someone has an idea:
>>>>
>>>> We have three Supermicro servers with two 10Gb ports each, connected
>>>> to 1Gb ports on a Cisco switch stack. All are on auto speed.
>>>>
>>>> I configured an LACP bond on both sides on all servers, first with
>>>> Citrix XenServer.
>>>>
>>>> On one server eth0 goes down from time to time … sometimes within
>>>> minutes, some days it stays up for a few hours.
>>>>
>>>> Two servers are fine; the bond has been up for 24 days(!) now without
>>>> any problem.
>>>>
>>>> Recently I installed CentOS 7.2 on the server in question and - bam -
>>>> eth0 is going down from time to time …
>>>>
>>>> I checked patch cables, tried another switch port channel,
>>>> reconfigured the ports, reinstalled the OS. Same behavior.
>>>>
>>>> And: We got a replacement server. Same behavior …. :)
>>>>
>>>> Currently the Cisco tech guys don’t see a problem on the switch (which
>>>> has been up for 3 years now with 10+ servers connected … no problem so
>>>> far), and from the Citrix side I don’t get many more hints.
>>>>
>>>> In the logs I just have "NIC Link is Down" … "NIC Link is Up". It is
>>>> always eth0.
>>>>
>>>> Question:
>>>>
>>>> Any ideas? One suggestion was to disable all power-saving features in
>>>> the server BIOS. I have not done that yet.
>>>>
>>>> Is there any chance to set some sort of higher debug level for that
>>>> NIC/kernel/whatever to get some feedback on the server OS side about
>>>> why the port goes down?
>>>>
>>>> Regards and thanks for any hint!
>>>> . Götz
>>>
>>> If you are seeing NIC Link is Down as in:
>>> [710442.668059] e1000e: enp0s25 NIC Link is Down
>>> then the NIC lost its link and bond is just protecting you as you
>>> probably didn't have any downtime due to that. IOW bonding is not the
>>> issue.
>>>
>>> Which NIC do you have on those servers?
>>
>>
>> The mainboard is a Supermicro X10DRI-T with Intel X540 dual-port
>> 10GBase-T.
>
> Okay, it's probably using the ixgbe driver then.
> You may consider testing a newer kernel and see how that goes,
> before doing too much debugging.
> You can install v4.5 using one of ELrepo's kernels at
> http://elrepo.org/linux/kernel/el7/x86_64/RPMS/
> http://elrepo.org/tiki/tiki-index.php
> There are some changes between 7.2 and that kernel that are worth
> testing.
>
> Or... enable ixgbe debug, module param debug=16, and send the dmesg log,
> especially the lines around the event.

Hm, could you give me a hint on how to enable that (at runtime) on CentOS 7.2?
I can't figure it out.

Would be nice. cheers . Götz