On 30-03-2016 06:46, Götz Reinicke - IT Koordinator wrote:
> On 29.03.16 at 13:57, Marcelo Ricardo Leitner wrote:
>> On 29-03-2016 03:46, Götz Reinicke - IT Koordinator wrote:
>>> On 28.03.16 at 16:23, Marcelo Ricardo Leitner wrote:
>>>> On 28-03-2016 06:27, Götz Reinicke wrote:
>>>>> Hi,
>>>>>
>>>>> maybe someone has an idea:
>>>>>
>>>>> We have three Supermicro servers with two 10Gb ports each, connected
>>>>> to 1Gb ports on a Cisco switch stack. All are on auto speed.
>>>>>
>>>>> I configured an LACP bond on both sides on all servers, first with
>>>>> Citrix XenServer.
>>>>>
>>>>> On one server eth0 goes down from time to time … maybe within minutes,
>>>>> sometimes it stays up for some hours.
>>>>>
>>>>> Two servers are fine; the bond has been up for 24 days(!) now without
>>>>> any problem.
>>>>>
>>>>> Recently I installed CentOS 7.2 on the server in question and - bam -
>>>>> eth0 is going down from time to time …
>>>>>
>>>>> I checked patch cables, tried another switch port channel,
>>>>> reconfigured the ports, reinstalled the OS. Same behavior.
>>>>>
>>>>> And: We got a replacement server. Same behavior …. :)
>>>>>
>>>>> Currently the Cisco tech guys don't see a problem on the switch (which
>>>>> has been up for 3 years now with 10+ servers connected … no problem so
>>>>> far), and from the Citrix side I don't get many more hints.
>>>>>
>>>>> In the logs I just have a NIC Link is Down … NIC Link is Up. It is
>>>>> always eth0.
>>>>>
>>>>> Question:
>>>>>
>>>>> Any idea? One suggestion was to disable all power saving features in
>>>>> the server BIOS. I did not do that yet.
>>>>>
>>>>> Is there any chance to set some sort of higher debug level for that
>>>>> NIC/kernel/whatever to get some feedback on the server OS side about
>>>>> why the port goes down?
>>>>>
>>>>> Regards and thanks for any hint!
>>>>> Götz
>>>>
>>>> If you are seeing NIC Link is Down as in:
>>>> [710442.668059] e1000e: enp0s25 NIC Link is Down
>>>> then the NIC lost its link and the bond is just protecting you, as you
>>>> probably didn't have any downtime due to that. IOW bonding is not the
>>>> issue.
>>>>
>>>> Which NIC do you have on those servers?
>>>
>>>
>>> The mainboard is a Supermicro X10DRI-T with Intel X540 dual-port
>>> 10GBase-T.
>>
>> Okay, it's probably using the ixgbe driver then.
>> You may consider testing a newer kernel and see how that goes,
>> before doing too much debugging.
>> You can install v4.5 using one of ELrepo's kernels at
>> http://elrepo.org/linux/kernel/el7/x86_64/RPMS/
>> http://elrepo.org/tiki/tiki-index.php
>> There are some changes between 7.2 and that kernel that would be good
>> to test.
>>
>> Or... enable ixgbe debug, module param debug=16, and send the dmesg log,
>> especially the lines around the event.
>
> Hm, could you give me a hint how to enable that (at runtime) for
> CentOS 7.2? I can't figure that out.
>
> Would be nice. Cheers,
> Götz

Ah, during runtime you can just use ethtool:
# ethtool -s eth0 msglvl 0xffff
when done, revert with:
# ethtool -s eth0 msglvl 0x7

Marcelo
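
To go with the msglvl change, here is a minimal sketch for pulling only the link state changes out of the kernel log once the extra messages are switched on. The function names link_events and count_link_down are made up for illustration, and the pattern assumes the driver words its messages like the "NIC Link is Down" line quoted earlier in this thread; adjust it if ixgbe logs something different on your box.

```shell
#!/bin/sh
# Sketch only: helpers for reading link flaps out of a kernel log.
# link_events and count_link_down are hypothetical names, not part of
# ethtool or dmesg. The pattern is an assumption based on the
# "NIC Link is Down" line quoted in this thread.

# Keep only the link up/down lines from the log text on stdin.
link_events() {
    grep -Ei 'nic link is (up|down)'
}

# Print how many "link down" lines appear in the log text on stdin.
# grep exits non-zero when nothing matches, so force a zero exit status.
count_link_down() {
    grep -ci 'nic link is down' || true
}

# Possible use as root, with eth0 as in this thread:
#   ethtool -s eth0 msglvl 0xffff    # raise the driver message level
#   ... wait for the link to drop ...
#   dmesg -T | link_events           # link changes with readable timestamps
#   dmesg | count_link_down          # number of drops still in the ring buffer
#   ethtool -s eth0 msglvl 0x7       # revert when done
```

The kernel ring buffer is limited in size, so after a long wait older events may already have been overwritten; reading the persisted kernel log instead of dmesg avoids that.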