[CentOS-virt] NIC Stability Problems Under Xen 4.4 / CentOS 6 / Linux 3.18

Tue Jan 31 22:14:42 UTC 2017
Adi Pircalabu <adi at ddns.com.au>

On 31/01/17 21:00, Jinesh Choksi wrote:
> On 30 January 2017 at 22:17, Adi Pircalabu wrote:
> 
>     May I chip in here? In our environment we're randomly seeing:
> 
>     Jan 17 23:40:14 xen01 kernel: ixgbe 0000:04:00.1 eth6: Detected Tx
>     Unit Hang
> 
> 
> Someone in this thread: https://sourceforge.net/p/e1000/bugs/530/#2855
> reported that "With these kernels I was only able to work around the
> issue by disabling tx-checksumming offload with ethtool."
> 
> However, that was reported for kernels 4.2.6, 4.2.8, 4.4.8 and 4.4.10.
> I mention it only because it is something you could rule out:
> 
> ethtool --offload eth6 rx off tx off
> 
> 
> Another thing to rule out in case it's a regression with Intel NICs and TSO:
> 
> # tso => tcp-segmentation-offload
> # gso => generic-segmentation-offload
> # gro => generic-receive-offload
> # sg => scatter-gather
> # ufo => udp-fragmentation-offload (Cannot change)
> # lro => large-receive-offload (Cannot change)
> 
> ethtool -K eth6 tso off gso off gro off sg off

Nice, useful information. I've just disabled tx and rx checksumming on all
the 10Gb interfaces on the affected servers; we'll see how it goes. But as I
said yesterday, in our environment it takes months to reproduce.
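
For anyone wanting to try the same workaround on several ports at once, a
minimal sketch might look like the following. The interface names (eth6,
eth7), the function name and the APPLY switch are placeholders for
illustration, not what is actually deployed here. By default it only prints
the ethtool commands, so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: disable rx/tx checksum offload on each interface given as an argument.
# Assumptions: interface names are hypothetical; APPLY=1 runs ethtool for real
# (requires root), otherwise the commands are only printed (dry run).
disable_csum_offload() {
    for ifc in "$@"; do
        if [ "${APPLY:-0}" = "1" ]; then
            # -K is the short form of --offload
            ethtool -K "$ifc" rx off tx off
        else
            echo "ethtool -K $ifc rx off tx off"
        fi
    done
}

# Preview the commands for two hypothetical 10Gb ports.
disable_csum_offload eth6 eth7
```

The current offload state can be checked afterwards with "ethtool -k eth6"
(lowercase k). Settings changed this way do not survive a reboot, so they
would need to be reapplied from the network configuration if the workaround
turns out to help.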

Thanks,

Adi Pircalabu