[CentOS-virt] Crash in network stack under Xen

Mon Nov 13 07:52:45 UTC 2017
Pasi Kärkkäinen <pasik at iki.fi>

On Thu, Nov 09, 2017 at 01:36:44PM -0800, Sarah Newman wrote:
> Hi,
> 
> We had a potentially network-related crash on a dom0 running Linux 4.9.39 / Xen 4.8. As of today I can't find any fixes in stable/linux-4.9.y,
> xen/staging-4.8, or CPU microcode updates that look like a smoking gun. I can't rule out that it's Xen-related. The backtraces are:
> 
>  ------------[ cut here ]------------
>  WARNING: CPU: 0 PID: 0 at net/ipv4/af_inet.c:1473 inet_gro_complete+0xbb/0xd0
>

Did you try tweaking the network settings, for example disabling GRO on the network interface in question, to see if that changes anything?



Thanks,

-- Pasi


> 
> It looks to me like in the first backtrace, the check on the result of this lookup in inet_gro_complete failed:
> 
> ops = rcu_dereference(inet_offloads[proto]);
> 
> I'm guessing this means the packet didn't have a valid layer 4 protocol, or that we don't have that protocol enabled. Then, while attempting to
> handle that failure, there was a GPF, I believe from accessing invalid data in shinfo->frag_list. "skb_release_data+0x73" is in __read_once_size, which
> I think is generated by "kfree_skb: if (likely(atomic_read(&skb->users) == 1))".
> 
> --Sarah
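
To make the failure mode described above easier to follow, here is a rough userspace model of a per-protocol offload table lookup followed by a NULL check. It is not the kernel code. It assumes the warning fires when the looked-up entry, or its gro_complete callback, is missing. The names offload_ops, model_offloads, and model_gro_complete are invented for this sketch, and only TCP (protocol 6) is registered in the model.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_MAX_PROTOS 256    /* one slot per 8-bit IP protocol number */

/* Hypothetical stand-in for a per-protocol offload descriptor. */
struct offload_ops {
    int (*gro_complete)(int nhoff);
};

/* Stub handler: pretend completion always succeeds. */
static int tcp_gro_complete_stub(int nhoff)
{
    (void)nhoff;
    return 0;
}

static const struct offload_ops tcp_ops = {
    .gro_complete = tcp_gro_complete_stub,
};

/* Only protocol 6 (TCP) has a handler in this model; every other slot is NULL. */
static const struct offload_ops *model_offloads[MODEL_MAX_PROTOS] = {
    [6] = &tcp_ops,
};

/* Look up the handler for proto and refuse to continue if it is missing.
 * Returns the handler's result, or -ENOSYS when no handler is registered. */
int model_gro_complete(unsigned char proto, int nhoff)
{
    const struct offload_ops *ops = model_offloads[proto];

    if (!ops || !ops->gro_complete) {
        fprintf(stderr, "WARNING: no gro_complete handler for protocol %u\n",
                (unsigned int)proto);
        return -ENOSYS;
    }
    return ops->gro_complete(nhoff);
}
```

In this model, model_gro_complete(6, 20) returns 0, while an unregistered protocol such as 47 takes the warning path and returns -ENOSYS. The model says nothing about the later GPF.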