On Thu, Nov 09, 2017 at 01:36:44PM -0800, Sarah Newman wrote:
Hi,
We had a potentially network-related crash on a dom0 running Linux 4.9.39 / Xen 4.8. As of today I can't find any fixes in stable/linux-4.9.y or xen/staging-4.8, or any CPU microcode updates, that look like a smoking gun. I can't rule out that it's Xen-related. The backtraces are:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at net/ipv4/af_inet.c:1473 inet_gro_complete+0xbb/0xd0
Did you try tweaking the network settings, or disabling GRO for the network interface in question, to see if that changes anything?
Thanks,
-- Pasi
It looks to me like in the first backtrace, this check from inet_gro_complete failed:
ops = rcu_dereference(inet_offloads[proto]);
I'm guessing that means the packet didn't have a valid layer 4 protocol definition, or that we don't have that protocol enabled. Then, while attempting to handle that failure, there was a GPF, I believe from accessing invalid data in shinfo->frag_list. "skb_release_data+0x73" is in __read_once_size, which I think is generated by this line in kfree_skb: "if (likely(atomic_read(&skb->users) == 1))".
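To illustrate the reasoning above, here is a minimal userspace sketch of a per-protocol offload table lookup. It is not the kernel code: the struct layout, the stub handler, register_offloads() and model_inet_gro_complete() are all hypothetical names made up for illustration, and it assumes the warning corresponds to the lookup finding no usable handler for the packet's protocol number.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's actual definitions. */
#define MODEL_MAX_PROTOS 256
#define MODEL_PROTO_TCP  6

struct model_offload {
    int (*gro_complete)(int nhoff);   /* per-protocol completion hook */
};

/* One slot per IP protocol number; unregistered slots stay NULL. */
static const struct model_offload *model_offloads[MODEL_MAX_PROTOS];

/* Stub handler standing in for a real layer 4 GRO completion routine. */
static int model_tcp_gro_complete(int nhoff)
{
    (void)nhoff;
    return 0;
}

static const struct model_offload model_tcp_offload = {
    .gro_complete = model_tcp_gro_complete,
};

/* Register a handler for TCP only, leaving every other slot empty. */
void register_offloads(void)
{
    model_offloads[MODEL_PROTO_TCP] = &model_tcp_offload;
}

/*
 * Look up the handler for the protocol number taken from the packet.
 * If the slot is empty (unknown or disabled protocol, or a corrupted
 * protocol field), report failure instead of calling through NULL.
 * Assumption: this is the point where the kernel would emit the warning.
 */
int model_inet_gro_complete(unsigned char proto, int nhoff)
{
    const struct model_offload *ops = model_offloads[proto];

    if (!ops || !ops->gro_complete)
        return -ENOSYS;

    return ops->gro_complete(nhoff);
}
```

In this model, after register_offloads(), model_inet_gro_complete(6, 0) succeeds while any unregistered protocol number returns -ENOSYS. The caller then has to clean up the packet, which is the path where I think the GPF happened.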
--Sarah