Hi all,
We run kernel 2.6.32-358.14.1.x86_64, and recently dozens of machines
running it panicked. They had been stable for a long time and the problem
appeared suddenly, so I'm not sure whether an upgrade caused it. Here's the
backtrace I got:
PID: 8136 TASK: ffff8803341aead0 CPU: 2 COMMAND: ""
#0 [ffff880028283610] panic at ffffffff815286b8
#1 [ffff880028283690] oops_end at ffffffff8152c8a2
#2 [ffff8800282836c0] no_context at ffffffff81046c1b
#3 [ffff880028283710] __bad_area_nosemaphore at ffffffff81046ea5
#4 [ffff880028283760] bad_area_nosemaphore at ffffffff81046f73
#5 [ffff880028283770] __do_page_fault at ffffffff810476d1
#6 [ffff880028283890] do_page_fault at ffffffff8152e7be
#7 [ffff8800282838c0] page_fault at ffffffff8152bb75
[exception RIP: tcp_fastretrans_alert+2754]
RIP: ffffffff814aed62 RSP: ffff880028283970 RFLAGS: 00010246
RAX: 0000000000000002 RBX: ffff88003d22c940 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000000
RBP: ffff8800282839b0 R8: 000000018033a9ac R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000d03 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#8 [ffff8800282839b8] tcp_ack at ffffffff814afb2c
#9 [ffff880028283a88] tcp_rcv_state_process at ffffffff814b1128
#10 [ffff880028283b18] tcp_v4_do_rcv at ffffffff814b94f0
#11 [ffff880028283bb8] tcp_v4_rcv at ffffffff814baf9a
#12 [ffff880028283c48] ip_local_deliver_finish at ffffffff8149648d
#13 [ffff880028283c78] ip_local_deliver at ffffffff81496718
#14 [ffff880028283ca8] ip_rcv_finish at ffffffff81495bbd
#15 [ffff880028283ce8] ip_rcv at ffffffff81496155
#16 [ffff880028283d28] __netif_receive_skb at ffffffff8145db5b
#17 [ffff880028283d88] netif_receive_skb at ffffffff814621b8
#18 [ffff880028283dc8] virtnet_poll at ffffffffa0130565 [virtio_net]
#19 [ffff880028283e68] net_rx_action at ffffffff81463193
#20 [ffff880028283ec8] __do_softirq at ffffffff81078c71
#21 [ffff880028283f38] call_softirq at ffffffff8100c1cc
#22 [ffff880028283f50] do_softirq at ffffffff8100de05
#23 [ffff880028283f70] irq_exit at ffffffff81078a55
#24 [ffff880028283f80] do_IRQ at ffffffff81532365
--- <IRQ stack> ---
#25 [ffff88001e851f58] ret_from_intr at ffffffff8100b9d3
RIP: 00007fa080e1a538 RSP: 00007fa0781ec960 RFLAGS: 00000206
RAX: 0000000000000001 RBX: 00007fa0781ec9a0 RCX: 000000000001ef8c
RDX: 0000000000001000 RSI: 0000000000000006 RDI: 00007fa07c093df8
RBP: ffffffff8100b9ce R8: 0000000000000006 R9: 0000000004000001
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fa0710a18f0 R14: 0000000000000120 R15: 0000000000001000
ORIG_RAX: ffffffffffffff8e CS: 0033 SS: 002b
Disassembling tcp_fastretrans_alert+2754 gives:
0xffffffff814aed62 <tcp_fastretrans_alert+2754>: sub 0x58(%rdi),%r8d
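Reading the faulting instruction, sub 0x58(%rdi),%r8d, against the register
dump: RDI is 0 in the exception frame, so the memory operand resolves to
address 0x58. That is consistent with a NULL pointer dereference of a
structure member at offset 0x58, though which structure and member it is
cannot be told from the trace alone. A minimal sketch of that arithmetic
follows; the helper names (effective_address, looks_like_null_deref) and the
one-page threshold are illustrative assumptions, not part of any crash
tooling:

```python
import re

# Register values copied from the exception frame in the backtrace above.
REGS = {"rdi": 0x0000000000000000, "r8": 0x000000018033A9AC}


def effective_address(operand, regs):
    """Resolve a simple AT&T-syntax 'disp(%base)' memory operand.

    Hypothetical helper: handles only base + displacement, which is all
    the faulting instruction here needs.
    """
    m = re.fullmatch(r"(-?0x[0-9a-fA-F]+)?\(%(\w+)\)", operand)
    if m is None:
        raise ValueError(f"unsupported operand: {operand!r}")
    disp = int(m.group(1), 16) if m.group(1) else 0
    # Wrap to 64 bits, as the CPU would.
    return (regs[m.group(2)] + disp) & 0xFFFFFFFFFFFFFFFF


def looks_like_null_deref(addr, page_size=4096):
    """Assumed heuristic: an address inside the first page suggests NULL + offset."""
    return addr < page_size


addr = effective_address("0x58(%rdi)", REGS)
print(hex(addr), looks_like_null_deref(addr))  # prints: 0x58 True
```

The same check can be applied to the register dumps from the other panicked
machines to see whether they all fault at the same instruction and offset.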
I know this kernel is fairly old, but these machines are in production, so
I can't simply upgrade them all to check whether the old version is at
fault. I'd appreciate advice on how to debug this, or a pointer to an
existing bug report.
Thanks.