Information: 5.4 kernel (2.6.18-164.el5).<div><br></div><div>I have a vmcore (from kdump); if the developers are interested, let me know where to upload it.</div><div><br></div><div>I used the crash command to get a backtrace.</div>
<div><br></div><div>I managed to get machines with later 5.4 and 5.5 kernels to panic the same way. Machines with Broadcom or Intel NICs panic the same way.</div><div><br></div><div><div>This is an NFS client whose NFS server was restarted several times; the mount is NFSv3 with options defaults,noatime.</div>
<div>The client was busy writing to the NFS-mounted space while the NFS server was restarting several times.</div></div><div>So far, when I mount with the udp option, I have not managed to panic the machines. </div><div>
The bad news is that NFSv4 is strictly TCP, if I were to go down that route.</div><div><br></div><div><div>From the backtrace, the crash appears to be TCP-related: CPU 2 faulted in pskb_copy, reached via tcp_write_timer, tcp_retransmit_skb, and tcp_transmit_skb. I'll be trying a couple of Linux TCP settings changes.</div>
<div>It is possible that the issue is with TCP in general (not NFS).</div><div>I would like to enlist the community's help in understanding this further and in finding potential workarounds for this TCP issue.</div><div>
<br></div></div><div><div>crash> sys</div><div> KERNEL: vmlinux</div><div> DUMPFILE: vmcore</div><div> CPUS: 4</div><div> DATE: Tue Apr 20 15:04:09 2010</div><div> UPTIME: 18:55:25</div><div>
LOAD AVERAGE: 0.13, 0.09, 0.03</div><div> TASKS: 340</div><div> RELEASE: 2.6.18-164.el5</div><div> VERSION: #1 SMP Thu Sep 3 03:28:30 EDT 2009</div><div> MACHINE: x86_64 (2660 Mhz)</div><div> MEMORY: 23.6 GB</div>
<div> PANIC: "Oops: 0000 [1] SMP " (check log for details)</div><div>crash> bt -a</div><div>PID: 0 TASK: ffffffff802ffae0 CPU: 0 COMMAND: "swapper"</div><div> #0 [ffffffff8043ef20] crash_nmi_callback at ffffffff8007a3bf</div>
<div> #1 [ffffffff8043ef40] do_nmi at ffffffff8006585a</div><div> #2 [ffffffff8043ef50] nmi at ffffffff80064ebf</div><div> [exception RIP: acpi_processor_idle+579]</div><div> RIP: ffffffff8019765e RSP: ffffffff803f1f48 RFLAGS: 00000093</div>
<div> RAX: 000000000073111a RBX: 000000000073111a RCX: 0000000000000808</div><div> RDX: 0000000000000815 RSI: 0000000000000003 RDI: 0000000000000000</div><div> RBP: ffff81063e480100 R8: ffffffff803f0000 R9: ffffffff804b5e2c</div>
<div> R10: 0000000000000046 R11: 0000000000000046 R12: 0000000000000000</div><div> R13: ffff81063e480000 R14: 0000000000000000 R15: 0000000000000000</div><div> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018</div>
<div>--- <exception stack> ---</div><div> #3 [ffffffff803f1f48] acpi_processor_idle at ffffffff8019765e</div><div> #4 [ffffffff803f1f90] cpu_idle at ffffffff8004939e</div><div>PID: 0 TASK: ffff810115f11100 CPU: 1 COMMAND: "swapper"</div>
<div> #0 [ffff810115f38f20] crash_nmi_callback at ffffffff8007a3bf</div><div> #1 [ffff810115f38f40] do_nmi at ffffffff8006585a</div><div> #2 [ffff810115f38f50] nmi at ffffffff80064ebf</div><div> [exception RIP: acpi_processor_idle+579]</div>
<div> RIP: ffffffff8019765e RSP: ffff810115f2fea8 RFLAGS: 00000093</div><div> RAX: 0000000000731145 RBX: 0000000000731145 RCX: 0000000000000808</div><div> RDX: 0000000000000815 RSI: 0000000000000003 RDI: 0000000000000000</div>
<div> RBP: ffff81063f173900 R8: ffff810115f2e000 R9: ffffffff804b5e2c</div><div> R10: 0000000000000046 R11: 0000000000000046 R12: 00000000000000ff</div><div> R13: ffff81063f173800 R14: 0000000000000100 R15: ffffffff803ea280</div>
<div> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018</div><div>--- <exception stack> ---</div><div> #3 [ffff810115f2fea8] acpi_processor_idle at ffffffff8019765e</div><div> #4 [ffff810115f2fef0] cpu_idle at ffffffff8004939e</div>
<div>PID: 0 TASK: ffff810115f20080 CPU: 2 COMMAND: "swapper"</div><div> #0 [ffff810115f6bbc0] crash_kexec at ffffffff800ac5b9</div><div> #1 [ffff810115f6bc80] __die at ffffffff80065127</div><div> #2 [ffff810115f6bcc0] do_page_fault at ffffffff80066da7</div>
<div> #3 [ffff810115f6bdb0] error_exit at ffffffff8005dde9</div><div> [exception RIP: pskb_copy+307]</div><div> RIP: ffffffff8022486b RSP: ffff810115f6be60 RFLAGS: 00010282</div><div> RAX: ffff81062cd5f540 RBX: ffff81062cac3980 RCX: ffff81046fb1e550</div>
<div> RDX: 0000000000000000 RSI: ffff81062cd5f550 RDI: 0000000000000004</div><div> RBP: ffff810466f54a80 R8: 00000000081f02b4 R9: 0000000000000000</div><div> R10: ffff81062cac3980 R11: 00000000000000c8 R12: 0000000000000220</div>
<div> R13: ffff810466f54a80 R14: 0000000000000002 R15: ffffffff803ea2a0</div><div> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018</div><div> #4 [ffff810115f6be78] tcp_transmit_skb at ffffffff800217b7</div><div> #5 [ffff810115f6bec8] tcp_retransmit_skb at ffffffff80250ccd</div>
<div> #6 [ffff810115f6bf08] tcp_write_timer at ffffffff80252652</div><div> #7 [ffff810115f6bf28] run_timer_softirq at ffffffff800968be</div><div> #8 [ffff810115f6bf58] __do_softirq at ffffffff8001235a</div><div> #9 [ffff810115f6bf88] call_softirq at ffffffff8005e2fc</div>
<div>#10 [ffff810115f6bfa0] do_softirq at ffffffff8006cb14</div><div>#11 [ffff810115f6bfb0] apic_timer_interrupt at ffffffff8005dc8e</div><div>--- <IRQ stack> ---</div><div>#12 [ffff810115f67df8] apic_timer_interrupt at ffffffff8005dc8e</div>
<div> [exception RIP: acpi_processor_idle+628]</div><div> RIP: ffffffff8019768f RSP: ffff810115f67ea8 RFLAGS: 00000282</div><div> RAX: ffff810115f67fd8 RBX: ffff81063f173100 RCX: 0000000080184973</div><div> RDX: ffff81063f173000 RSI: 0000000000000082 RDI: ffffffff804b5e2c</div>
<div> RBP: ffff810115f67ee8 R8: ffff810115f66000 R9: ffff810115f67ecc</div><div> R10: 0000000000000046 R11: ffff810115f67ee8 R12: ffff81063f6e1180</div><div> R13: 0000000010008040 R14: ffff81063f6e1180 R15: ffff81063f6e1180</div>
<div> ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018</div><div>#13 [ffff810115f67ea0] acpi_processor_idle at ffffffff80197685</div><div>#14 [ffff810115f67ef0] cpu_idle at ffffffff8004939e</div><div>PID: 0 TASK: ffff810115f94100 CPU: 3 COMMAND: "swapper"</div>
<div> #0 [ffff810115fbbf20] crash_nmi_callback at ffffffff8007a3bf</div><div> #1 [ffff810115fbbf40] do_nmi at ffffffff8006585a</div><div> #2 [ffff810115fbbf50] nmi at ffffffff80064ebf</div><div> [exception RIP: acpi_processor_idle+579]</div>
<div> RIP: ffffffff8019765e RSP: ffff810115fb9ea8 RFLAGS: 00000097</div><div> RAX: 0000000000731169 RBX: 0000000000731169 RCX: 0000000000000808</div><div> RDX: 0000000000000815 RSI: 0000000000000003 RDI: 0000000000000000</div>
<div> RBP: ffff81063f174900 R8: ffff810115fb8000 R9: ffff810115f942f0</div><div> R10: 0000000000000046 R11: 0000000000000046 R12: 00000000000000ff</div><div> R13: ffff81063f174800 R14: 0000000000000300 R15: ffffffff803ea2c0</div>
<div> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018</div><div>--- <exception stack> ---</div><div> #3 [ffff810115fb9ea8] acpi_processor_idle at ffffffff8019765e</div><div> #4 [ffff810115fb9ef0] cpu_idle at ffffffff8004939e</div>
<div>crash> quit</div><div><br></div></div>
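<div><br></div><div>For anyone comparing several vmcores, here is a rough sketch of how the "bt -a" output above could be summarized automatically. It is only an illustration: the function names (parse_bt, panicking_cpus) are made up for this example, the embedded text is an abbreviated copy of the backtrace above with the register dumps removed, and the rule "the CPU whose stack contains crash_kexec is the one that panicked" is a heuristic I am assuming from this dump, not something guaranteed by the crash utility.</div>

```python
import re

# Abbreviated excerpt of the `bt -a` output above (register dumps omitted).
BT_OUTPUT = """\
PID: 0 TASK: ffffffff802ffae0 CPU: 0 COMMAND: "swapper"
 #0 [ffffffff8043ef20] crash_nmi_callback at ffffffff8007a3bf
 #1 [ffffffff8043ef40] do_nmi at ffffffff8006585a
 #2 [ffffffff8043ef50] nmi at ffffffff80064ebf
 #3 [ffffffff803f1f48] acpi_processor_idle at ffffffff8019765e
 #4 [ffffffff803f1f90] cpu_idle at ffffffff8004939e
PID: 0 TASK: ffff810115f20080 CPU: 2 COMMAND: "swapper"
 #0 [ffff810115f6bbc0] crash_kexec at ffffffff800ac5b9
 #1 [ffff810115f6bc80] __die at ffffffff80065127
 #2 [ffff810115f6bcc0] do_page_fault at ffffffff80066da7
 #3 [ffff810115f6bdb0] error_exit at ffffffff8005dde9
 [exception RIP: pskb_copy+307]
 #4 [ffff810115f6be78] tcp_transmit_skb at ffffffff800217b7
 #5 [ffff810115f6bec8] tcp_retransmit_skb at ffffffff80250ccd
 #6 [ffff810115f6bf08] tcp_write_timer at ffffffff80252652
 #7 [ffff810115f6bf28] run_timer_softirq at ffffffff800968be
"""

# One header line per task, e.g. 'PID: 0 TASK: ... CPU: 2 COMMAND: "swapper"'.
HEADER_RE = re.compile(r"^PID:\s*\d+\s+TASK:\s*\w+\s+CPU:\s*(\d+)")
# One stack frame, e.g. ' #4 [ffff810115f6be78] tcp_transmit_skb at ffffffff800217b7'.
FRAME_RE = re.compile(r"^\s*#\d+\s+\[\w+\]\s+(\S+)\s+at\s+\w+")
# Faulting instruction marker, e.g. ' [exception RIP: pskb_copy+307]'.
RIP_RE = re.compile(r"\[exception RIP: (\S+)\]")


def parse_bt(text):
    """Group frame names and exception RIPs by CPU number."""
    cpus = {}
    current = None
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = cpus.setdefault(int(header.group(1)),
                                      {"frames": [], "rips": []})
            continue
        if current is None:
            continue  # ignore anything before the first PID header
        frame = FRAME_RE.match(line)
        if frame:
            current["frames"].append(frame.group(1))
            continue
        rip = RIP_RE.search(line)
        if rip:
            current["rips"].append(rip.group(1))
    return cpus


def panicking_cpus(cpus):
    """Heuristic (assumption): the panicking CPU is the one that ran crash_kexec."""
    return [cpu for cpu, info in cpus.items() if "crash_kexec" in info["frames"]]


if __name__ == "__main__":
    parsed = parse_bt(BT_OUTPUT)
    for cpu in panicking_cpus(parsed):
        info = parsed[cpu]
        tcp_frames = [name for name in info["frames"] if name.startswith("tcp_")]
        print("CPU", cpu, "faulted at", info["rips"], "with TCP frames", tcp_frames)
```

<div>To use it on a real dump, the "bt -a" output would be saved to a file and passed to parse_bt() in place of the embedded excerpt. The other CPUs only show crash_nmi_callback because they were stopped by the NMI that kdump sends after the panic.</div>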