Information: 5.4 kernel (2.6.18-164.el5). I have a vmcore (from kdump); if the developers are interested, let me know where to upload it. I used the crash command to get a backtrace.

I have managed to get machines with later 5.4 and 5.5 kernels to panic the same way. The panic occurs with both Broadcom and Intel NICs.

This is an NFS client whose NFS server was restarting several times. The mount is NFSv3 with options defaults,noatime. The client was busy writing to NFS-mounted space while the NFS server was restarting.

So far, when I mount with the udp option, I have not managed to panic the machines. The bad news is that NFSv4 is strictly TCP, if I were to go down that route.

From the backtrace, the crash seems to be TCP-related: CPU 2 took the page fault in pskb_copy, called from tcp_transmit_skb via tcp_retransmit_skb and tcp_write_timer. I'll be trying a couple of Linux TCP settings changes. It is possible that the issue is with TCP in general (not NFS).

I would like to enlist the community's help in further understanding this and in finding potential workarounds for this TCP issue.

crash> sys
      KERNEL: vmlinux
    DUMPFILE: vmcore
        CPUS: 4
        DATE: Tue Apr 20 15:04:09 2010
      UPTIME: 18:55:25
LOAD AVERAGE: 0.13, 0.09, 0.03
       TASKS: 340
     RELEASE: 2.6.18-164.el5
     VERSION: #1 SMP Thu Sep 3 03:28:30 EDT 2009
     MACHINE: x86_64 (2660 Mhz)
      MEMORY: 23.6 GB
       PANIC: "Oops: 0000 [1] SMP " (check log for details)

crash> bt -a
PID: 0 TASK: ffffffff802ffae0 CPU: 0 COMMAND: "swapper"
 #0 [ffffffff8043ef20] crash_nmi_callback at ffffffff8007a3bf
 #1 [ffffffff8043ef40] do_nmi at ffffffff8006585a
 #2 [ffffffff8043ef50] nmi at ffffffff80064ebf
    [exception RIP: acpi_processor_idle+579]
    RIP: ffffffff8019765e  RSP: ffffffff803f1f48  RFLAGS: 00000093
    RAX: 000000000073111a  RBX: 000000000073111a  RCX: 0000000000000808
    RDX: 0000000000000815  RSI: 0000000000000003  RDI: 0000000000000000
    RBP: ffff81063e480100   R8: ffffffff803f0000   R9: ffffffff804b5e2c
    R10: 0000000000000046  R11: 0000000000000046  R12: 0000000000000000
    R13: ffff81063e480000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <exception stack> ---
 #3 [ffffffff803f1f48] acpi_processor_idle at ffffffff8019765e
 #4 [ffffffff803f1f90] cpu_idle at ffffffff8004939e

PID: 0 TASK: ffff810115f11100 CPU: 1 COMMAND: "swapper"
 #0 [ffff810115f38f20] crash_nmi_callback at ffffffff8007a3bf
 #1 [ffff810115f38f40] do_nmi at ffffffff8006585a
 #2 [ffff810115f38f50] nmi at ffffffff80064ebf
    [exception RIP: acpi_processor_idle+579]
    RIP: ffffffff8019765e  RSP: ffff810115f2fea8  RFLAGS: 00000093
    RAX: 0000000000731145  RBX: 0000000000731145  RCX: 0000000000000808
    RDX: 0000000000000815  RSI: 0000000000000003  RDI: 0000000000000000
    RBP: ffff81063f173900   R8: ffff810115f2e000   R9: ffffffff804b5e2c
    R10: 0000000000000046  R11: 0000000000000046  R12: 00000000000000ff
    R13: ffff81063f173800  R14: 0000000000000100  R15: ffffffff803ea280
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <exception stack> ---
 #3 [ffff810115f2fea8] acpi_processor_idle at ffffffff8019765e
 #4 [ffff810115f2fef0] cpu_idle at ffffffff8004939e

PID: 0 TASK: ffff810115f20080 CPU: 2 COMMAND: "swapper"
 #0 [ffff810115f6bbc0] crash_kexec at ffffffff800ac5b9
 #1 [ffff810115f6bc80] __die at ffffffff80065127
 #2 [ffff810115f6bcc0] do_page_fault at ffffffff80066da7
 #3 [ffff810115f6bdb0] error_exit at ffffffff8005dde9
    [exception RIP: pskb_copy+307]
    RIP: ffffffff8022486b  RSP: ffff810115f6be60  RFLAGS: 00010282
    RAX: ffff81062cd5f540  RBX: ffff81062cac3980  RCX: ffff81046fb1e550
    RDX: 0000000000000000  RSI: ffff81062cd5f550  RDI: 0000000000000004
    RBP: ffff810466f54a80   R8: 00000000081f02b4   R9: 0000000000000000
    R10: ffff81062cac3980  R11: 00000000000000c8  R12: 0000000000000220
    R13: ffff810466f54a80  R14: 0000000000000002  R15: ffffffff803ea2a0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #4 [ffff810115f6be78] tcp_transmit_skb at ffffffff800217b7
 #5 [ffff810115f6bec8] tcp_retransmit_skb at ffffffff80250ccd
 #6 [ffff810115f6bf08] tcp_write_timer at ffffffff80252652
 #7 [ffff810115f6bf28] run_timer_softirq at ffffffff800968be
 #8 [ffff810115f6bf58] __do_softirq at ffffffff8001235a
 #9 [ffff810115f6bf88] call_softirq at ffffffff8005e2fc
#10 [ffff810115f6bfa0] do_softirq at ffffffff8006cb14
#11 [ffff810115f6bfb0] apic_timer_interrupt at ffffffff8005dc8e
--- <IRQ stack> ---
#12 [ffff810115f67df8] apic_timer_interrupt at ffffffff8005dc8e
    [exception RIP: acpi_processor_idle+628]
    RIP: ffffffff8019768f  RSP: ffff810115f67ea8  RFLAGS: 00000282
    RAX: ffff810115f67fd8  RBX: ffff81063f173100  RCX: 0000000080184973
    RDX: ffff81063f173000  RSI: 0000000000000082  RDI: ffffffff804b5e2c
    RBP: ffff810115f67ee8   R8: ffff810115f66000   R9: ffff810115f67ecc
    R10: 0000000000000046  R11: ffff810115f67ee8  R12: ffff81063f6e1180
    R13: 0000000010008040  R14: ffff81063f6e1180  R15: ffff81063f6e1180
    ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
#13 [ffff810115f67ea0] acpi_processor_idle at ffffffff80197685
#14 [ffff810115f67ef0] cpu_idle at ffffffff8004939e

PID: 0 TASK: ffff810115f94100 CPU: 3 COMMAND: "swapper"
 #0 [ffff810115fbbf20] crash_nmi_callback at ffffffff8007a3bf
 #1 [ffff810115fbbf40] do_nmi at ffffffff8006585a
 #2 [ffff810115fbbf50] nmi at ffffffff80064ebf
    [exception RIP: acpi_processor_idle+579]
    RIP: ffffffff8019765e  RSP: ffff810115fb9ea8  RFLAGS: 00000097
    RAX: 0000000000731169  RBX: 0000000000731169  RCX: 0000000000000808
    RDX: 0000000000000815  RSI: 0000000000000003  RDI: 0000000000000000
    RBP: ffff81063f174900   R8: ffff810115fb8000   R9: ffff810115f942f0
    R10: 0000000000000046  R11: 0000000000000046  R12: 00000000000000ff
    R13: ffff81063f174800  R14: 0000000000000300  R15: ffffffff803ea2c0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <exception stack> ---
 #3 [ffff810115fb9ea8] acpi_processor_idle at ffffffff8019765e
 #4 [ffff810115fb9ef0] cpu_idle at ffffffff8004939e

crash> quit
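
In the bt -a output, three of the four CPUs were merely stopped by the crash NMI while idle; only CPU 2 went through __die and crash_kexec. For anyone comparing several vmcores, here is a rough Python sketch of how that CPU could be picked out of saved bt -a text. The helper name panicking_cpus and the list of "panic path" frame names are my own assumptions for illustration; neither is part of the crash utility.

```python
import re

# Assumption: frames that mark a task as being on the panic path.
# This list is illustrative, not something crash itself defines.
PANIC_FRAMES = ("crash_kexec", "__die", "panic")

# Task header as printed by `bt`: PID: <n> TASK: <addr> CPU: <n> COMMAND: ...
HEADER = re.compile(r"PID:\s+\d+\s+TASK:\s+[0-9a-f]+\s+CPU:\s+(\d+)")
# Stack frame as printed by `bt`: #<n> [<stack addr>] <symbol> at <text addr>
FRAME = re.compile(r"#\d+\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+")


def panicking_cpus(bt_text):
    """Return [(cpu, [symbols])] for tasks whose stack has a panic-path frame.

    Hypothetical helper: works on `bt -a` text whether or not the original
    line breaks survived, since it only relies on whitespace-separated tokens.
    """
    results = []
    headers = list(HEADER.finditer(bt_text))
    for i, header in enumerate(headers):
        # A task's block runs from its header to the next header (or the end).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(bt_text)
        block = bt_text[header.start():end]
        frames = FRAME.findall(block)
        if any(name in PANIC_FRAMES for name in frames):
            results.append((int(header.group(1)), frames))
    return results
```

Fed the full bt -a text from this post, the only task it should report is the one on CPU 2, with the tcp_write_timer -> tcp_retransmit_skb -> tcp_transmit_skb frames in its symbol list.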