Hi all,
PF_RING does not seem to work smoothly on CentOS 5. Several days ago, I patched kernel 2.6.25.3 with PF_RING and installed the patched kernel on my CentOS 5 machine. I then developed a program on top of PF_RING to capture network packets. I wanted it to keep running until the machine is powered off. Unfortunately, no matter how I adjust my program, the system cannot run for more than 48 hours. Eventually the whole system crashes, and syslogd records the kernel output shown below.
I have contacted the CentOS people, but they said they had not seen this problem on their operating system before.
I searched the web and found that a similar bug occurred in Ubuntu on the same hardware platform, Dell PowerEdge. However, the fix requires patching the current kernel source code.
https://bugs.launchpad.net/ubuntu/hardy/+source/linux/+bug/214814
Also, I don't know which kernel they used. I downloaded the kernel source from www.kernel.org and patched it with PF_RING. Then I set up the patched kernel to run CentOS 5.
I would like to know whether I could avoid the bug by using the original CentOS kernel source instead of the one from www.kernel.org.
Has anyone else run into similar trouble?
Thanks a lot.
------------------ partial dump of /var/log/messages ----------------------
.....
Jul 25 02:15:02 vega2008 kernel: [<c045bc58>] ? exit_mmap+0x93/0xc9
Jul 25 02:15:02 vega2008 kernel: [<c04214c2>] ? mmput+0x25/0x68
Jul 25 02:15:02 vega2008 kernel: [<c046e9c9>] ? flush_old_exec+0x4f8/0x777
Jul 25 02:15:02 vega2008 kernel: [<c046dfcf>] ? kernel_read+0x32/0x43
Jul 25 02:15:02 vega2008 kernel: [<c0490e60>] ? load_elf_binary+0x359/0x1152
Jul 25 02:15:02 vega2008 kernel: [<c045a6ee>] ? get_user_pages+0x2d5/0x35c
Jul 25 02:15:02 vega2008 kernel: [<c04570d2>] ? page_address+0x78/0x98
Jul 25 02:15:02 vega2008 kernel: [<c045735a>] ? kmap_high+0x19/0x16b
Jul 25 02:15:02 vega2008 kernel: [<c04570d2>] ? page_address+0x78/0x98
Jul 25 02:15:02 vega2008 kernel: [<c046dced>] ? copy_strings+0x169/0x173
Jul 25 02:15:02 vega2008 kernel: [<c046ddad>] ? search_binary_handler+0x8f/0x1af
Jul 25 02:15:02 vega2008 kernel: [<c046efe7>] ? do_execve+0x133/0x194
Jul 25 02:15:02 vega2008 kernel: [<c04030d7>] ? sys_execve+0x2a/0x4a
Jul 25 02:15:02 vega2008 kernel: [<c04047aa>] ? syscall_call+0x7/0xb
Jul 25 02:15:02 vega2008 kernel: [<c0610000>] ? early_init_intel+0x0/0x3c
Jul 25 02:15:02 vega2008 kernel: =======================
Jul 25 02:15:02 vega2008 kernel: BUG: soft lockup - CPU#7 stuck for 61s! [sshd:24188]
Jul 25 02:15:02 vega2008 kernel:
Jul 25 02:15:02 vega2008 kernel: Pid: 24188, comm: sshd Not tainted (2.6.25.3 #3)
Jul 25 02:15:02 vega2008 kernel: EIP: 0060:[<c06154f0>] EFLAGS: 00200293 CPU: 7
Jul 25 02:15:02 vega2008 kernel: EIP is at _spin_lock+0xa/0x15
Jul 25 02:15:02 vega2008 kernel: EAX: c079349c EBX: f79ec580 ECX: ffffffff EDX: 00008381
Jul 25 02:15:02 vega2008 kernel: ESI: ffffffff EDI: f79ec580 EBP: f68a6580 ESP: f1824e50
Jul 25 02:15:02 vega2008 kernel: DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Jul 25 02:15:02 vega2008 kernel: CR0: 8005003b CR2: b7ebf978 CR3: 32996000 CR4: 000006f0
Jul 25 02:15:02 vega2008 kernel: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Jul 25 02:15:02 vega2008 kernel: DR6: ffff0ff0 DR7: 00000400
Jul 25 02:15:02 vega2008 kernel: [<c04119c7>] ? native_flush_tlb_others+0x49/0x9b
Jul 25 02:15:02 vega2008 kernel: [<c0411e65>] ? flush_tlb_mm+0x51/0x54
Jul 25 02:15:02 vega2008 kernel: [<c045bc58>] ? exit_mmap+0x93/0xc9
Jul 25 02:15:02 vega2008 kernel: [<c04214c2>] ? mmput+0x25/0x68
Jul 25 02:15:02 vega2008 kernel: [<c046e9c9>] ? flush_old_exec+0x4f8/0x777
Jul 25 02:15:02 vega2008 kernel: [<c046dfcf>] ? kernel_read+0x32/0x43
Jul 25 02:15:02 vega2008 kernel: [<c0490e60>] ? load_elf_binary+0x359/0x1152
Jul 25 02:15:02 vega2008 kernel: [<c045a6ee>] ? get_user_pages+0x2d5/0x35c
Jul 25 02:15:02 vega2008 kernel: [<c04570d2>] ? page_address+0x78/0x98
Jul 25 02:15:02 vega2008 kernel: [<c045735a>] ? kmap_high+0x19/0x16b
Jul 25 02:15:02 vega2008 kernel: [<c04570d2>] ? page_address+0x78/0x98
Jul 25 02:15:02 vega2008 kernel: [<c046dced>] ? copy_strings+0x169/0x173
Jul 25 02:15:02 vega2008 kernel: [<c046ddad>] ? search_binary_handler+0x8f/0x1af
Jul 25 02:15:02 vega2008 kernel: [<c046efe7>] ? do_execve+0x133/0x194
Jul 25 02:15:02 vega2008 kernel: [<c04030d7>] ? sys_execve+0x2a/0x4a
Jul 25 02:15:02 vega2008 kernel: [<c04047aa>] ? syscall_call+0x7/0xb
Jul 25 02:15:02 vega2008 kernel: [<c0610000>] ? early_init_intel+0x0/0x3c
Jul 25 02:15:02 vega2008 kernel: =======================
Jul 25 02:15:02 vega2008 kernel: BUG: soft lockup - CPU#1 stuck for 61s! [http_cap:12228]
.............
-----------------------------------------------------------------------------------
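For anyone who wants to see which CPUs and processes are affected over a longer log, the "BUG: soft lockup" lines in a dump like the one above can be pulled out with a short script. This is only a rough sketch: the function names `find_soft_lockups` and `summarize` are made up for illustration, and the pattern assumes the message format seen in this dump, which other kernel versions may word differently.

```python
import re
from collections import Counter

# Pattern for lines such as:
#   BUG: soft lockup - CPU#7 stuck for 61s! [sshd:24188]
# Assumption: the format matches the 2.6.25.3 output quoted in this thread.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!\s*"
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(text):
    """Return one dict per soft-lockup report found in the log text."""
    return [
        {
            "cpu": int(m.group("cpu")),
            "seconds": int(m.group("secs")),
            "comm": m.group("comm"),
            "pid": int(m.group("pid")),
        }
        for m in SOFT_LOCKUP.finditer(text)
    ]


def summarize(lockups):
    """Count how many reports each process name accounts for."""
    return Counter(entry["comm"] for entry in lockups)


if __name__ == "__main__":
    # Two report lines copied from the dump in this message.
    sample = (
        "Jul 25 02:15:02 vega2008 kernel: BUG: soft lockup - CPU#7 stuck for 61s! [sshd:24188]\n"
        "Jul 25 02:15:02 vega2008 kernel: BUG: soft lockup - CPU#1 stuck for 61s! [http_cap:12228]\n"
    )
    for entry in find_soft_lockups(sample):
        print(entry)
```

To run it against the real log, read /var/log/messages into a string and pass that to `find_soft_lockups` instead of the sample. Reading that file normally requires root.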
Ian jonhson wrote:
Hi all,
PF_RING does not seem to work smoothly on CentOS 5. Several days ago, I patched kernel 2.6.25.3
....
CentOS 5 uses kernel 2.6.18-xx ... If this PF_RING thing requires a different kernel, I think a more accurate statement would be, PF_RING is not supported on CentOS.
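A quick way to confirm which kernel a box is actually running is to compare the release string against the 2.6.18 base mentioned above. The snippet below is just a sketch under that assumption. The helper names `kernel_base_version` and `matches_centos5_base` are invented for illustration, and the check looks only at the version number, so it cannot tell a distribution-built kernel from a self-compiled 2.6.18.

```python
import platform
import re


def kernel_base_version(release):
    """Return the leading dotted numbers of a release string as a tuple of ints.

    For example, "2.6.25.3" gives (2, 6, 25, 3) and "2.6.18-xx" gives (2, 6, 18).
    """
    match = re.match(r"\d+(?:\.\d+)*", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in match.group(0).split("."))


def matches_centos5_base(release):
    """True if the release string is based on 2.6.18, the CentOS 5 kernel series."""
    return kernel_base_version(release)[:3] == (2, 6, 18)


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    if matches_centos5_base(running):
        print("%s is based on the 2.6.18 series" % running)
    else:
        print("%s is not based on the 2.6.18 series" % running)
```

On the machine described in this thread, the release string would be 2.6.25.3, so the second message would be printed.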
Ian jonhson wrote:
I searched the web and found that a similar bug occurred in Ubuntu on the same hardware platform, Dell PowerEdge.
...
PowerEdge is Dell's brand name for ALL their server products. That Ubuntu bug was specific to the i450NX chipset, which was new in 1999 and obsoleted circa 2001, and supported dual Xeon Pentium II/III processors at around 300-800 MHz. How old is your server??!?
CentOS 5 uses kernel 2.6.18-xx ... If this PF_RING thing requires a different kernel, I think a more accurate statement would be, PF_RING is not supported on CentOS.
Indeed. I wonder whether there are differences between the CentOS kernel and the general kernel from www.kernel.org? If so, maybe I should download the CentOS kernel source code and patch it to run PF_RING.
Any help?
PowerEdge is Dell's brand name for ALL their server products. That Ubuntu bug was specific to the i450NX chipset, which was new in 1999 and obsoleted circa 2001, and supported dual Xeon Pentium II/III processors at around 300-800 MHz. How old is your server??!?
The machine on which I installed PF_RING is a Dell PowerEdge 1950 with 8 CPU cores, 4 GB of RAM, and an 80 GB disk. I don't think it is out of date.
on 7-27-2008 10:19 PM Ian jonhson spake the following:
Hi all,
PF_RING does not seem to work smoothly on CentOS 5. Several days ago, I patched kernel 2.6.25.3 with PF_RING and installed the patched kernel on my CentOS 5 machine. I then developed a program on top of PF_RING to capture network packets. I wanted it to keep running until the machine is powered off. Unfortunately, no matter how I adjust my program, the system cannot run for more than 48 hours. Eventually the whole system crashes, and syslogd records the kernel output shown below.
PF_RING seems to be used for the newest version of ntop for faster packet capture and analysis. Is that what you are trying to accomplish? Or did I just get a bad google?
PF_RING seems to be used for the newest version of ntop for faster packet capture and analysis. Is that what you are trying to accomplish?
Yes. That is why we turned to PF_RING and patched the kernel. In general, libpcap cannot meet our packet-capture needs.
Or did I just get a bad google?
Sorry, what do you mean?