--- Johnny Hughes mailing-lists@hughesjr.com wrote:
On Tue, April 12, 2005 3:08 pm, Bob Pierce said:
Hi all,
We are running a new Centos-4 server, and it has kernel panicked on us 4 times in the last month. After the first kernel panic we hooked up a serial console to the server and captured the output in order to have a record of what happens. I've included the error messages from the last time it locked up... but it doesn't really mean much to me. Anybody have any ideas what might be causing this server lock up?
Server description:
- Dell PE1750
- dual 2.8GHz Xeon (with Hyper Threading on)
- 2GB DDR RAM
- Perc4-DI onboard RAID using 3 scsi drives in raid-5 configuration
- ext3 file system
- kernel-smp-2.6.9-5.0.3.EL
- mysql - from distribution
- 2 postfix instances rebuilt with MySQL support
- amavisd-new
- clamav
- spamassassin
- rbldnsd
- bind
Here's the captured output from a serial console connected to the server at time of fault.
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
f8872da8
*pde = 35562001
Oops: 0000 [#1]
SMP
Modules linked in: md5 ipv6 autofs4 sunrpc dm_mod button battery ac ohci_hcd tg3 floppy sg ext3 jbd megaraid_mbox megaraid_mm sd_mod scsi_mod
CPU:    1
EIP:    0060:[<f8872da8>]    Not tainted VLI
EFLAGS: 00010246   (2.6.9-5.0.3.ELsmp)
EIP is at __journal_file_buffer+0x1b/0x221 [jbd]
eax: 00000000   ebx: d2fff26c   ecx: 00000008   edx: c2327680
esi: c2327680   edi: 00000008   ebp: 00000000   esp: f7533dd4
ds: 007b   es: 007b   ss: 0068
Process kjournald (pid: 210, threadinfo=f7533000 task=f75825b0)
Stack: 00000000 00000000 f148fad8 f7f66200 d2fff26c c2327680 f887351b 00000286
       00000000 00000000 00000000 00000000 00000000 d2517e6c f7f66200 caa4c50c
       00001f18 00000000 f75825b0 c011e8d2 f7533e44 f7533e44 f750c054 f8836f24
Call Trace:
 [<f887351b>] journal_commit_transaction+0x310/0xfb1 [jbd]
 [<c011e8d2>] autoremove_wake_function+0x0/0x2d
 [<f8836f24>] megaraid_isr+0x1ad/0x1bf [megaraid_mbox]
 [<c011e8d2>] autoremove_wake_function+0x0/0x2d
 [<c011bcd5>] finish_task_switch+0x30/0x66
 [<c02c4363>] schedule+0x833/0x869
 [<c0127e62>] del_timer_sync+0x7a/0x9c
 [<f8875e6d>] kjournald+0xc7/0x215 [jbd]
 [<c011e8d2>] autoremove_wake_function+0x0/0x2d
 [<c011e8d2>] autoremove_wake_function+0x0/0x2d
 [<c011bd1d>] schedule_tail+0x12/0x55
 [<f8875da0>] commit_timeout+0x0/0x5 [jbd]
 [<f8875da6>] kjournald+0x0/0x215 [jbd]
 [<c01041f1>] kernel_thread_helper+0x5/0xb
Code: 14 ba 01 00 00 00 83 c4 10 89 d0 5b 5e 5f 5d c3 55 31 ed 57 89 cf 56 89 d6 53 53 53 89 c3 c7 44 24 04 00 00 00 00 8b 00 89 04 24 <8b> 00 a9 00 00 08 00 75 29 68 d4 85 87 f8 68 9b 07 00 00 68 55
No idea what is causing this (the oops is in kjournald, so it looks like a filesystem process to me), but we have a new kernel that will be included in CentOS-4.1. It is kernel-2.6.9-6.37.EL.src.rpm.
I would be glad to give you the new i686-smp kernel to see if it solves your problem.
Are these EM64T Xeons or i686 (32-bit) Xeons?
http://www.intel.com/products/processor/xeon/index.htm
(Looking at the Dell site, I think they are 32-bit.)
(If I am wrong and they are EM64T Xeons, you should have installed the x86_64 distro instead of the i386 one.)
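One quick way to settle the 32-bit vs. EM64T question from the running box is to look for the `lm` (long mode) flag in /proc/cpuinfo; a CPU that reports it can run the x86_64 distro. A rough Python sketch of that check follows. The function name `supports_x86_64` is made up for illustration, and it only inspects text you hand it, so treat it as a starting point rather than a definitive test:

```python
def supports_x86_64(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'lm'.

    'lm' (long mode) is the flag x86 Linux kernels report for 64-bit
    capable CPUs. The function name is hypothetical, not a standard API.
    """
    for line in cpuinfo_text.splitlines():
        # Each /proc/cpuinfo entry looks like "key<tabs>: value".
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and "lm" in value.split():
            return True
    return False
```

On the server itself you would feed it the real file, for example `supports_x86_64(open("/proc/cpuinfo").read())`. A result of False means the i386 distro is the right one for that hardware.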
I also recommend installing the latest SCSI controller BIOS:
http://support.dell.com/support/downloads/format.aspx?c=us&cs=04&l=e...
and Server BIOS:
http://support.dell.com/support/downloads/format.aspx?c=us&cs=04&l=e...
-- Johnny Hughes http://www.HughesJR.com/
Wow, this looks and sounds like the same problem I was having with my box, except that mine locked up once a day. I guess I am going to have to wait until 4.1 before I think about upgrading to CentOS 4*...
Steven
"On the side of the software box, in the 'System Requirements' section, it said 'Requires Windows or better'. So I installed Linux."