I am getting a kernel panic on this kernel:
2.6.9-89.0.16.EL #1 Tue Nov 3 17:15:02 EST 2009 i686 i686 i386 GNU/Linux
This system has run fine for 2+ years.
Nov 17 10:09:21 kernel: Call Trace:
Nov 17 10:09:21 kernel: [<c014c9d3>] wake_up_page+0x9/0x29
Nov 17 10:09:21 kernel: [<c015e1ad>] do_no_page+0x37d/0x3f4
Nov 17 10:09:21 kernel: [<c015e41b>] handle_mm_fault+0xec/0x212
Nov 17 10:09:21 kernel: [<c011decd>] do_page_fault+0x1ac/0x4dc
Nov 17 10:09:21 kernel: [<c017a811>] copy_strings+0x22b/0x235
Nov 17 10:09:21 kernel: [<c017c3b0>] search_binary_handler+0xe4/0x1f2
Nov 17 10:09:21 kernel: [<c017c6aa>] do_execve+0x1ec/0x1f6
Nov 17 10:09:21 kernel: [<c011dd21>] do_page_fault+0x0/0x4dc
Nov 17 10:09:21 kernel: [<c0323f9b>] error_code+0x2f/0x38
The system has 1 GB of RAM and SATA disks running in RAID-1. /proc/mdstat shows everything is fine on the array.
It started to panic last week. I upgraded from 4.4 to 4.8 and the same thing is still happening.
Looking back further in the messages I see this also:
Nov 17 10:09:21 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000108
Nov 17 10:09:21 kernel: printing eip:
Nov 17 10:09:21 kernel: c014c9b8
Nov 17 10:09:21 kernel: *pde = 3a57a067
Nov 17 10:09:21 kernel: Oops: 0000 [#1]
Nov 17 10:09:21 kernel: Modules linked in: md5 ipv6 autofs4 sunrpc ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables cpufreq_powersave dm_mirror dm_mod button battery ac uhci_hcd ehci_hcd snd_via82xx snd_ac97_
Nov 17 10:09:21 kernel: CPU: 0
Nov 17 10:09:21 kernel: EIP: 0060:[<c014c9b8>] Not tainted VLI
Nov 17 10:09:21 kernel: EFLAGS: 00010206 (2.6.9-89.0.16.EL)
Nov 17 10:09:21 kernel: EIP is at page_waitqueue+0x17/0x29
Nov 17 10:09:21 kernel: eax: 88200020 ebx: c1400020 ecx: 00000020 edx: 00000000
Nov 17 10:09:21 kernel: esi: c1400020 edi: 00000000 ebp: c1b34c00 esp: f0379e8c
Nov 17 10:09:21 kernel: ds: 007b es: 007b ss: 0068
Nov 17 10:09:21 kernel: Process zcat (pid: 13339, threadinfo=f0379000 task=ecb6f1a0)
Nov 17 10:09:21 kernel: Stack: c014c9d3 c1400020 fffd1a58 c015e1ad fffd4000 e0007000 62708f43 635c49ae
Nov 17 10:09:21 kernel: c17279e0 00000000 e07a9a20 00296ae0 ee0ac9fc 00000001 c1b34c00 00296ae0
Nov 17 10:09:21 kernel: f03e4000 ee0ac9fc c015e41b 00000000 fffd1a58 f03e4000 00000000 c1b34c00
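For anyone trying to read a dump like this, here is a minimal Python sketch (not part of the original report) that pulls the key fields out of the oops: the faulting address, the function the EIP was in, the process, and the call chain. The function name summarize_oops and the regular expressions are my own assumptions, written against the 2.6.9 i386 log layout shown above; other kernel versions and architectures print these lines differently.

```python
import re

# A few lines copied from the log above, syslog prefix included.
OOPS = """\
Nov 17 10:09:21 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000108
Nov 17 10:09:21 kernel: EIP is at page_waitqueue+0x17/0x29
Nov 17 10:09:21 kernel: Process zcat (pid: 13339, threadinfo=f0379000 task=ecb6f1a0)
Nov 17 10:09:21 kernel: [<c014c9d3>] wake_up_page+0x9/0x29
Nov 17 10:09:21 kernel: [<c015e1ad>] do_no_page+0x37d/0x3f4
"""


def summarize_oops(text):
    """Extract the main fields of a 2.6-style i386 oops (hypothetical helper)."""
    summary = {
        "fault_address": None,  # address the kernel failed to access
        "eip_symbol": None,     # function the instruction pointer was in
        "eip_offset": None,     # byte offset into that function
        "process": None,        # name of the process that was running
        "pid": None,
        "trace": [],            # function names in the order they are logged
    }

    m = re.search(r"at virtual address ([0-9a-f]{8})", text)
    if m:
        summary["fault_address"] = int(m.group(1), 16)

    m = re.search(r"EIP is at (\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+", text)
    if m:
        summary["eip_symbol"] = m.group(1)
        summary["eip_offset"] = int(m.group(2), 16)

    m = re.search(r"Process (\S+) \(pid: (\d+)", text)
    if m:
        summary["process"] = m.group(1)
        summary["pid"] = int(m.group(2))

    # Call trace entries look like "[<c014c9d3>] wake_up_page+0x9/0x29".
    for m in re.finditer(r"\[<[0-9a-f]{8}>\] (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+", text):
        summary["trace"].append(m.group(1))

    return summary


summary = summarize_oops(OOPS)
print("fault address: 0x%x" % summary["fault_address"])
print("faulted in:    %s+0x%x" % (summary["eip_symbol"], summary["eip_offset"]))
print("process:       %s (pid %d)" % (summary["process"], summary["pid"]))
print("call chain:    " + " <- ".join(summary["trace"]))
```

The fault address 0x108 is a small offset from zero, which is what you would expect if the kernel followed a NULL structure pointer and read a field at that offset. That alone does not tell a corrupted pointer caused by bad RAM apart from a driver bug.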
What might be happening here?
Jerry
On Nov 17, 2009, at 1:26 PM, "nate" centos@linuxpowered.net wrote:
Jerry Geis wrote:
Looking back further in the messages I see this also:
Nov 17 10:09:21 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000108
Often means bad RAM.
Or a bad driver.
Bad RAM would be much worse, though, because file systems are memory mapped, and having bits flip undetected usually means there is some on-disk corruption too.
Run memtest, and if you find bad RAM, make sure to run fsck.
-Ross