Hello fellow listers! I've got some errors starting to crop up on one of my CentOS 5 boxes. Below is a transcript:
vi S 048C664F 2600 4176 2837 (NOTLB)
 f1c87b68 00000082 c30229e0 048c664f 00000e65 f1c87b18 00000007 f7bacaa0
 f7d97aa0 048c72fc 00000e65 00000cad 00000003 f7bacbac c302a9e0 c042daae
 f7d20000 f1c87b70 00000286 ffffffff 00000000 00000000 00ed957a 00ed957a
Call Trace:
 [<c042daae>] lock_timer_base+0x15/0x2f
 [<c0604dfc>] schedule_timeout+0x71/0x8c
 [<c042d1d3>] process_timeout+0x0/0x5
 [<c04803a8>] do_select+0x371/0x3cb
 [<c048092b>] __pollwait+0x0/0xb2
 [<c04202b1>] default_wake_function+0x0/0xc
 [<c04202b1>] default_wake_function+0x0/0xc
 [<c06046b9>] schedule+0x90d/0x9ba
 [<c0436066>] autoremove_wake_function+0xd/0x2d
 [<c042daae>] lock_timer_base+0x15/0x2f
 [<c042dbbf>] __mod_timer+0x99/0xa3
 [<c04d7a49>] blk_plug_device+0x5e/0x85
 [<f884e2e2>] make_request+0x520/0x52a [raid1]
 [<f8860679>] journal_stop+0x1b0/0x1ba [jbd]
 [<c041fa31>] enqueue_task+0x29/0x39
 [<c041f8de>] task_rq_lock+0x31/0x58
 [<c04202a7>] try_to_wake_up+0x371/0x37b
 [<c041ea84>] __wake_up_common+0x2f/0x53
 [<c041f871>] __wake_up+0x2a/0x3d
 [<c04806ab>] core_sys_select+0x2a9/0x2ca
 [<c052e771>] n_tty_receive_buf+0xc5e/0xcab
 [<c041ea84>] __wake_up_common+0x2f/0x53
 [<c041f871>] __wake_up+0x2a/0x3d
 [<c0529c5e>] tty_wakeup+0x44/0x48
 [<c04361fd>] remove_wait_queue+0x16/0x25
 [<c041f871>] __wake_up+0x2a/0x3d
 [<c0529b9d>] tty_ldisc_deref+0x50/0x5f
 [<c0480c72>] sys_select+0x9a/0x180
 [<c0404eff>] syscall_call+0x7/0xb
 =======================
There are others with various app names besides vi, including httpd, named, sftp-server, etc. Is this an imminent hardware failure? Do I have kernel issues? I've checked the system with lm_sensors and temps are perfectly normal. Also, performance and operation seem to be fine. Even with these errors, my services are running without any hiccups. HELP! :-)
Tim Nelson Systems/Network Support Rockbochs Inc. (218)727-4332 x105
Tim Nelson wrote:
> There are others with various app names besides vi, including httpd, named, sftp-server, etc. Is this an imminent hardware failure? Do I have kernel issues? I've checked the system with lm_sensors and temps are perfectly normal. Also, performance and operation seem to be fine. Even with these errors, my services are running without any hiccups. HELP! :-)
I would need to see the full error, but it sounds like a kernel oops. For me at least, the useful info would be at the top of the error, which wasn't included in your email.
Worst case, configure your system with a serial console and capture the error using a terminal emulator on another machine connected to that serial port.
nate
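For the serial-console capture suggested above, here is a minimal sketch of the receiving side in Python, as an alternative to an interactive terminal emulator. It is illustrative only. It assumes the failing box is booted with a serial console enabled (for example `console=ttyS0,115200n8 console=tty0` on its kernel line), that the two machines are joined by a null-modem cable, and that the capturing machine sees the port as `/dev/ttyS0`. The helper names `open_serial` and `capture`, the device path, and the 115200 baud rate are assumptions, not anything from the original posts.

```python
import os
import sys
import termios
import time


def open_serial(path, baud=termios.B115200):
    """Open a serial device read-only in raw 8N1 mode (assumed settings)."""
    fd = os.open(path, os.O_RDONLY | os.O_NOCTTY)
    attrs = termios.tcgetattr(fd)
    # attrs layout: [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
    attrs[0] = 0                                             # no input processing
    attrs[1] = 0                                             # no output processing
    attrs[2] = termios.CS8 | termios.CREAD | termios.CLOCAL  # 8 data bits, ignore modem lines
    attrs[3] = 0                                             # no echo, no canonical mode
    attrs[4] = baud
    attrs[5] = baud
    attrs[6][termios.VMIN] = 1                               # block until at least one byte
    attrs[6][termios.VTIME] = 0
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    return os.fdopen(fd, "rb", buffering=0)


def capture(stream, log, stamp=time.strftime):
    """Copy lines from a binary stream to a text log, timestamping each one.

    Returns the number of lines written. On a real serial port the stream
    never reaches end-of-file, so this runs until interrupted.
    """
    count = 0
    for raw in iter(stream.readline, b""):
        line = raw.decode("ascii", errors="replace").rstrip("\r\n")
        log.write(f"{stamp('%Y-%m-%d %H:%M:%S')} {line}\n")
        log.flush()  # keep the log current in case the capture is killed
        count += 1
    return count


# Hypothetical usage: python capture.py /dev/ttyS0 console.log
if __name__ == "__main__" and len(sys.argv) == 3:
    with open_serial(sys.argv[1]) as port, open(sys.argv[2], "a") as logfile:
        capture(port, logfile)
```

Flushing after every line means the beginning of the kernel message, which is the part that matters, is already on disk even if the capture is interrupted partway through.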