I have a machine with a recent install of CentOS (2.6.32-220.2.1.el6.x86_64 kernel). It crashed 3 times this week, and sent emails like the one below complaining about kernel taint. I've gone back to a previous kernel to see if that helps, but otherwise I don't know how to investigate this. What should I do?
Here's stuff from the log and one of the emails:
log:********************************
# grep -e '12 10:16' -e '10 14:46' -e '9 15:44' /var/log/messages
Jan 9 15:44:11 name kernel: ------------[ cut here ]------------
Jan 9 15:44:11 name kernel: WARNING: at kernel/sched.c:5914 thread_return+0x232/0x79d() (Not tainted)
Jan 9 15:44:11 name kernel: Hardware name: Precision WorkStation 490
Jan 9 15:44:11 name kernel: Modules linked in: fuse nfs lockd fscache nfs_acl auth_rpcgss autofs4 sunrpc p4_clockmod freq_table speedstep_lib ipv6 ppdev parport_pc parport sg microcode dcdbas serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support tg3 snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i5000_edac edac_core i5k_amb shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom firewire_ohci firewire_core crc_itu_t ahci pata_acpi ata_generic ata_piix nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi wmi video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf]
Jan 9 15:44:11 name kernel: Pid: 17, comm: ksoftirqd/3 Not tainted 2.6.32-220.2.1.el6.x86_64 #1
Jan 9 15:44:11 name kernel: Call Trace:
Jan 9 15:44:11 name kernel: [<ffffffff81069997>] ? warn_slowpath_common+0x87/0xc0
Jan 9 15:44:11 name kernel: [<ffffffff810699ea>] ? warn_slowpath_null+0x1a/0x20
Jan 9 15:44:11 name kernel: [<ffffffff814eccc5>] ? thread_return+0x232/0x79d
Jan 9 15:44:11 name kernel: [<ffffffff81071b35>] ? ksoftirqd+0xd5/0x110
Jan 9 15:44:11 name kernel: [<ffffffff81071a60>] ? ksoftirqd+0x0/0x110
Jan 9 15:44:11 name kernel: [<ffffffff810906a6>] ? kthread+0x96/0xa0
Jan 9 15:44:11 name kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
Jan 9 15:44:11 name kernel: [<ffffffff81090610>] ? kthread+0x0/0xa0
Jan 9 15:44:11 name kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
Jan 9 15:44:11 name kernel: ---[ end trace ce943d42785388df ]---
Jan 9 15:44:12 name abrtd: Directory 'oops-2012-01-09-15:44:12-1647-0' creation detected
Jan 9 15:44:12 name abrt-dump-oops: Reported 1 kernel oopses to Abrt
Jan 9 15:44:12 name abrtd: Can't open file '/var/spool/abrt/oops-2012-01-09-15:44:12-1647-0/uid': No such file or directory
Jan 9 15:44:20 name kernel: Bridge firewalling registered
Jan 9 15:44:32 name abrtd: Sending an email...
Jan 9 15:44:32 name abrtd: Email was sent to: root@localhost
Jan 9 15:44:32 name abrtd: New dump directory /var/spool/abrt/oops-2012-01-09-15:44:12-1647-0, processing
Jan 10 14:46:44 name kernel: ------------[ cut here ]------------
Jan 10 14:46:44 name kernel: WARNING: at kernel/sched.c:5914 thread_return+0x232/0x79d() (Not tainted)
Jan 10 14:46:44 name kernel: Hardware name: Precision WorkStation 490
Jan 10 14:46:44 name kernel: Modules linked in: fuse nfs lockd fscache nfs_acl auth_rpcgss autofs4 sunrpc p4_clockmod freq_table speedstep_lib ipv6 ppdev parport_pc parport sg microcode dcdbas serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support tg3 snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i5000_edac edac_core i5k_amb shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom firewire_ohci firewire_core crc_itu_t ahci pata_acpi ata_generic ata_piix nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi wmi video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf]
Jan 10 14:46:44 name kernel: Pid: 2304, comm: thunderbird-bin Not tainted 2.6.32-220.2.1.el6.x86_64 #1
Jan 10 14:46:44 name kernel: Call Trace:
Jan 10 14:46:44 name kernel: [<ffffffff81069997>] ? warn_slowpath_common+0x87/0xc0
Jan 10 14:46:44 name kernel: [<ffffffff810699ea>] ? warn_slowpath_null+0x1a/0x20
Jan 10 14:46:44 name kernel: [<ffffffff814eccc5>] ? thread_return+0x232/0x79d
Jan 10 14:46:44 name kernel: [<ffffffff81051dc6>] ? enqueue_task+0x66/0x80
Jan 10 14:46:44 name kernel: [<ffffffff814ee50d>] ? schedule_hrtimeout_range+0x13d/0x160
Jan 10 14:46:44 name kernel: [<ffffffff81090d76>] ? add_wait_queue+0x46/0x60
Jan 10 14:46:44 name kernel: [<ffffffff8118b895>] ? __pollwait+0x75/0xf0
Jan 10 14:46:44 name kernel: [<ffffffff8118b079>] ? poll_schedule_timeout+0x39/0x60
Jan 10 14:46:44 name kernel: [<ffffffff8118bdcb>] ? do_sys_poll+0x45b/0x520
Jan 10 14:46:44 name kernel: [<ffffffff8118b820>] ? __pollwait+0x0/0xf0
Jan 10 14:46:44 name kernel: [<ffffffff8118b910>] ? pollwake+0x0/0x60
Jan 10 14:46:44 name kernel: [<ffffffff8141be4e>] ? sock_aio_write+0x15e/0x170
Jan 10 14:46:44 name kernel: [<ffffffff8141bcf0>] ? sock_aio_write+0x0/0x170
Jan 10 14:46:44 name kernel: [<ffffffff81175fbb>] ? do_sync_readv_writev+0xfb/0x140
Jan 10 14:46:44 name kernel: [<ffffffff81090a10>] ? autoremove_wake_function+0x0/0x40
Jan 10 14:46:44 name kernel: [<ffffffff81218def>] ? selinux_file_permission+0xbf/0x150
Jan 10 14:46:44 name kernel: [<ffffffff811770e2>] ? do_readv_writev+0x162/0x1f0
Jan 10 14:46:44 name kernel: [<ffffffff8120c1d6>] ? security_file_permission+0x16/0x20
Jan 10 14:46:44 name kernel: [<ffffffff8118c08c>] ? sys_poll+0x7c/0x110
Jan 10 14:46:44 name kernel: [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Jan 10 14:46:44 name kernel: ---[ end trace f9f851f2bb55fcd3 ]---
Jan 10 14:46:45 name abrtd: Directory 'oops-2012-01-10-14:46:45-1649-0' creation detected
Jan 10 14:46:45 name abrt-dump-oops: Reported 1 kernel oopses to Abrt
Jan 10 14:46:45 name abrtd: Can't open file '/var/spool/abrt/oops-2012-01-10-14:46:45-1649-0/uid': No such file or directory
Jan 10 14:46:50 name kernel: Bridge firewalling registered
Jan 12 10:16:39 name kernel: ------------[ cut here ]------------
Jan 12 10:16:39 name kernel: WARNING: at kernel/sched.c:5914 thread_return+0x232/0x79d() (Not tainted)
Jan 12 10:16:39 name kernel: Hardware name: Precision WorkStation 490
Jan 12 10:16:39 name kernel: Modules linked in: fuse nfs lockd fscache nfs_acl auth_rpcgss autofs4 sunrpc p4_clockmod freq_table speedstep_lib ipv6 ppdev parport_pc parport sg microcode dcdbas serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support tg3 snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i5000_edac edac_core i5k_amb shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom firewire_ohci firewire_core crc_itu_t ahci pata_acpi ata_generic ata_piix nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi wmi video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf]
Jan 12 10:16:39 name kernel: Pid: 1734, comm: Xorg Not tainted 2.6.32-220.2.1.el6.x86_64 #1
Jan 12 10:16:39 name kernel: Call Trace:
Jan 12 10:16:39 name kernel: [<ffffffff81069997>] ? warn_slowpath_common+0x87/0xc0
Jan 12 10:16:39 name kernel: [<ffffffff810699ea>] ? warn_slowpath_null+0x1a/0x20
Jan 12 10:16:39 name kernel: [<ffffffff814eccc5>] ? thread_return+0x232/0x79d
Jan 12 10:16:39 name kernel: [<ffffffff810958e3>] ? __hrtimer_start_range_ns+0x1a3/0x460
Jan 12 10:16:39 name kernel: [<ffffffff81094dc1>] ? lock_hrtimer_base+0x31/0x60
Jan 12 10:16:39 name kernel: [<ffffffff814ee498>] ? schedule_hrtimeout_range+0xc8/0x160
Jan 12 10:16:39 name kernel: [<ffffffff81094b70>] ? hrtimer_wakeup+0x0/0x30
Jan 12 10:16:39 name kernel: [<ffffffff81095bd4>] ? hrtimer_start_range_ns+0x14/0x20
Jan 12 10:16:39 name kernel: [<ffffffff8118b079>] ? poll_schedule_timeout+0x39/0x60
Jan 12 10:16:39 name kernel: [<ffffffff8118b6e8>] ? do_select+0x578/0x6b0
Jan 12 10:16:39 name kernel: [<ffffffff8104d80d>] ? check_preempt_curr+0x6d/0x90
Jan 12 10:16:39 name kernel: [<ffffffff8118b820>] ? __pollwait+0x0/0xf0
Jan 12 10:16:39 name kernel: [<ffffffff8118b910>] ? pollwake+0x0/0x60
Jan 12 10:16:39 name kernel: [<ffffffff8118b910>] ? pollwake+0x0/0x60
Jan 12 10:16:39 name kernel: [<ffffffff8118b910>] ? pollwake+0x0/0x60
Jan 12 10:16:39 name kernel: [<ffffffff8118b910>] ? pollwake+0x0/0x60
Jan 12 10:16:39 name kernel: [<ffffffff8118b910>] ? pollwake+0x0/0x60
Jan 12 10:16:39 name kernel: [<ffffffff8118b910>] ? pollwake+0x0/0x60
Jan 12 10:16:39 name kernel: [<ffffffff8118b910>] ? pollwake+0x0/0x60
Jan 12 10:16:39 name kernel: [<ffffffff8118b910>] ? pollwake+0x0/0x60
Jan 12 10:16:39 name kernel: [<ffffffff8118b910>] ? pollwake+0x0/0x60
Jan 12 10:16:39 name kernel: [<ffffffff8118c30a>] ? core_sys_select+0x18a/0x2c0
Jan 12 10:16:39 name kernel: [<ffffffff81090a10>] ? autoremove_wake_function+0x0/0x40
Jan 12 10:16:39 name kernel: [<ffffffff81277141>] ? __clear_user+0x21/0x70
Jan 12 10:16:39 name kernel: [<ffffffff811770e2>] ? do_readv_writev+0x162/0x1f0
Jan 12 10:16:39 name kernel: [<ffffffff81012b59>] ? read_tsc+0x9/0x20
Jan 12 10:16:39 name kernel: [<ffffffff8109b629>] ? ktime_get_ts+0xa9/0xe0
Jan 12 10:16:39 name kernel: [<ffffffff810d41b5>] ? __audit_syscall_exit+0x265/0x290
Jan 12 10:16:39 name kernel: [<ffffffff8118c697>] ? sys_select+0x47/0x110
Jan 12 10:16:39 name kernel: [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Jan 12 10:16:39 name kernel: ---[ end trace 9c5478b05024d9ed ]---
Jan 12 10:16:40 name abrtd: Directory 'oops-2012-01-12-10:16:40-1660-0' creation detected
Jan 12 10:16:40 name abrt-dump-oops: Reported 1 kernel oopses to Abrt
Jan 12 10:16:40 name abrtd: Can't open file '/var/spool/abrt/oops-2012-01-12-10:16:40-1660-0/uid': No such file or directory
Jan 12 10:16:47 name kernel: Bridge firewalling registered
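One way to check whether the three reports describe the same event is to pull the WARNING site and the reporting process out of each trace and compare them. The Python sketch below is only an illustration and not something posted in the thread: the regular expressions, the function name summarize_warnings and the shortened SAMPLE text are assumptions modelled on the log lines above.

```python
import re
from collections import Counter

# Hypothetical patterns, written against the syslog lines quoted above.
WARN_RE = re.compile(r"WARNING: at (\S+):(\d+) (\S+?)\(\)")
COMM_RE = re.compile(r"Pid: (\d+), comm: (\S+)")

# Shortened copy of the three reports: only the two lines the patterns need.
SAMPLE = """\
Jan 9 15:44:11 name kernel: WARNING: at kernel/sched.c:5914 thread_return+0x232/0x79d() (Not tainted)
Jan 9 15:44:11 name kernel: Pid: 17, comm: ksoftirqd/3 Not tainted 2.6.32-220.2.1.el6.x86_64 #1
Jan 10 14:46:44 name kernel: WARNING: at kernel/sched.c:5914 thread_return+0x232/0x79d() (Not tainted)
Jan 10 14:46:44 name kernel: Pid: 2304, comm: thunderbird-bin Not tainted 2.6.32-220.2.1.el6.x86_64 #1
Jan 12 10:16:39 name kernel: WARNING: at kernel/sched.c:5914 thread_return+0x232/0x79d() (Not tainted)
Jan 12 10:16:39 name kernel: Pid: 1734, comm: Xorg Not tainted 2.6.32-220.2.1.el6.x86_64 #1
"""


def summarize_warnings(log_text):
    """Count distinct WARNING sites and list the processes that reported them."""
    sites = Counter(WARN_RE.findall(log_text))
    comms = [comm for _pid, comm in COMM_RE.findall(log_text)]
    return sites, comms


if __name__ == "__main__":
    sites, comms = summarize_warnings(SAMPLE)
    for (path, line, func), count in sites.items():
        print(f"{count}x {path}:{line} in {func}")
    print("processes:", ", ".join(comms))
```

On the sample, all three reports resolve to the same source line in kernel/sched.c, hit from three unrelated processes. That suggests one recurring warning rather than three different faults, but it says nothing about what caused the crashes.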
From email: **************************************
Duplicate check =====
Common information =====
package ----- kernel
architecture ----- x86_64
kernel ----- 2.6.32-220.2.1.el6.x86_64
Additional information =====
kernel_tainted_long ----- Taint on warning.
kernel_tainted ----- 512
backtrace -----
WARNING: at kernel/sched.c:5914 thread_return+0x232/0x79d() (Not tainted)
Hardware name: Precision WorkStation 490
Modules linked in: fuse nfs lockd fscache nfs_acl auth_rpcgss autofs4 sunrpc p4_clockmod freq_table speedstep_lib ipv6 ppdev parport_pc parport sg microcode dcdbas serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support tg3 snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i5000_edac edac_core i5k_amb shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom firewire_ohci firewire_core crc_itu_t ahci pata_acpi ata_generic ata_piix nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi wmi video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf]
Pid: 1734, comm: Xorg Not tainted 2.6.32-220.2.1.el6.x86_64 #1
Call Trace:
[<ffffffff81069997>] ? warn_slowpath_common+0x87/0xc0
[<ffffffff810699ea>] ? warn_slowpath_null+0x1a/0x20
[<ffffffff814eccc5>] ? thread_return+0x232/0x79d
[<ffffffff810958e3>] ? __hrtimer_start_range_ns+0x1a3/0x460
[<ffffffff81094dc1>] ? lock_hrtimer_base+0x31/0x60
[<ffffffff814ee498>] ? schedule_hrtimeout_range+0xc8/0x160
[<ffffffff81094b70>] ? hrtimer_wakeup+0x0/0x30
[<ffffffff81095bd4>] ? hrtimer_start_range_ns+0x14/0x20
[<ffffffff8118b079>] ? poll_schedule_timeout+0x39/0x60
[<ffffffff8118b6e8>] ? do_select+0x578/0x6b0
[<ffffffff8104d80d>] ? check_preempt_curr+0x6d/0x90
[<ffffffff8118b820>] ? __pollwait+0x0/0xf0
[<ffffffff8118b910>] ? pollwake+0x0/0x60
[<ffffffff8118b910>] ? pollwake+0x0/0x60
[<ffffffff8118b910>] ? pollwake+0x0/0x60
[<ffffffff8118b910>] ? pollwake+0x0/0x60
[<ffffffff8118b910>] ? pollwake+0x0/0x60
[<ffffffff8118b910>] ? pollwake+0x0/0x60
[<ffffffff8118b910>] ? pollwake+0x0/0x60
[<ffffffff8118b910>] ? pollwake+0x0/0x60
[<ffffffff8118b910>] ? pollwake+0x0/0x60
[<ffffffff8118c30a>] ? core_sys_select+0x18a/0x2c0
[<ffffffff81090a10>] ? autoremove_wake_function+0x0/0x40
[<ffffffff81277141>] ? __clear_user+0x21/0x70
[<ffffffff811770e2>] ? do_readv_writev+0x162/0x1f0
[<ffffffff81012b59>] ? read_tsc+0x9/0x20
[<ffffffff8109b629>] ? ktime_get_ts+0xa9/0xe0
[<ffffffff810d41b5>] ? __audit_syscall_exit+0x265/0x290
[<ffffffff8118c697>] ? sys_select+0x47/0x110
[<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
hostname ----- airy
component ----- kernel
reason ----- WARNING: at kernel/sched.c:5914 thread_return+0x232/0x79d() (Not tainted)
cmdline ----- ro root=/dev/mapper/vg_iprctmp9-lv_root rd_LVM_LV=vg_iprctmp9/lv_root rd_LVM_LV=vg_iprctmp9/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us crashkernel=128M rhgb quiet
kernel_tainted_short ----- ---------W
analyzer ----- Kerneloops
time ----- 1326399400
os_release ----- CentOS release 6.2 (Final)
-- Q: Why should this email be 5 sentences or less? A: http://five.sentenc.es IPRC-help FAQ: http://auana1.soest.hawaii.edu/wiki/index.php/Faq
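The kernel_tainted and time fields in the report above are plain numbers that can be decoded by hand. The Python sketch below is an illustration under stated assumptions, not part of the original email: the flag table covers taint bits 0 to 10 as documented for kernels of this era (newer kernels define more), the function names are made up for the example, and the UTC-10 offset is a guess at the machine's timezone based on the hawaii.edu address in the signature.

```python
from datetime import datetime, timedelta, timezone

# Taint bits 0-10 (bit number, letter, meaning). Assumed sufficient for a
# 2.6.32-era kernel; later kernels add further bits not listed here.
TAINT_FLAGS = [
    (0, "P", "proprietary module was loaded"),
    (1, "F", "module was force-loaded"),
    (2, "S", "SMP with CPUs not designed for SMP"),
    (3, "R", "module was force-unloaded"),
    (4, "M", "machine check exception occurred"),
    (5, "B", "bad page reference or unexpected page flags"),
    (6, "U", "taint requested by userspace"),
    (7, "D", "kernel died recently (OOPS or BUG)"),
    (8, "A", "ACPI table overridden"),
    (9, "W", "kernel issued a warning"),
    (10, "C", "staging driver was loaded"),
]


def decode_taint(value):
    """Return (letter, meaning) for every taint bit set in value."""
    return [(letter, meaning)
            for bit, letter, meaning in TAINT_FLAGS
            if value & (1 << bit)]


def report_time(epoch, utc_offset_hours=0):
    """Format the report's epoch timestamp at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch, tz).strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(decode_taint(512))
    # utc_offset_hours=-10 is an assumption (Hawaii Standard Time).
    print(report_time(1326399400, utc_offset_hours=-10))
```

Decoded this way, 512 is bit 9 alone, the W flag, which agrees with the kernel_tainted_short and kernel_tainted_long fields: the only taint recorded is the warning itself. Under the assumed offset the timestamp falls one second after the Jan 12 trace in the log and matches the time in the abrt dump directory name.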
Am 14.01.2012 00:16, schrieb Thomas Burns:
I have a machine with a recent install of centos (2.6.32-220.2.1.el6.x86_64 kernel). It crashed 3 times this week, and sent emails like the one below complaining about kernel taint. I've gone back to a previous kernel to see if that helps, but otherwise I don't know how to investigate this. What should I do?
https://access.redhat.com/kb/docs/DOC-68014
Ignore it.
Alexander
Thanks for the reply, Alexander.
On Fri, Jan 13, 2012 at 1:55 PM, Alexander Dalloz ad+lists@uni-x.org wrote:
https://access.redhat.com/kb/docs/DOC-68014
Ignore it.
When I follow that link, I get: "The resource you requested is available exclusively to Red Hat customers with an active Red Hat or JBoss subscription."
Should I ignore it because it is a known bug that will soon be fixed? Why all the secrecy?
Dave
On 14/01/12 11:41 AM, Thomas Burns wrote:
When I follow that link, I get: "The resource you requested is available exclusively to Red Hat customers with an active Red Hat or JBoss subscription."
Should I ignore it because it is a known bug that will soon be fixed? Why all the secrecy?
Can't help with the secrecy question, but the relevant text from the linked document is copied below. Looks like it's already been removed from Fedora.
Cheers -pete
*** SNIPPED TEXT BELOW ***
Environment
-Red Hat Enterprise Linux 6
-kernel-2.6.32-220.2.1.el6
Resolution
-No action necessary. Red Hat may remove the harmless WARN_ON_ONCE() call from the kernel in a future kernel errata.
Root Cause
-When a system encounters this issue it will only print the warning once.
-There are no adverse effects on a system that encounters this warning.
-This is resolved upstream by removing the WARN_ON_ONCE() from sched().
Are we sure this is the same problem?
On Fri, Jan 13, 2012 at 9:02 PM, Peter Brady pdbrady@ans.com.au wrote:
Can't help with the secrecy question, but the relevant text from the linked document is copied below. Looks like it's already been removed from Fedora.
Cheers -pete
*** SNIPPED TEXT BELOW ***
Environment
-Red Hat Enterprise Linux 6
-kernel-2.6.32-220.2.1.el6
Resolution
-No action necessary. Red Hat may remove the harmless WARN_ON_ONCE() call from the kernel in a future kernel errata.
Root Cause
-When a system encounters this issue it will only print the warning once.
-There are no adverse effects on a system that encounters this warning.
-This is resolved upstream by removing the WARN_ON_ONCE() from sched().
That sounds like a harmless, extraneous warning message. But I get an email sent to root, some stuff in the log, and then the system crashes. "No adverse effects"?
So ... this does not help me understand what is wrong and what I am supposed to do (apparently nothing?). What process did you go through to find this answer? I appreciate you doing my work for me, but I'd appreciate it even more if you gave me some hints how to figure this out myself next time.
mahalo, Dave