PowerEdge 2850 with PERC SCSI RAID controller.
16GB RAM
Dual Intel(R) Xeon(TM) CPU 3.00GHz
Using megaraid driver and xfs on 3x 300GB SCSI disks as /dev/sdb (total 572GB) (RAID-5)

Stock CentOS 6.4 x64, everything updated.
Kernel: 2.6.32-358.2.1.el6.x86_64 #1 SMP
My load average always shows:

top - 10:30:21 up 23:09, 1 users, load average: 0.99, 0.96, 0.79

yet there are no services whatsoever doing anything.

Using PowerTOP I can see that xfsaild is causing the second most wake-ups (after kernel core : hrtimer_start (tick_sched_timer)):

< Detailed C-state information is not
P-states (frequencies)
  3.00 Ghz   100.0%
  1500 Mhz     0.0%
  1125 Mhz     0.0%
   750 Mhz     0.0%
   375 Mhz     0.0%

Wakeups-from-idle per second : 125.1    interval: 10.0s
no ACPI power usage estimate available

Top causes for wake-ups:
  47.7% (119.6)   <kernel core> : hrtimer_start (tick_sched_timer)
  39.9% (100.0)   xfsaild/sdb1 : xfsaild (process_timeout)
Again, using TOP I can see that xfsaild is stuck in D state. It never changes.
10050 root 20 0 0 0 0 D 0.0 0.0 0:01.17 xfsaild/sdb1
The only ways I have found to fix this are rebooting the machine, or unmounting the volume and mounting it again. I have rsynced data between this server and another one twice. The rsync process takes about 30 minutes to 1 hour. After the rsync operation I see that xfsaild is stuck in D state and my load is near 1.00.
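A load near 1.00 on an otherwise idle machine is consistent with how Linux computes load average: tasks in uninterruptible sleep (state D) are counted along with runnable ones, so one kernel thread stuck in D adds roughly 1.0. The following is a minimal Python sketch for listing such tasks, assuming a Linux /proc filesystem. The function names are made up for illustration and are not part of any tool mentioned in this thread.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')' instead of splitting on spaces.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren].strip())
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, IOError, ValueError, IndexError):
            continue  # process exited, or the line had an unexpected shape
        if state == "D":
            found.append((pid, comm))
    return found
```

On a machine showing the symptom described here, calling uninterruptible_tasks() repeatedly would be expected to keep returning the xfsaild thread. A task that only appears briefly in the list is more likely doing ordinary I/O.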
I've had no problems whatsoever on CentOS 6.3 or earlier versions on other servers.
Any thoughts? Any information or help would be much appreciated.
Thanks in advance.
Best regards,
Svavar
Have a look at the thread starting at:
http://lists.centos.org/pipermail/centos/2013-March/132931.html
There is a fix in the latest centosplus kernel
James Pearson
Thank you James. I installed the latest centosplus kernel, and it fixed it.
Looks like this problem will be fixed in kernel-2.6.32-358.5.1.el6
https://bugzilla.redhat.com/show_bug.cgi?id=921958
Thanks a lot.
Svavar
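A rough Python sketch for checking whether a running kernel release predates the version named above. It assumes the 2.6.32-358.5.1.el6 figure quoted in this thread is where the fix lands, and it uses a naive numeric comparison rather than RPM's real version-ordering rules. The names are hypothetical. It cannot distinguish a centosplus build carrying the fix from a stock build with the same numbers, and it says nothing about whether an older release (such as a CentOS 6.3 kernel) is affected at all.

```python
import re

# Assumption taken from this thread, not verified against an errata notice.
EXPECTED_FIX = "2.6.32-358.5.1.el6"


def kernel_release_key(release):
    """Turn '2.6.32-358.2.1.el6.x86_64' into (2, 6, 32, 358, 2, 1)."""
    # Take the first run of digits joined by '.' or '-'; this stops before '.el6'.
    numeric = re.search(r"[0-9]+(?:[.-][0-9]+)*", release)
    if numeric is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) for part in re.split(r"[.-]", numeric.group(0)))


def predates_expected_fix(release, fixed_in=EXPECTED_FIX):
    """True if release sorts numerically before the expected fixed version."""
    return kernel_release_key(release) < kernel_release_key(fixed_in)
```

The release string to pass in is what `uname -r` prints, for example predates_expected_fix("2.6.32-358.2.1.el6.x86_64") for the kernel in the original report.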
It's a kernel bug and can safely be ignored.
Indeed. We have the same issue. As a workaround, we have stuck with a previous kernel release that we know doesn't have the bug. Each maintenance window we revisit this, and once we're sure the bug is solved we will remove the workaround. I really feel for ya and was just pointing out that, other than the inflated load average from it spinning its wheels, there is no fallout from it. ;)
----- Original Message -----
| James A. Peltier [jpeltier@sfu.ca] wrote:
| >
| > It's a kernel bug and can safely be ignored.
|
| Unfortunately it causes problems on systems that use load average as
| a metric - so can't be ignored by everyone :-)
|
| James Pearson