[CentOS] Odd INFO "120 seconds" in logs for 2.6.18-194.3.1

Tue Jun 8 19:08:38 UTC 2010
Dianne Yumul <dianne at wellsgaming.com>

Hello,

I'm getting the same thing on one of our servers since upgrading to CentOS 5.5:

INFO: task pdflush:21249 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
pdflush       D 00001EE1  3540 21249     11               21226 (L-TLB)
       f0b59f24 00000046 af2b024f 00001ee1 c04ec500 00000000 c041e314 0000000a 
       f76ae550 af2b2566 00001ee1 00002317 00000001 f76ae65c c180dc44 c198e200 
       00000000 c180e5e4 f0b59fa8 f76ae550 c180dc44 c061bbc0 f76ae550 ffffffff 
Call Trace:
 [<c04ec500>] __next_cpu+0x12/0x21
 [<c041e314>] find_busiest_group+0x177/0x462
 [<c061bbc0>] schedule+0xbc/0xa55
 [<c061d981>] rwsem_down_read_failed+0x128/0x143
 [<c04389ef>] .text.lock.rwsem+0x35/0x3a
 [<c047c014>] sync_supers+0x2f/0xb8
 [<c045df9c>] wb_kupdate+0x36/0x10f
 [<c045e431>] pdflush+0x0/0x1a3
 [<c045e53c>] pdflush+0x10b/0x1a3
 [<c045df66>] wb_kupdate+0x0/0x10f
 [<c0435f43>] kthread+0xc0/0xed
 [<c0435e83>] kthread+0x0/0xed
 [<c0405c53>] kernel_thread_helper+0x7/0x10

Judging from the bugs already filed, it seems to affect many (perhaps any) processes, and some people report hangs and performance drops.  Our system seems okay, probably because it has low traffic and is mostly idle, but I'll still reboot into the previous kernel version tonight.
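
To see which tasks hit the timeout, the INFO lines can be pulled out of the log. Here is a minimal sketch in C, assuming the message format is exactly the one in the log above. The name parse_hung_task is made up for illustration and is not a kernel or libc API, and the sketch assumes the task name contains no colon.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a kernel or libc API): extract the task name,
 * PID and timeout from a hung-task line such as
 *   INFO: task pdflush:21249 blocked for more than 120 seconds.
 * A syslog prefix in front of "INFO:" is tolerated.
 * Assumption: the task name contains no ':' and fits in 15 characters.
 * Returns 1 on a match, 0 otherwise. */
int parse_hung_task(const char *line, char name[16], int *pid, int *secs)
{
    const char *p;

    assert(line != NULL && name != NULL && pid != NULL && secs != NULL);

    p = strstr(line, "INFO: task ");
    if (p == NULL)
        return 0;

    /* %15[^:] reads the name up to the colon, then the PID and timeout. */
    return sscanf(p, "INFO: task %15[^:]:%d blocked for more than %d seconds",
                  name, pid, secs) == 3;
}
```

Calling this on every line read from the system log with fgets() and counting the matches per task name would show whether only pdflush is affected or other processes as well.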

dianne

On Jun 8, 2010, at 1:04 AM, Ireneusz Piasecki wrote:

>  On 2010-06-08 09:54, Tsuyoshi Nagata wrote:
>> Hi
>> (2010/06/08 5:12), Steve Brooks wrote:
>>> Jun  7 19:45:21 sraid3 kernel:  [<ffffffff800ec2a2>] inode_wait+0x0/0xd
>>> Jun  7 19:45:21 sraid3 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
>>> Jun  7 19:45:21 sraid3 kernel:  [<ffffffff800a0aec>] wake_bit_function+0x0/0x23
>>> Jun  7 19:45:21 sraid3 kernel:  [<ffffffff8003dbbf>] ifind_fast+0x6e/0x83
>> This trace points to ifind_fast() in Linux/fs/inode.c.
>> The source code is below:
>> 
>> Linux/fs/inode.c:
>> static struct inode *ifind_fast(struct super_block *sb,
>>                 struct hlist_head *head, unsigned long ino)
>> {
>>         struct inode *inode;
>>
>>         /* LOCK: the next two lines take more than 120 s */
>>         spin_lock(&inode_lock);
>>         inode = find_inode_fast(sb, head, ino);
>>         if (inode) {
>>                 __iget(inode);
>>                 spin_unlock(&inode_lock);       /* UNLOCK */
>>                 wait_on_inode(inode);
>>                 return inode;
>>         }
>>         spin_unlock(&inode_lock);
>>         return NULL;
>> }
>> 
>> I guess your file system has trouble with i-node (file number) resources.
>> Possible causes:
>>        Hard disk trouble (bit errors, RAID trouble)
>>        i-node trouble (overflow, etc.)
>>        Memory/CPU trouble (&inode_lock)
>> 
>> Buying fresh hard disks and rebuilding them is the convenient way.
>> Or memtest86 can find DIMM trouble (or CPU or motherboard trouble).
>> Or it is an ext4 bug in the 194.3.1 kernel; go back to ext3!
>> 
> OK, then I will test all of my CentOS 5.5 32 nodes: CPU, RAM, disks, etc.
> This came with the CentOS 5.5 kernel. Before that there were no such
> errors/warnings. Red Hat Bugzilla:
> https://bugzilla.redhat.com/show_bug.cgi?id=573106
> 
> I.Piasecki
> 
>> -tsuyoshi
>> _______________________________________________
>> CentOS mailing list
>> CentOS at centos.org
>> http://lists.centos.org/mailman/listinfo/centos
>> 