I'm having an occasional problem with a box. It's a Supermicro 16-core Xeon, running CentOS 6.3 with kernel 2.6.32-279.el6.x86_64, 96 gigs of RAM, and an Areca 1882ix-24 RAID controller with 24 disks, 23 in RAID6 plus a hot spare. The RAID is divided into 3 partitions, two of 25 TB plus one for the rest.
Lately, I've noticed sporadic hangs on writing to the RAID, which "resolve" themselves with the following message in dmesg:
INFO: task jbd2/sdb2-8:3607 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/sdb2-8   D 000000000000000a     0  3607      2 0x00000080
 ffff881055d03d20 0000000000000046 ffff881055d03ca0 ffffffff811ada77
 ffff8810552e2400 ffff88109c6566e8 0000000000002f38 ffff8810546e1540
 ffff8810546e1af8 ffff881055d03fd8 000000000000fb88 ffff8810546e1af8
Call Trace:
 [<ffffffff811ada77>] ? __set_page_dirty+0x87/0xf0
 [<ffffffff810923be>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa01e881f>] jbd2_journal_commit_transaction+0x19f/0x14b0 [jbd2]
 [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
 [<ffffffff8107e00c>] ? lock_timer_base+0x3c/0x70
 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa01eef78>] kjournald2+0xb8/0x220 [jbd2]
 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa01eeec0>] ? kjournald2+0x0/0x220 [jbd2]
 [<ffffffff81091d66>] kthread+0x96/0xa0
 [<ffffffff8100c14a>] child_rip+0xa/0x20
 [<ffffffff81091cd0>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20
I get two of these messages in close succession, then things proceed normally. It doesn't happen often, and only under load. I'm not certain, but I think it happens just once per boot: once it has occurred, it doesn't seem to repeat until the machine is rebooted.
Things otherwise seem fine: the other logs, the RAID controller's event log, and its physical disk status and health reports are all normal. After getting this error once, re-writing the exact same file to the filesystem produces no errors and works normally, so I doubt it's a physical problem with the drives.
Anyone seen anything like this before? It's not very frequent, but it's very annoying.
On 03/26/2013 02:01 PM, Joakim Ziegler wrote:
> Anyone seen anything like this before? It's not very frequent, but it's very annoying.
I haven't, but the first thing I'd do in the situation you describe is update the firmware on the RAID card.
I looked around at other discussions of the same error, and it looks like some people have resolved it by switching to the deadline scheduler:
# echo noop > /sys/block/sdb/queue/scheduler
...but I wouldn't do that unless you update the firmware and still see the error.
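Before changing anything, it's worth checking which schedulers the kernel offers for that device and which one is currently active (the active one is shown in brackets); assuming the volume really is sdb:

# cat /sys/block/sdb/queue/scheduler

If you do end up switching, the choice can be made to survive reboots by adding elevator=deadline (or whichever scheduler you settle on) to the kernel line in /boot/grub/grub.conf.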
OK, I'll try the firmware first. I saw some talk of the scheduler, but I wasn't sure it applied in my case. By the way, doesn't the command you included switch to the noop scheduler? Shouldn't it be "echo deadline"?
Thanks, just checking. :)
I've had a chance to test this a bit more. Updating the firmware on the controller had no effect, so I tried changing the scheduler, but that doesn't seem to help either. I have confirmed, however, that this happens exactly once per boot. I can test it by running something like:
dd if=/dev/zero of=test1.bin bs=16777216 count=1024 conv=sync
on one of the partitions on the RAID. It'll run for a little while, then pause; at that point the file is always exactly 9913094144 bytes (close enough to 10% of the system RAM that I think it's hitting some sort of cache flush threshold), and it spits this out in dmesg:
INFO: task jbd2/sdb2-8:3517 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/sdb2-8   D 0000000000000011     0  3517      2 0x00000000
 ffff881847497d20 0000000000000046 0000000000000000 ffff881847497ce4
 ffff881000000000 ffff88107fc2a600 ffff880028216680 0000000000000400
 ffff8818654e2638 ffff881847497fd8 000000000000fb88 ffff8818654e2638
Call Trace:
 [<ffffffffa01e681f>] jbd2_journal_commit_transaction+0x19f/0x14b0 [jbd2]
 [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
 [<ffffffff8107e00c>] ? lock_timer_base+0x3c/0x70
 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa01ecf78>] kjournald2+0xb8/0x220 [jbd2]
 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa01ecec0>] ? kjournald2+0x0/0x220 [jbd2]
 [<ffffffff81091d66>] kthread+0x96/0xa0
 [<ffffffff8100c14a>] child_rip+0xa/0x20
 [<ffffffff81091cd0>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20
INFO: task dd:5837 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dd            D 0000000000000000     0  5837   5744 0x00000000
 ffff881822587740 0000000000000082 0000000000000000 0000000000000011
 00000040ffffffff 0000000000000000 ffff88000003bc68 0000000267b7b578
 ffff88186383b098 ffff881822587fd8 000000000000fb88 ffff88186383b098
Call Trace:
 [<ffffffff814ffd95>] rwsem_down_failed_common+0x95/0x1d0
 [<ffffffff81161e5e>] ? kmem_getpages+0x15e/0x170
 [<ffffffff814fff26>] rwsem_down_read_failed+0x26/0x30
 [<ffffffff8127e534>] call_rwsem_down_read_failed+0x14/0x30
 [<ffffffff814ff424>] ? down_read+0x24/0x30
 [<ffffffffa0217c96>] ext4_get_blocks+0x56/0x2a0 [ext4]
 [<ffffffffa0217f67>] ext4_da_get_block_prep+0x87/0x3c0 [ext4]
 [<ffffffff811ad226>] ? alloc_page_buffers+0x86/0xf0
 [<ffffffff811afe1b>] __block_prepare_write+0x1db/0x570
 [<ffffffffa0217ee0>] ? ext4_da_get_block_prep+0x0/0x3c0 [ext4]
 [<ffffffff811b048c>] block_write_begin_newtrunc+0x5c/0xd0
 [<ffffffff811b0893>] block_write_begin+0x43/0x90
 [<ffffffffa0217ee0>] ? ext4_da_get_block_prep+0x0/0x3c0 [ext4]
 [<ffffffffa021ca2d>] ext4_da_write_begin+0xed/0x200 [ext4]
 [<ffffffffa0217ee0>] ? ext4_da_get_block_prep+0x0/0x3c0 [ext4]
 [<ffffffff81114ab3>] generic_file_buffered_write+0x123/0x2e0
 [<ffffffffa0216fdf>] ? ext4_dirty_inode+0x4f/0x60 [ext4]
 [<ffffffff81116450>] __generic_file_aio_write+0x250/0x480
 [<ffffffff8105b483>] ? perf_event_task_sched_out+0x33/0x80
 [<ffffffff811166ef>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa0211131>] ext4_file_write+0x61/0x1e0 [ext4]
 [<ffffffff8117ad6a>] do_sync_write+0xfa/0x140
 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8127e842>] ? __clear_user+0x42/0x70
 [<ffffffff8127e821>] ? __clear_user+0x21/0x70
 [<ffffffff81213136>] ? security_file_permission+0x16/0x20
 [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
 [<ffffffff8117ba81>] sys_write+0x51/0x90
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Then it hangs a little more, repeats the exact same error in dmesg, hangs a bit more, repeats the same error, and then finishes quickly.
After this, the filesystems are fine: writing more 16 GB files finishes in about 13 seconds each, which matches the write speed I've benchmarked on this RAID before, and everything stays stable for at least a day or two with no problems.
Two interesting data points:
- Changing the scheduler to "deadline" seems to have had no practical effect, other than making the error appear three times before the problem fixes itself. When I was using cfq, the error message repeated twice, not three times, and it seemed to resolve itself a bit faster.
- The whole time the write process is hanging, the activity LEDs on the disk array show intense activity, at least as intense as when the file is actually being written. So it seems to me it isn't "hanging" as such; it's doing something, I just don't know what.
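One more thing I may try: the stall point being close to 10% of RAM makes me want to look at the kernel's dirty page writeback thresholds. Checking the current values (I believe the defaults on this kernel are 10 and 20 percent of RAM):

# sysctl vm.dirty_background_ratio vm.dirty_ratio

and, purely as an experiment, starting background writeback much earlier (these numbers are just a guess on my part):

# sysctl -w vm.dirty_background_ratio=2
# sysctl -w vm.dirty_ratio=10

might show whether the hang is just a huge pile of dirty pages being flushed all at once.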
On 03/30/2013 06:33 AM, Joakim Ziegler wrote:
> on one of the partitions on the RAID. It'll run for a little while, then pause; at that point the file is always exactly 9913094144 bytes (close enough to 10% of the system RAM that I think it's hitting some sort of cache flush threshold), and it spits this out in dmesg:
How much RAM does the RAID card have?
4GB.
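A 16 GB test file is well beyond what 4 GB of controller cache can absorb, so one way to see what the box is actually doing during the apparent hang might be to watch the array while the dd runs, for example with iostat from the sysstat package:

# iostat -xm 1

If the write throughput and await columns for sdb stay busy through the stall, the box is presumably still draining dirty pages and controller cache rather than being truly wedged.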