Hello Everyone,
Since rebooting my CentOS 6.10 OpenVZ server "daisy" yesterday, I have been getting horrible system performance. /var/log/messages is full of "HDIO_GET_IDENTITY failed for '/dev/sdb'" messages. The latest entries look like this:
Apr 22 08:51:32 daisy kernel: [141224.655699] CT: 1005: stopped
Apr 22 08:55:04 daisy ata_id[21513]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:00:05 daisy ata_id[21584]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:05:02 daisy ata_id[21644]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:10:01 daisy ata_id[22282]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:11:49 daisy kernel: [142441.721065] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:11:49 daisy kernel: [142441.721083] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:11:49 daisy kernel: [142441.721093] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:11:49 daisy kernel: [142441.721109] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:11:49 daisy kernel: [142441.721115] ffff88006654bcb8 0000000000000086 ffffffff8114f130 ffff88002821fa40
Apr 22 09:11:49 daisy kernel: [142441.721121] ffff88000004d238 ffff88006654bd70 ffff88006654bc88 ffffea00016ab7c0
Apr 22 09:11:49 daisy kernel: [142441.721125] ffff88011a707000 ffff880028321168 000000000001b7ea 0000816be9b3faa2
Apr 22 09:11:49 daisy kernel: [142441.721130] Call Trace:
Apr 22 09:11:49 daisy kernel: [142441.721139] [<ffffffff8114f130>] ? sync_page+0x0/0x50
Apr 22 09:11:49 daisy kernel: [142441.721144] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:11:49 daisy kernel: [142441.721149] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:11:49 daisy kernel: [142441.721155] [<ffffffff81067432>] ? check_preempt_curr+0x82/0xa0
Apr 22 09:11:49 daisy kernel: [142441.721159] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:11:49 daisy kernel: [142441.721162] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:11:49 daisy kernel: [142441.721167] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:11:49 daisy kernel: [142441.721172] [<ffffffff811f98d8>] sync_inodes_sb_ub+0xa8/0x1d0
Apr 22 09:11:49 daisy kernel: [142441.721176] [<ffffffff8114fa6f>] ? filemap_fdatawait+0x2f/0x40
Apr 22 09:11:49 daisy kernel: [142441.721181] [<ffffffff81200f85>] __sync_filesystem+0x95/0xa0
Apr 22 09:11:49 daisy kernel: [142441.721184] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:11:49 daisy kernel: [142441.721188] [<ffffffff812016e5>] sys_sync+0x155/0x1a0
Apr 22 09:11:49 daisy kernel: [142441.721192] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:13:49 daisy kernel: [142561.721069] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:13:49 daisy kernel: [142561.721087] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:13:49 daisy kernel: [142561.721096] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:13:49 daisy kernel: [142561.721112] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:13:49 daisy kernel: [142561.721118] ffff88006654bcb8 0000000000000086 ffffffff8114f130 ffff88002821fa40
Apr 22 09:13:49 daisy kernel: [142561.721123] ffff88000004d238 ffff88006654bd70 ffff88006654bc88 ffffea00016ab7c0
Apr 22 09:13:49 daisy kernel: [142561.721128] ffff88011a707000 ffff880028321168 000000000001b7ea 0000816be9b3faa2
Apr 22 09:13:49 daisy kernel: [142561.721133] Call Trace:
Apr 22 09:13:49 daisy kernel: [142561.721142] [<ffffffff8114f130>] ? sync_page+0x0/0x50
Apr 22 09:13:49 daisy kernel: [142561.721148] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:13:49 daisy kernel: [142561.721153] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:13:49 daisy kernel: [142561.721158] [<ffffffff81067432>] ? check_preempt_curr+0x82/0xa0
Apr 22 09:13:49 daisy kernel: [142561.721162] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:13:49 daisy kernel: [142561.721166] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:13:49 daisy kernel: [142561.721170] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:13:49 daisy kernel: [142561.721176] [<ffffffff811f98d8>] sync_inodes_sb_ub+0xa8/0x1d0
Apr 22 09:13:49 daisy kernel: [142561.721180] [<ffffffff8114fa6f>] ? filemap_fdatawait+0x2f/0x40
Apr 22 09:13:49 daisy kernel: [142561.721184] [<ffffffff81200f85>] __sync_filesystem+0x95/0xa0
Apr 22 09:13:49 daisy kernel: [142561.721188] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:13:49 daisy kernel: [142561.721192] [<ffffffff812016e5>] sys_sync+0x155/0x1a0
Apr 22 09:13:49 daisy kernel: [142561.721196] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:15:06 daisy ata_id[22299]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:15:49 daisy kernel: [142681.721085] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:15:49 daisy kernel: [142681.721104] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:15:49 daisy kernel: [142681.721113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:15:49 daisy kernel: [142681.721129] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:15:49 daisy kernel: [142681.721136] ffff88006654bcb8 0000000000000086 ffffffff8114f130 ffff88002821fa40
Apr 22 09:15:49 daisy kernel: [142681.721141] ffff88000004d238 ffff88006654bd70 ffff88006654bc88 ffffea00016ab7c0
Apr 22 09:15:49 daisy kernel: [142681.721146] ffff88011a707000 ffff880028321168 000000000001b7ea 0000816be9b3faa2
Apr 22 09:15:49 daisy kernel: [142681.721150] Call Trace:
Apr 22 09:15:49 daisy kernel: [142681.721160] [<ffffffff8114f130>] ? sync_page+0x0/0x50
Apr 22 09:15:49 daisy kernel: [142681.721166] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:15:49 daisy kernel: [142681.721172] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:15:49 daisy kernel: [142681.721178] [<ffffffff81067432>] ? check_preempt_curr+0x82/0xa0
Apr 22 09:15:49 daisy kernel: [142681.721182] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:15:49 daisy kernel: [142681.721185] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:15:49 daisy kernel: [142681.721190] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:15:49 daisy kernel: [142681.721196] [<ffffffff811f98d8>] sync_inodes_sb_ub+0xa8/0x1d0
Apr 22 09:15:49 daisy kernel: [142681.721200] [<ffffffff8114fa6f>] ? filemap_fdatawait+0x2f/0x40
Apr 22 09:15:49 daisy kernel: [142681.721204] [<ffffffff81200f85>] __sync_filesystem+0x95/0xa0
Apr 22 09:15:49 daisy kernel: [142681.721208] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:15:49 daisy kernel: [142681.721212] [<ffffffff812016e5>] sys_sync+0x155/0x1a0
Apr 22 09:15:49 daisy kernel: [142681.721217] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:17:49 daisy kernel: [142801.721064] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:17:49 daisy kernel: [142801.721082] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:17:49 daisy kernel: [142801.721091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:17:49 daisy kernel: [142801.721107] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:17:49 daisy kernel: [142801.721114] ffff88006654bcb8 0000000000000086 ffffffff8114f130 ffff88002821fa40
Apr 22 09:17:49 daisy kernel: [142801.721119] ffff88000004d238 ffff88006654bd70 ffff88006654bc88 ffffea00016ab7c0
Apr 22 09:17:49 daisy kernel: [142801.721124] ffff88011a707000 ffff880028321168 000000000001b7ea 0000816be9b3faa2
Apr 22 09:17:49 daisy kernel: [142801.721128] Call Trace:
Apr 22 09:17:49 daisy kernel: [142801.721137] [<ffffffff8114f130>] ? sync_page+0x0/0x50
Apr 22 09:17:49 daisy kernel: [142801.721143] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:17:49 daisy kernel: [142801.721149] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:17:49 daisy kernel: [142801.721154] [<ffffffff81067432>] ? check_preempt_curr+0x82/0xa0
Apr 22 09:17:49 daisy kernel: [142801.721158] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:17:49 daisy kernel: [142801.721162] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:17:49 daisy kernel: [142801.721166] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:17:49 daisy kernel: [142801.721172] [<ffffffff811f98d8>] sync_inodes_sb_ub+0xa8/0x1d0
Apr 22 09:17:49 daisy kernel: [142801.721176] [<ffffffff8114fa6f>] ? filemap_fdatawait+0x2f/0x40
Apr 22 09:17:49 daisy kernel: [142801.721180] [<ffffffff81200f85>] __sync_filesystem+0x95/0xa0
Apr 22 09:17:49 daisy kernel: [142801.721184] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:17:49 daisy kernel: [142801.721188] [<ffffffff812016e5>] sys_sync+0x155/0x1a0
Apr 22 09:17:49 daisy kernel: [142801.721192] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:20:01 daisy ata_id[22405]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:21:49 daisy kernel: [143041.721494] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:21:49 daisy kernel: [143041.721512] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:21:49 daisy kernel: [143041.721522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:21:49 daisy kernel: [143041.721691] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:21:49 daisy kernel: [143041.721697] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:21:49 daisy kernel: [143041.721702] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:21:49 daisy kernel: [143041.721706] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:21:49 daisy kernel: [143041.721711] Call Trace:
Apr 22 09:21:49 daisy kernel: [143041.721720] [<ffffffff810098af>] ? __switch_to+0x16f/0x470
Apr 22 09:21:49 daisy kernel: [143041.721726] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120
Apr 22 09:21:49 daisy kernel: [143041.721730] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:21:49 daisy kernel: [143041.721735] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:21:49 daisy kernel: [143041.721739] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:21:49 daisy kernel: [143041.721743] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:21:49 daisy kernel: [143041.721747] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:21:49 daisy kernel: [143041.721753] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:21:49 daisy kernel: [143041.721757] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:21:49 daisy kernel: [143041.721762] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0
Apr 22 09:21:49 daisy kernel: [143041.721765] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:21:49 daisy kernel: [143041.721769] [<ffffffff812016d8>] sys_sync+0x148/0x1a0
Apr 22 09:21:49 daisy kernel: [143041.721773] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:23:49 daisy kernel: [143161.721064] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:23:49 daisy kernel: [143161.721169] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:23:49 daisy kernel: [143161.721259] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:23:49 daisy kernel: [143161.721430] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:23:49 daisy kernel: [143161.721437] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:23:49 daisy kernel: [143161.721442] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:23:49 daisy kernel: [143161.721447] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:23:49 daisy kernel: [143161.721451] Call Trace:
Apr 22 09:23:49 daisy kernel: [143161.721460] [<ffffffff810098af>] ? __switch_to+0x16f/0x470
Apr 22 09:23:49 daisy kernel: [143161.721466] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120
Apr 22 09:23:49 daisy kernel: [143161.721470] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:23:49 daisy kernel: [143161.721475] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:23:49 daisy kernel: [143161.721479] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:23:49 daisy kernel: [143161.721483] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:23:49 daisy kernel: [143161.721487] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:23:49 daisy kernel: [143161.721493] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:23:49 daisy kernel: [143161.721498] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:23:49 daisy kernel: [143161.721502] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0
Apr 22 09:23:49 daisy kernel: [143161.721506] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:23:49 daisy kernel: [143161.721510] [<ffffffff812016d8>] sys_sync+0x148/0x1a0
Apr 22 09:23:49 daisy kernel: [143161.721514] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:25:02 daisy ata_id[22445]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:25:49 daisy kernel: [143281.721066] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:25:49 daisy kernel: [143281.721159] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:25:49 daisy kernel: [143281.721244] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:25:49 daisy kernel: [143281.721408] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:25:49 daisy kernel: [143281.721415] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:25:49 daisy kernel: [143281.721420] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:25:49 daisy kernel: [143281.721424] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:25:49 daisy kernel: [143281.721429] Call Trace:
Apr 22 09:25:49 daisy kernel: [143281.721438] [<ffffffff810098af>] ? __switch_to+0x16f/0x470
Apr 22 09:25:49 daisy kernel: [143281.721444] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120
Apr 22 09:25:49 daisy kernel: [143281.721448] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:25:49 daisy kernel: [143281.721453] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:25:49 daisy kernel: [143281.721457] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:25:49 daisy kernel: [143281.721461] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:25:49 daisy kernel: [143281.721465] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:25:49 daisy kernel: [143281.721471] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:25:49 daisy kernel: [143281.721476] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:25:49 daisy kernel: [143281.721480] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0
Apr 22 09:25:49 daisy kernel: [143281.721484] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:25:49 daisy kernel: [143281.721487] [<ffffffff812016d8>] sys_sync+0x148/0x1a0
Apr 22 09:25:49 daisy kernel: [143281.721492] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:27:49 daisy kernel: [143401.721072] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:27:49 daisy kernel: [143401.721165] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:27:49 daisy kernel: [143401.721253] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:27:49 daisy kernel: [143401.721421] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:27:49 daisy kernel: [143401.721427] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:27:49 daisy kernel: [143401.721432] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:27:49 daisy kernel: [143401.721436] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:27:49 daisy kernel: [143401.721441] Call Trace:
Apr 22 09:27:49 daisy kernel: [143401.721450] [<ffffffff810098af>] ? __switch_to+0x16f/0x470
Apr 22 09:27:49 daisy kernel: [143401.721456] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120
Apr 22 09:27:49 daisy kernel: [143401.721460] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:27:49 daisy kernel: [143401.721465] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:27:49 daisy kernel: [143401.721469] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:27:49 daisy kernel: [143401.721473] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:27:49 daisy kernel: [143401.721477] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:27:49 daisy kernel: [143401.721483] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:27:49 daisy kernel: [143401.721487] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:27:49 daisy kernel: [143401.721492] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0
Apr 22 09:27:49 daisy kernel: [143401.721495] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:27:49 daisy kernel: [143401.721499] [<ffffffff812016d8>] sys_sync+0x148/0x1a0
Apr 22 09:27:49 daisy kernel: [143401.721503] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:29:49 daisy kernel: [143521.721059] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:29:49 daisy kernel: [143521.721158] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:29:49 daisy kernel: [143521.721245] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:29:49 daisy kernel: [143521.721415] hdparm D ffff88000c778300 0 22246 20845 0 0x00000084
Apr 22 09:29:49 daisy kernel: [143521.721421] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:29:49 daisy kernel: [143521.721426] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:29:49 daisy kernel: [143521.721431] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:29:49 daisy kernel: [143521.721436] Call Trace:
Apr 22 09:29:49 daisy kernel: [143521.721445] [<ffffffff810098af>] ? __switch_to+0x16f/0x470
Apr 22 09:29:49 daisy kernel: [143521.721451] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120
Apr 22 09:29:49 daisy kernel: [143521.721455] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:29:49 daisy kernel: [143521.721460] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:29:49 daisy kernel: [143521.721465] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:29:49 daisy kernel: [143521.721469] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:29:49 daisy kernel: [143521.721473] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:29:49 daisy kernel: [143521.721479] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:29:49 daisy kernel: [143521.721483] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:29:49 daisy kernel: [143521.721487] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0
Apr 22 09:29:49 daisy kernel: [143521.721491] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:29:49 daisy kernel: [143521.721495] [<ffffffff812016d8>] sys_sync+0x148/0x1a0
Apr 22 09:29:49 daisy kernel: [143521.721499] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:30:04 daisy ata_id[22489]: HDIO_GET_IDENTITY failed for '/dev/sdb'
------------------
I tried running hdparm -tT /dev/sda, but after waiting 5+ minutes for any command output I cancelled it.
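For anyone wanting to condense a log like the one above before posting, here is a minimal Python sketch that counts the ata_id failures per device and the hung-task reports per process. The two regular expressions are modelled only on the message formats visible in this excerpt, and the function name summarize is made up for illustration; other syslog layouts may need different patterns.

```python
import re
from collections import Counter

# Patterns modelled on the two message types in the excerpt (an assumption:
# other kernels or syslog formats may word these messages differently).
ATA_ID_RE = re.compile(r"ata_id\[\d+\]: HDIO_GET_IDENTITY failed for '([^']+)'")
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def summarize(lines):
    """Count ata_id failures per device and hung-task reports per (task, pid)."""
    ata_failures = Counter()
    hung_tasks = Counter()
    for line in lines:
        match = ATA_ID_RE.search(line)
        if match:
            ata_failures[match.group(1)] += 1
            continue
        match = HUNG_RE.search(line)
        if match:
            hung_tasks[(match.group(1), int(match.group(2)))] += 1
    return ata_failures, hung_tasks


if __name__ == "__main__":
    # Three lines copied from the excerpt; in practice, pass an open file
    # such as open("/var/log/messages") instead.
    sample = [
        "Apr 22 08:55:04 daisy ata_id[21513]: HDIO_GET_IDENTITY failed for '/dev/sdb'",
        "Apr 22 09:11:49 daisy kernel: [142441.721065] INFO: task hdparm:22246 blocked for more than 120 seconds.",
        "Apr 22 09:13:49 daisy kernel: [142561.721069] INFO: task hdparm:22246 blocked for more than 120 seconds.",
    ]
    ata, hung = summarize(sample)
    for device, count in ata.items():
        print(f"{device}: {count} HDIO_GET_IDENTITY failure(s)")
    for (task, pid), count in hung.items():
        print(f"{task} (pid {pid}): {count} hung-task report(s)")
```

In the excerpt, every hung-task report names the same process (hdparm, pid 22246), which a summary like this makes visible at a glance.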
I am rsyncing the data from this system over to another system now. Clearly something is wrong, but I can't tell what.
The system is an older dual-core AMD Opteron 180 with 4 GB of RAM and a RAID controller running RAID 5 across 4x 4 TB Western Digital drives.
I rebooted the system the day before yesterday, and that's when the timeout messages started pouring into the log.
When I run tw_cli /c8 show, all four drives report OK:

[root@daisy cron.daily]# tw_cli /c8 show

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK      -       -       256K    11175.8   Ri     ON

VPort  Status  Unit  Size     Type  Phy  Encl-Slot  Model
------------------------------------------------------------------------------
p0     OK      u0    3.63 TB  SATA  0    -          WDC WD4005FZBX-00K5
p1     OK      u0    3.63 TB  SATA  1    -          WDC WD4005FZBX-00K5
p2     OK      u0    3.63 TB  SATA  2    -          WDC WD4005FZBX-00K5
p3     OK      u0    3.63 TB  SATA  3    -          WDC WD4005FZBX-00K5
The logical volumes appear active:

[root@daisy cron.daily]# lvscan
  ACTIVE '/dev/vg_daisy/lv_root' [10.89 TiB] inherit
  ACTIVE '/dev/vg_daisy/lv_swap' [3.88 GiB] inherit
  ACTIVE '/dev/vg_daisy/lv_home' [20.00 GiB] inherit
[root@daisy cron.daily]#
[root@daisy cron.daily]# lvmdiskscan
  /dev/ram0             [ 16.00 MiB]
  /dev/root             [ 10.89 TiB]
  /dev/ram1             [ 16.00 MiB]
  /dev/sda1             [ 2.82 TiB]
  /dev/vg_daisy/lv_swap [ 3.88 GiB]
  /dev/ram2             [ 16.00 MiB]
  /dev/vg_daisy/lv_home [ 20.00 GiB]
  /dev/ram3             [ 16.00 MiB]
  /dev/sda3             [ 842.87 GiB]
  /dev/ram4             [ 16.00 MiB]
  /dev/ram5             [ 16.00 MiB]
  /dev/ram6             [ 16.00 MiB]
  /dev/ram7             [ 16.00 MiB]
  /dev/ram8             [ 16.00 MiB]
  /dev/ram9             [ 16.00 MiB]
  /dev/ram10            [ 16.00 MiB]
  /dev/ram11            [ 16.00 MiB]
  /dev/ram12            [ 16.00 MiB]
  /dev/ram13            [ 16.00 MiB]
  /dev/ram14            [ 16.00 MiB]
  /dev/ram15            [ 16.00 MiB]
  /dev/sdb1             [ 1.82 TiB] LVM physical volume
  /dev/sdc1             [ 500.00 MiB]
  /dev/sdc2             [ 4.00 TiB] LVM physical volume
  /dev/sdd1             [ 4.00 TiB] LVM physical volume
  /dev/sde1             [ 2.91 TiB] LVM physical volume
  3 disks
  19 partitions
  0 LVM physical volume whole disks
  4 LVM physical volumes
[root@daisy cron.daily]#
grub.conf:

[root@daisy grub]# cat grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/mapper/vg_daisy-lv_root
# initrd /initrd-[generic-]version.img
#boot=/dev/sdb
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title OpenVZ (2.6.32-042stab142.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab142.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab142.1.img
title OpenVZ (2.6.32-042stab141.3)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab141.3 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab141.3.img
title OpenVZ (2.6.32-042stab140.4)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab140.4 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab140.4.img
title OpenVZ (2.6.32-042stab140.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab140.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab140.1.img
title OpenVZ (2.6.32-042stab139.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab139.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab139.1.img
title CentOS 6 (2.6.32-754.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-754.el6.x86_64 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-754.el6.x86_64.img
-------------
Top is not showing anything out of the ordinary:
----------
[root@daisy grub]#
top - 09:41:57 up 1 day, 16:04, 3 users, load average: 5.89, 5.83, 5.43
Tasks: 369 total, 1 running, 368 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 1.2%sy, 0.0%ni, 25.0%id, 73.5%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 3894628k total, 3861280k used, 33348k free, 95608k buffers
Swap: 4063228k total, 34888k used, 4028340k free, 3139272k cached

  PID USER  PR  NI  VIRT   RES   SHR S %CPU %MEM     TIME+ COMMAND
 1266 root  20   0     0     0     0 D  1.0  0.0  12:27.75 flush-253:0
21041 1153  20   0  3188  1840  1012 D  0.7  0.0   0:00.72 imap
21599 97    20   0  5160  1940  1568 S  0.7  0.0   0:01.06 imap-login
22636 root  20   0 15272  1524   964 R  0.7  0.0   0:00.06 top
 1977 root  20   0  2096   644   360 S  0.3  0.0   0:27.92 dovecot
22528 97    20   0  5160  2044  1672 S  0.3  0.1   0:00.35 imap-login
22578 1155  20   0  2904  1528   940 D  0.3  0.0   0:00.22 imap
    1 root  20   0 19236   268   136 S  0.0  0.0   0:00.68 init
    2 root  20   0     0     0     0 S  0.0  0.0   0:00.00 kthreadd
    3 root  RT   0     0     0     0 S  0.0  0.0   0:00.04 migration/0
    4 root  20   0     0     0     0 S  0.0  0.0   0:01.88 ksoftirqd/0
    5 root  RT   0     0     0     0 S  0.0  0.0   0:00.00 stopper/0
    6 root  RT   0     0     0     0 S  0.0  0.0   0:00.19 watchdog/0
    7 root  RT   0     0     0     0 S  0.0  0.0   0:00.07 migration/1
    8 root  RT   0     0     0     0 S  0.0  0.0   0:00.00 stopper/1
    9 root  20   0     0     0     0 S  0.0  0.0   0:03.17 ksoftirqd/1
   10 root  RT   0     0     0     0 S  0.0  0.0   0:00.20 watchdog/1
   11 root  20   0     0     0     0 S  0.0  0.0   0:07.23 events/0
   12 root  20   0     0     0     0 S  0.0  0.0   0:08.55 events/1
   13 root  20   0     0     0     0 S  0.0  0.0   0:00.00 events/0
   14 root  20   0     0     0     0 S  0.0  0.0   0:00.00 events/1
   15 root  20   0     0     0     0 S  0.0  0.0   0:00.00 events_long/0
   16 root  20   0     0     0     0 S  0.0  0.0   0:00.00 events_long/1
   17 root  20   0     0     0     0 S  0.0  0.0   0:00.00 events_power_ef
   18 root  20   0     0     0     0 S  0.0  0.0   0:00.00 events_power_ef
   19 root  20   0     0     0     0 S  0.0  0.0   0:00.00 cgroup
   20 root  20   0     0     0     0 S  0.0  0.0   0:00.00 khelper
   21 root  20   0     0     0     0 S  0.0  0.0   0:00.01 netns
   22 root  20   0     0     0     0 S  0.0  0.0   0:00.00 async/mgr
   23 root  20   0     0     0     0 S  0.0  0.0   0:00.00 pm
   24 root  20   0     0     0     0 S  0.0  0.0   0:00.29 sync_supers
------------
This is a company production mail server, and I can't find the solution. I need help as soon as someone is able. Thank you!
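One detail in the top output above may be worth a second look: the Cpu(s) line reports 73.5%wa (I/O wait), and several processes are in state D (uninterruptible sleep). The following is only a sketch of how those two figures could be pulled out of captured top text in Python. The function names parse_cpu_line and blocked_processes are hypothetical, and the column positions assume the procps layout shown in this post (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND).

```python
import re

# Matches pairs such as "73.5%wa" in a procps "Cpu(s):" summary line.
CPU_FIELD_RE = re.compile(r"([\d.]+)%(\w+)")


def parse_cpu_line(line):
    """Return e.g. {'us': 0.2, 'wa': 73.5, ...} from a 'Cpu(s):' line."""
    _, _, rest = line.partition(":")
    return {name: float(value) for value, name in CPU_FIELD_RE.findall(rest)}


def blocked_processes(rows):
    """Return (pid, command) for rows whose state column (8th field) is 'D'."""
    blocked = []
    for row in rows:
        fields = row.split()
        # Assumed layout: the state is field 7, the command is field 11.
        if len(fields) >= 12 and fields[7] == "D":
            blocked.append((int(fields[0]), fields[11]))
    return blocked


if __name__ == "__main__":
    # Lines copied from the top output in this post.
    cpu = parse_cpu_line(
        "Cpu(s): 0.2%us, 1.2%sy, 0.0%ni, 25.0%id, 73.5%wa, 0.0%hi, 0.2%si, 0.0%st"
    )
    print("iowait:", cpu["wa"])
    rows = [
        "1266 root 20 0 0 0 0 D 1.0 0.0 12:27.75 flush-253:0",
        "21041 1153 20 0 3188 1840 1012 D 0.7 0.0 0:00.72 imap",
        "21599 97 20 0 5160 1940 1568 S 0.7 0.0 0:01.06 imap-login",
    ]
    for pid, command in blocked_processes(rows):
        print(f"pid {pid} ({command}) is in uninterruptible sleep")
```

A high wa value together with D-state processes generally means tasks are waiting on storage rather than on the CPU, which would fit the hung hdparm task in the kernel log.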
Hi,
You didn't tell us much about your hardware. It seems you're running a 3ware controller in it.
What exactly is /dev/sdb? I don't know 3ware, so I can only guess, but it looks to me like at least one of your drives is having problems.
Any chance you can check SMART status on the disks or have it do a SMART test? I guess your 3ware controller has to initiate this.
Regards, Simon
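As a rough illustration of the SMART check suggested above: smartmontools can address disks behind a 3ware controller with smartctl's "-d 3ware,N" device type, where N is the port number. The Python sketch below only builds and prints the command lines for the four ports listed by tw_cli (p0 to p3) and does not run them. The device node /dev/twa0 is an assumption; depending on the controller series the node may instead be /dev/twe0 or /dev/twl0, and the helper name smartctl_commands is made up for this example.

```python
import shlex


def smartctl_commands(ports, device="/dev/twa0", action="-a"):
    """Build one smartctl argv per 3ware port.

    device is an assumed controller node; check which of /dev/twe*,
    /dev/twa* or /dev/twl* exists on the actual system.
    action "-a" prints all SMART information for the disk.
    """
    return [["smartctl", action, "-d", f"3ware,{port}", device] for port in ports]


if __name__ == "__main__":
    # Ports 0-3 correspond to p0..p3 in the tw_cli output.
    for argv in smartctl_commands(range(4)):
        print(shlex.join(argv))
```

Passing action="-t short" style options would need an extra argument split, so this sketch sticks to the single-flag report form; the printed commands can be pasted into a root shell once the device node is confirmed.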
Apr 22 09:21:49 daisy kernel: [143041.721691] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:21:49 daisy kernel: [143041.721697] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:21:49 daisy kernel: [143041.721702] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:21:49 daisy kernel: [143041.721706] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:21:49 daisy kernel: [143041.721711] Call Trace:
Apr 22 09:21:49 daisy kernel: [143041.721720] [<ffffffff810098af>] ? __switch_to+0x16f/0x470
Apr 22 09:21:49 daisy kernel: [143041.721726] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120
Apr 22 09:21:49 daisy kernel: [143041.721730] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:21:49 daisy kernel: [143041.721735] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:21:49 daisy kernel: [143041.721739] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:21:49 daisy kernel: [143041.721743] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:21:49 daisy kernel: [143041.721747] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:21:49 daisy kernel: [143041.721753] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:21:49 daisy kernel: [143041.721757] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:21:49 daisy kernel: [143041.721762] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0
Apr 22 09:21:49 daisy kernel: [143041.721765] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:21:49 daisy kernel: [143041.721769] [<ffffffff812016d8>] sys_sync+0x148/0x1a0
Apr 22 09:21:49 daisy kernel: [143041.721773] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:23:49 daisy kernel: [143161.721064] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:23:49 daisy kernel: [143161.721169] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:23:49 daisy kernel: [143161.721259] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:23:49 daisy kernel: [143161.721430] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:23:49 daisy kernel: [143161.721437] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:23:49 daisy kernel: [143161.721442] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:23:49 daisy kernel: [143161.721447] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:23:49 daisy kernel: [143161.721451] Call Trace:
Apr 22 09:23:49 daisy kernel: [143161.721460] [<ffffffff810098af>] ? __switch_to+0x16f/0x470
Apr 22 09:23:49 daisy kernel: [143161.721466] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120
Apr 22 09:23:49 daisy kernel: [143161.721470] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:23:49 daisy kernel: [143161.721475] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:23:49 daisy kernel: [143161.721479] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:23:49 daisy kernel: [143161.721483] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:23:49 daisy kernel: [143161.721487] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:23:49 daisy kernel: [143161.721493] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:23:49 daisy kernel: [143161.721498] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:23:49 daisy kernel: [143161.721502] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0
Apr 22 09:23:49 daisy kernel: [143161.721506] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:23:49 daisy kernel: [143161.721510] [<ffffffff812016d8>] sys_sync+0x148/0x1a0
Apr 22 09:23:49 daisy kernel: [143161.721514] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:25:02 daisy ata_id[22445]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:25:49 daisy kernel: [143281.721066] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:25:49 daisy kernel: [143281.721159] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:25:49 daisy kernel: [143281.721244] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:25:49 daisy kernel: [143281.721408] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:25:49 daisy kernel: [143281.721415] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:25:49 daisy kernel: [143281.721420] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:25:49 daisy kernel: [143281.721424] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:25:49 daisy kernel: [143281.721429] Call Trace:
Apr 22 09:25:49 daisy kernel: [143281.721438] [<ffffffff810098af>] ? __switch_to+0x16f/0x470
Apr 22 09:25:49 daisy kernel: [143281.721444] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120
Apr 22 09:25:49 daisy kernel: [143281.721448] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:25:49 daisy kernel: [143281.721453] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:25:49 daisy kernel: [143281.721457] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:25:49 daisy kernel: [143281.721461] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:25:49 daisy kernel: [143281.721465] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:25:49 daisy kernel: [143281.721471] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:25:49 daisy kernel: [143281.721476] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:25:49 daisy kernel: [143281.721480] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0
Apr 22 09:25:49 daisy kernel: [143281.721484] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:25:49 daisy kernel: [143281.721487] [<ffffffff812016d8>] sys_sync+0x148/0x1a0
Apr 22 09:25:49 daisy kernel: [143281.721492] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:27:49 daisy kernel: [143401.721072] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:27:49 daisy kernel: [143401.721165] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:27:49 daisy kernel: [143401.721253] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:27:49 daisy kernel: [143401.721421] hdparm D ffff88000c778300 0 22246 20845 0 0x00000080
Apr 22 09:27:49 daisy kernel: [143401.721427] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:27:49 daisy kernel: [143401.721432] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:27:49 daisy kernel: [143401.721436] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:27:49 daisy kernel: [143401.721441] Call Trace:
Apr 22 09:27:49 daisy kernel: [143401.721450] [<ffffffff810098af>] ? __switch_to+0x16f/0x470
Apr 22 09:27:49 daisy kernel: [143401.721456] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120
Apr 22 09:27:49 daisy kernel: [143401.721460] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:27:49 daisy kernel: [143401.721465] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:27:49 daisy kernel: [143401.721469] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:27:49 daisy kernel: [143401.721473] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:27:49 daisy kernel: [143401.721477] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:27:49 daisy kernel: [143401.721483] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:27:49 daisy kernel: [143401.721487] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:27:49 daisy kernel: [143401.721492] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0
Apr 22 09:27:49 daisy kernel: [143401.721495] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:27:49 daisy kernel: [143401.721499] [<ffffffff812016d8>] sys_sync+0x148/0x1a0
Apr 22 09:27:49 daisy kernel: [143401.721503] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:29:49 daisy kernel: [143521.721059] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:29:49 daisy kernel: [143521.721158] Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:29:49 daisy kernel: [143521.721245] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:29:49 daisy kernel: [143521.721415] hdparm D ffff88000c778300 0 22246 20845 0 0x00000084
Apr 22 09:29:49 daisy kernel: [143521.721421] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:29:49 daisy kernel: [143521.721426] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:29:49 daisy kernel: [143521.721431] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:29:49 daisy kernel: [143521.721436] Call Trace:
Apr 22 09:29:49 daisy kernel: [143521.721445] [<ffffffff810098af>] ? __switch_to+0x16f/0x470
Apr 22 09:29:49 daisy kernel: [143521.721451] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120
Apr 22 09:29:49 daisy kernel: [143521.721455] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0
Apr 22 09:29:49 daisy kernel: [143521.721460] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0
Apr 22 09:29:49 daisy kernel: [143521.721465] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120
Apr 22 09:29:49 daisy kernel: [143521.721469] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20
Apr 22 09:29:49 daisy kernel: [143521.721473] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:29:49 daisy kernel: [143521.721479] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:29:49 daisy kernel: [143521.721483] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:29:49 daisy kernel: [143521.721487] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0
Apr 22 09:29:49 daisy kernel: [143521.721491] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350
Apr 22 09:29:49 daisy kernel: [143521.721495] [<ffffffff812016d8>] sys_sync+0x148/0x1a0
Apr 22 09:29:49 daisy kernel: [143521.721499] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a
Apr 22 09:30:04 daisy ata_id[22489]: HDIO_GET_IDENTITY failed for '/dev/sdb'
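Since the same trace repeats every two minutes, a short summary may be easier to read than the raw log. The following is only a sketch: summarize_log is a hypothetical helper name (not an existing tool), and it assumes syslog lines in the same format as the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): read syslog text on stdin
# and count hung-task reports per task and ata_id failures per device.
summarize_log() {
    awk '
        # Example: "INFO: task hdparm:22246 blocked for more than 120 seconds."
        /INFO: task .* blocked for more than/ {
            for (i = 1; i <= NF; i++)
                if ($i == "task") { hung[$(i + 1)]++; break }
        }
        # ata_id lines end with the quoted device path, e.g. /dev/sdb
        /HDIO_GET_IDENTITY failed for/ { ident[$NF]++ }
        END {
            for (t in hung)  printf "hung %s %d\n", t, hung[t]
            for (d in ident) printf "ident %s %d\n", d, ident[d]
        }
    ' | LC_ALL=C sort
}
```

It would be run as: summarize_log < /var/log/messages. Each output line is either "hung TASK:PID COUNT" or "ident DEVICE COUNT".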
I tried running hdparm -tT /dev/sda, but after waiting 5+ minutes for any command output I cancelled it.
I am rsyncing the data from this system to another system now. Clearly something is wrong, but I can't tell what.
The system has an older AMD Opteron 180 processor (dual core), 4 GB RAM, and a RAID controller running RAID 5 across 4x 4 TB Western Digital drives.
I rebooted the system the day before yesterday, and that's when the timeout messages started pouring into the log.
When I run tw_cli /c8 show, all four drives report OK:

[root@daisy cron.daily]# tw_cli /c8 show

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
u0    RAID-5    OK      -       -       256K    11175.8   Ri     ON

VPort  Status  Unit  Size     Type  Phy  Encl-Slot  Model
p0     OK      u0    3.63 TB  SATA  0    -          WDC WD4005FZBX-00K5
p1     OK      u0    3.63 TB  SATA  1    -          WDC WD4005FZBX-00K5
p2     OK      u0    3.63 TB  SATA  2    -          WDC WD4005FZBX-00K5
p3     OK      u0    3.63 TB  SATA  3    -          WDC WD4005FZBX-00K5
Logical volumes appear active:

[root@daisy cron.daily]# lvscan
  ACTIVE            '/dev/vg_daisy/lv_root' [10.89 TiB] inherit
  ACTIVE            '/dev/vg_daisy/lv_swap' [3.88 GiB] inherit
  ACTIVE            '/dev/vg_daisy/lv_home' [20.00 GiB] inherit
[root@daisy cron.daily]#
[root@daisy cron.daily]# lvmdiskscan
  /dev/ram0             [  16.00 MiB]
  /dev/root             [  10.89 TiB]
  /dev/ram1             [  16.00 MiB]
  /dev/sda1             [   2.82 TiB]
  /dev/vg_daisy/lv_swap [   3.88 GiB]
  /dev/ram2             [  16.00 MiB]
  /dev/vg_daisy/lv_home [  20.00 GiB]
  /dev/ram3             [  16.00 MiB]
  /dev/sda3             [ 842.87 GiB]
  /dev/ram4             [  16.00 MiB]
  /dev/ram5             [  16.00 MiB]
  /dev/ram6             [  16.00 MiB]
  /dev/ram7             [  16.00 MiB]
  /dev/ram8             [  16.00 MiB]
  /dev/ram9             [  16.00 MiB]
  /dev/ram10            [  16.00 MiB]
  /dev/ram11            [  16.00 MiB]
  /dev/ram12            [  16.00 MiB]
  /dev/ram13            [  16.00 MiB]
  /dev/ram14            [  16.00 MiB]
  /dev/ram15            [  16.00 MiB]
  /dev/sdb1             [   1.82 TiB] LVM physical volume
  /dev/sdc1             [ 500.00 MiB]
  /dev/sdc2             [   4.00 TiB] LVM physical volume
  /dev/sdd1             [   4.00 TiB] LVM physical volume
  /dev/sde1             [   2.91 TiB] LVM physical volume
  3 disks
  19 partitions
  0 LVM physical volume whole disks
  4 LVM physical volumes
[root@daisy cron.daily]#
grub.conf:

[root@daisy grub]# cat grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
#         all kernel and initrd paths are relative to /boot/, eg.
#         root (hd0,0)
#         kernel /vmlinuz-version ro root=/dev/mapper/vg_daisy-lv_root
#         initrd /initrd-[generic-]version.img
#boot=/dev/sdb
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title OpenVZ (2.6.32-042stab142.1)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-042stab142.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /initramfs-2.6.32-042stab142.1.img
title OpenVZ (2.6.32-042stab141.3)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-042stab141.3 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /initramfs-2.6.32-042stab141.3.img
title OpenVZ (2.6.32-042stab140.4)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-042stab140.4 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /initramfs-2.6.32-042stab140.4.img
title OpenVZ (2.6.32-042stab140.1)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-042stab140.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /initramfs-2.6.32-042stab140.1.img
title OpenVZ (2.6.32-042stab139.1)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-042stab139.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /initramfs-2.6.32-042stab139.1.img
title CentOS 6 (2.6.32-754.el6.x86_64)
    root (hd0,0)
    kernel /vmlinuz-2.6.32-754.el6.x86_64 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
    initrd /initramfs-2.6.32-754.el6.x86_64.img
Top is not showing anything out of the ordinary:
[root@daisy grub]#
top - 09:41:57 up 1 day, 16:04, 3 users, load average: 5.89, 5.83, 5.43
Tasks: 369 total, 1 running, 368 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 1.2%sy, 0.0%ni, 25.0%id, 73.5%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 3894628k total, 3861280k used, 33348k free, 95608k buffers
Swap: 4063228k total, 34888k used, 4028340k free, 3139272k cached

  PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 1266 root     20   0     0    0    0 D  1.0  0.0  12:27.75 flush-253:0
21041 1153     20   0  3188 1840 1012 D  0.7  0.0   0:00.72 imap
21599 97       20   0  5160 1940 1568 S  0.7  0.0   0:01.06 imap-login
22636 root     20   0 15272 1524  964 R  0.7  0.0   0:00.06 top
 1977 root     20   0  2096  644  360 S  0.3  0.0   0:27.92 dovecot
22528 97       20   0  5160 2044 1672 S  0.3  0.1   0:00.35 imap-login
22578 1155     20   0  2904 1528  940 D  0.3  0.0   0:00.22 imap
    1 root     20   0 19236  268  136 S  0.0  0.0   0:00.68 init
    2 root     20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
    3 root     RT   0     0    0    0 S  0.0  0.0   0:00.04 migration/0
    4 root     20   0     0    0    0 S  0.0  0.0   0:01.88 ksoftirqd/0
    5 root     RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/0
    6 root     RT   0     0    0    0 S  0.0  0.0   0:00.19 watchdog/0
    7 root     RT   0     0    0    0 S  0.0  0.0   0:00.07 migration/1
    8 root     RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/1
    9 root     20   0     0    0    0 S  0.0  0.0   0:03.17 ksoftirqd/1
   10 root     RT   0     0    0    0 S  0.0  0.0   0:00.20 watchdog/1
   11 root     20   0     0    0    0 S  0.0  0.0   0:07.23 events/0
   12 root     20   0     0    0    0 S  0.0  0.0   0:08.55 events/1
   13 root     20   0     0    0    0 S  0.0  0.0   0:00.00 events/0
   14 root     20   0     0    0    0 S  0.0  0.0   0:00.00 events/1
   15 root     20   0     0    0    0 S  0.0  0.0   0:00.00 events_long/0
   16 root     20   0     0    0    0 S  0.0  0.0   0:00.00 events_long/1
   17 root     20   0     0    0    0 S  0.0  0.0   0:00.00 events_power_ef
   18 root     20   0     0    0    0 S  0.0  0.0   0:00.00 events_power_ef
   19 root     20   0     0    0    0 S  0.0  0.0   0:00.00 cgroup
   20 root     20   0     0    0    0 S  0.0  0.0   0:00.00 khelper
   21 root     20   0     0    0    0 S  0.0  0.0   0:00.01 netns
   22 root     20   0     0    0    0 S  0.0  0.0   0:00.00 async/mgr
   23 root     20   0     0    0    0 S  0.0  0.0   0:00.00 pm
   24 root     20   0     0    0    0 S  0.0  0.0   0:00.29 sync_supers
This is a company production mail server, and I can't find the solution. I need help as soon as someone is able. Thank you!
_______________________________________________
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos
Hi,
You didn't tell us much about your hardware. It seems you're running a 3ware controller in it.
What exactly is /dev/sdb? I don't know 3ware so I can only guess but it looks to me like at least one of your drives is having problems.
Any chance you can check SMART status on the disks or have it do a SMART test? I guess your 3ware controller has to initiate this.
And, what is the status of the BBU?
Looks like "tw_cli /cx/bbu show status" should tell.
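A minimal sketch of those two checks, under assumptions not confirmed in this thread: the controller is /c8 as in the tw_cli output, smartmontools is installed, and the 3ware character device is /dev/twa0 (depending on the card and driver it could be /dev/twl0 or /dev/twe0 instead). print_checks is a hypothetical helper name; it only prints the commands so they can be reviewed before running anything against a struggling array.

```shell
#!/bin/sh
# Sketch only. CONTROLLER and TW_DEVICE are assumptions; adjust both to the
# actual hardware before using the printed commands.
CONTROLLER=${CONTROLLER:-/c8}
TW_DEVICE=${TW_DEVICE:-/dev/twa0}

# Print (do not run) the BBU status command and one smartctl command per
# port, for ports 0 .. count-1.
print_checks() {
    count=$1
    echo "tw_cli $CONTROLLER/bbu show status"
    port=0
    while [ "$port" -lt "$count" ]; do
        # -d 3ware,N addresses the disk on port N behind the controller
        echo "smartctl -a -d 3ware,$port $TW_DEVICE"
        port=$((port + 1))
    done
}

# Four ports, matching p0..p3 in the tw_cli output.
print_checks 4
```

Once the printed lines look right, they can be run one at a time by hand, or piped to sh.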
Regards, Simon
finish_task_switch+0xce/0x120 Apr 22 09:27:49 daisy kernel: [143401.721460] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0 Apr 22 09:27:49 daisy kernel: [143401.721465] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0 Apr 22 09:27:49 daisy kernel: [143401.721469] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120 Apr 22 09:27:49 daisy kernel: [143401.721473] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20 Apr 22 09:27:49 daisy kernel: [143401.721477] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20 Apr 22 09:27:49 daisy kernel: [143401.721483] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0 Apr 22 09:27:49 daisy kernel: [143401.721487] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50 Apr 22 09:27:49 daisy kernel: [143401.721492] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0 Apr 22 09:27:49 daisy kernel: [143401.721495] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350 Apr 22 09:27:49 daisy kernel: [143401.721499] [<ffffffff812016d8>] sys_sync+0x148/0x1a0 Apr 22 09:27:49 daisy kernel: [143401.721503] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a Apr 22 09:29:49 daisy kernel: [143521.721059] INFO: task hdparm:22246 blocked for more than 120 seconds. Apr 22 09:29:49 daisy kernel: [143521.721158] Not tainted 2.6.32-042stab142.1 #1 Apr 22 09:29:49 daisy kernel: [143521.721245] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 22 09:29:49 daisy kernel: [143521.721415] hdparm D ffff88000c778300 0 22246 20845 0 0x00000084 Apr 22 09:29:49 daisy kernel: [143521.721421] ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af Apr 22 09:29:49 daisy kernel: [143521.721426] ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200 Apr 22 09:29:49 daisy kernel: [143521.721431] ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000 Apr 22 09:29:49 daisy kernel: [143521.721436] Call Trace: Apr 22 09:29:49 daisy kernel: [143521.721445] [<ffffffff810098af>] ? 
__switch_to+0x16f/0x470 Apr 22 09:29:49 daisy kernel: [143521.721451] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120 Apr 22 09:29:49 daisy kernel: [143521.721455] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0 Apr 22 09:29:49 daisy kernel: [143521.721460] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0 Apr 22 09:29:49 daisy kernel: [143521.721465] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120 Apr 22 09:29:49 daisy kernel: [143521.721469] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20 Apr 22 09:29:49 daisy kernel: [143521.721473] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20 Apr 22 09:29:49 daisy kernel: [143521.721479] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0 Apr 22 09:29:49 daisy kernel: [143521.721483] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50 Apr 22 09:29:49 daisy kernel: [143521.721487] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0 Apr 22 09:29:49 daisy kernel: [143521.721491] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350 Apr 22 09:29:49 daisy kernel: [143521.721495] [<ffffffff812016d8>] sys_sync+0x148/0x1a0 Apr 22 09:29:49 daisy kernel: [143521.721499] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a Apr 22 09:30:04 daisy ata_id[22489]: HDIO_GET_IDENTITY failed for '/dev/sdb'
I tried running hdparm -tT /dev/sda, but after waiting more than five minutes for any output I cancelled it.
I am rsyncing the data from this system over to another system now. Clearly something is wrong, but I can't tell what.
The system is an older AMD Opteron 180 (dual core) with 4 GB RAM and a RAID controller running RAID 5 across 4x 4 TB Western Digital drives.
I rebooted the system the day before yesterday, and that's when the timeout messages started pouring into the log.
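To see how often each message occurs without reading the log line by line, the entries can be counted. This is only a rough sketch, assuming a syslog-format file such as /var/log/messages; the function name summarize_log is made up for illustration.

```shell
#!/bin/sh
# summarize_log FILE
# Hypothetical helper: counts ata_id identify failures per device and
# hung-task reports per task name in a syslog-style file.
summarize_log() {
    log_file="$1"

    echo "HDIO_GET_IDENTITY failures per device:"
    grep -o "HDIO_GET_IDENTITY failed for '[^']*'" "$log_file" \
        | sed "s/.*for '\(.*\)'/\1/" | sort | uniq -c

    echo "Hung-task reports per task:"
    grep -o 'INFO: task [^ ]* blocked' "$log_file" \
        | awk '{ print $3 }' | sort | uniq -c
}
```

It would be called as `summarize_log /var/log/messages`. Because it uses `grep -o`, it also copes with several log entries on one line.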
When I run tw_cli /c8 show, all four drives say they are OK:

[root@daisy cron.daily]# tw_cli /c8 show

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
u0    RAID-5    OK      -       -       256K    11175.8   Ri     ON

VPort  Status  Unit  Size     Type  Phy  Encl-Slot  Model
p0     OK      u0    3.63 TB  SATA  0    -          WDC WD4005FZBX-00K5
p1     OK      u0    3.63 TB  SATA  1    -          WDC WD4005FZBX-00K5
p2     OK      u0    3.63 TB  SATA  2    -          WDC WD4005FZBX-00K5
p3     OK      u0    3.63 TB  SATA  3    -          WDC WD4005FZBX-00K5
Logical volumes appear active:

[root@daisy cron.daily]# lvscan
  ACTIVE '/dev/vg_daisy/lv_root' [10.89 TiB] inherit
  ACTIVE '/dev/vg_daisy/lv_swap' [3.88 GiB] inherit
  ACTIVE '/dev/vg_daisy/lv_home' [20.00 GiB] inherit
[root@daisy cron.daily]#

[root@daisy cron.daily]# lvmdiskscan
  /dev/ram0             [ 16.00 MiB]
  /dev/root             [ 10.89 TiB]
  /dev/ram1             [ 16.00 MiB]
  /dev/sda1             [ 2.82 TiB]
  /dev/vg_daisy/lv_swap [ 3.88 GiB]
  /dev/ram2             [ 16.00 MiB]
  /dev/vg_daisy/lv_home [ 20.00 GiB]
  /dev/ram3             [ 16.00 MiB]
  /dev/sda3             [ 842.87 GiB]
  /dev/ram4             [ 16.00 MiB]
  /dev/ram5             [ 16.00 MiB]
  /dev/ram6             [ 16.00 MiB]
  /dev/ram7             [ 16.00 MiB]
  /dev/ram8             [ 16.00 MiB]
  /dev/ram9             [ 16.00 MiB]
  /dev/ram10            [ 16.00 MiB]
  /dev/ram11            [ 16.00 MiB]
  /dev/ram12            [ 16.00 MiB]
  /dev/ram13            [ 16.00 MiB]
  /dev/ram14            [ 16.00 MiB]
  /dev/ram15            [ 16.00 MiB]
  /dev/sdb1             [ 1.82 TiB] LVM physical volume
  /dev/sdc1             [ 500.00 MiB]
  /dev/sdc2             [ 4.00 TiB] LVM physical volume
  /dev/sdd1             [ 4.00 TiB] LVM physical volume
  /dev/sde1             [ 2.91 TiB] LVM physical volume
  3 disks
  19 partitions
  0 LVM physical volume whole disks
  4 LVM physical volumes
[root@daisy cron.daily]#
grub.conf:

[root@daisy grub]# cat grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
#         all kernel and initrd paths are relative to /boot/, eg.
#         root (hd0,0)
#         kernel /vmlinuz-version ro root=/dev/mapper/vg_daisy-lv_root
#         initrd /initrd-[generic-]version.img
#boot=/dev/sdb
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title OpenVZ (2.6.32-042stab142.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab142.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab142.1.img
title OpenVZ (2.6.32-042stab141.3)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab141.3 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab141.3.img
title OpenVZ (2.6.32-042stab140.4)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab140.4 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab140.4.img
title OpenVZ (2.6.32-042stab140.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab140.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab140.1.img
title OpenVZ (2.6.32-042stab139.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab139.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab139.1.img
title CentOS 6 (2.6.32-754.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-754.el6.x86_64 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-754.el6.x86_64.img
Top is not showing anything out of the ordinary:
[root@daisy grub]#
top - 09:41:57 up 1 day, 16:04, 3 users, load average: 5.89, 5.83, 5.43
Tasks: 369 total, 1 running, 368 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 1.2%sy, 0.0%ni, 25.0%id, 73.5%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 3894628k total, 3861280k used, 33348k free, 95608k buffers
Swap: 4063228k total, 34888k used, 4028340k free, 3139272k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1266 root  20   0     0    0    0 D  1.0  0.0 12:27.75 flush-253:0
21041 1153  20   0  3188 1840 1012 D  0.7  0.0  0:00.72 imap
21599 97    20   0  5160 1940 1568 S  0.7  0.0  0:01.06 imap-login
22636 root  20   0 15272 1524  964 R  0.7  0.0  0:00.06 top
 1977 root  20   0  2096  644  360 S  0.3  0.0  0:27.92 dovecot
22528 97    20   0  5160 2044 1672 S  0.3  0.1  0:00.35 imap-login
22578 1155  20   0  2904 1528  940 D  0.3  0.0  0:00.22 imap
    1 root  20   0 19236  268  136 S  0.0  0.0  0:00.68 init
    2 root  20   0     0    0    0 S  0.0  0.0  0:00.00 kthreadd
    3 root  RT   0     0    0    0 S  0.0  0.0  0:00.04 migration/0
    4 root  20   0     0    0    0 S  0.0  0.0  0:01.88 ksoftirqd/0
    5 root  RT   0     0    0    0 S  0.0  0.0  0:00.00 stopper/0
    6 root  RT   0     0    0    0 S  0.0  0.0  0:00.19 watchdog/0
    7 root  RT   0     0    0    0 S  0.0  0.0  0:00.07 migration/1
    8 root  RT   0     0    0    0 S  0.0  0.0  0:00.00 stopper/1
    9 root  20   0     0    0    0 S  0.0  0.0  0:03.17 ksoftirqd/1
   10 root  RT   0     0    0    0 S  0.0  0.0  0:00.20 watchdog/1
   11 root  20   0     0    0    0 S  0.0  0.0  0:07.23 events/0
   12 root  20   0     0    0    0 S  0.0  0.0  0:08.55 events/1
   13 root  20   0     0    0    0 S  0.0  0.0  0:00.00 events/0
   14 root  20   0     0    0    0 S  0.0  0.0  0:00.00 events/1
   15 root  20   0     0    0    0 S  0.0  0.0  0:00.00 events_long/0
   16 root  20   0     0    0    0 S  0.0  0.0  0:00.00 events_long/1
   17 root  20   0     0    0    0 S  0.0  0.0  0:00.00 events_power_ef
   18 root  20   0     0    0    0 S  0.0  0.0  0:00.00 events_power_ef
   19 root  20   0     0    0    0 S  0.0  0.0  0:00.00 cgroup
   20 root  20   0     0    0    0 S  0.0  0.0  0:00.00 khelper
   21 root  20   0     0    0    0 S  0.0  0.0  0:00.01 netns
   22 root  20   0     0    0    0 S  0.0  0.0  0:00.00 async/mgr
   23 root  20   0     0    0    0 S  0.0  0.0  0:00.00 pm
   24 root  20   0     0    0    0 S  0.0  0.0  0:00.29 sync_supers
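The 73.5%wa figure and the D (uninterruptible sleep) state of flush-253:0 and the imap processes in this listing suggest tasks waiting on disk I/O rather than on CPU. A small sketch for listing only such tasks, assuming the procps `ps` that ships with CentOS; the function name d_state_only is made up for illustration.

```shell
#!/bin/sh
# d_state_only: read "STAT PID COMMAND" lines on stdin and keep only tasks
# whose state starts with D (uninterruptible sleep, usually waiting on I/O).
d_state_only() {
    awk '$1 ~ /^D/ { print }'
}

# Show the tasks currently in D state, if ps is available.
if command -v ps >/dev/null 2>&1; then
    ps -eo stat,pid,comm | d_state_only
fi
```

Running it a few times in a row shows whether the same tasks stay blocked.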
This is a company production mail server and I can't find the solution. I need help as soon as someone is able. Thank you!
_______________________________________________
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos
Correct, 3ware 9650SE SATA-II RAID PCIe
[root@daisy dev]# lspci
00:00.0 Memory controller: NVIDIA Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: NVIDIA Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: NVIDIA Corporation CK804 SMBus (rev a2)
00:02.0 USB controller: NVIDIA Corporation CK804 USB Controller (rev a2)
00:02.1 USB controller: NVIDIA Corporation CK804 USB Controller (rev a3)
00:06.0 IDE interface: NVIDIA Corporation CK804 IDE (rev f2)
00:07.0 IDE interface: NVIDIA Corporation CK804 Serial ATA Controller (rev f3)
00:08.0 IDE interface: NVIDIA Corporation CK804 Serial ATA Controller (rev f3)
00:09.0 PCI bridge: NVIDIA Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Bridge: NVIDIA Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: NVIDIA Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: NVIDIA Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: NVIDIA Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: NVIDIA Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:05.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Rage 3 [Rage XL PCI] (rev 27)
04:00.0 Ethernet controller: Broadcom Limited NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 11)
05:00.0 RAID bus controller: 3ware Inc 9650SE SATA-II RAID PCIe (rev 01)
There is no BBU physically present:
//daisy/c8> /c8/bbu show
Error: (CLI:059) Battery Backup Unit is not present.

//daisy/c8> show

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK      -       -       256K    11175.8   Ri     ON

VPort  Status  Unit  Size     Type  Phy  Encl-Slot  Model
------------------------------------------------------------------------------
p0     OK      u0    3.63 TB  SATA  0    -          WDC WD4005FZBX-00K5
p1     OK      u0    3.63 TB  SATA  1    -          WDC WD4005FZBX-00K5
p2     OK      u0    3.63 TB  SATA  2    -          WDC WD4005FZBX-00K5
p3     OK      u0    3.63 TB  SATA  3    -          WDC WD4005FZBX-00K5

//daisy/c8>
I had an 8 TB external USB disk plugged into the system that I had been using as additional space for backups. I was under the impression that sda, sdb, sdc, and sdd were the four disks on the RAID controller card, but after unplugging the USB drive, running hdparm gives me this:
[root@daisy dev]# hdparm -tT /dev/sdb
/dev/sdb:
read() hit EOF - device too small
BLKGETSIZE failed: Inappropriate ioctl for device
BLKFLSBUF failed: Inappropriate ioctl for device
[root@daisy dev]#
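Since units exported by the RAID card and USB disks all appear as sdX, it may help to check which bus each name actually sits on. A rough sketch, assuming the usual sysfs layout under /sys/block; the function names bus_of_path and list_sd_devices are made up for illustration.

```shell
#!/bin/sh
# bus_of_path PATH
# Classify a resolved sysfs device path: "usb" if it passes through a USB
# bus, otherwise "other" (for example a PCI RAID or SATA controller).
bus_of_path() {
    case "$1" in
        */usb[0-9]*/*) echo usb ;;
        *)             echo other ;;
    esac
}

# list_sd_devices [SYSFS_BLOCK_DIR]
# Print name, bus, vendor and model for every sd* block device.
list_sd_devices() {
    block_dir="${1:-/sys/block}"
    for dev in "$block_dir"/sd*; do
        [ -e "$dev" ] || continue          # no sd* entries at all
        name=$(basename "$dev")
        path=$(readlink -f "$dev")         # follow the sysfs symlink
        vendor=$(cat "$dev/device/vendor" 2>/dev/null)
        model=$(cat "$dev/device/model" 2>/dev/null)
        printf '%s\t%s\t%s %s\n' "$name" "$(bus_of_path "$path")" "$vendor" "$model"
    done
}

list_sd_devices
```

The vendor and model strings should make it clear which sdX names belong to the controller's volumes and which to anything else.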
------------------------------------------------------
The 3ware controller shows all four disks still online with a status of OK:

[root@daisy dev]# tw_cli /c8 show

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK      -       -       256K    11175.8   Ri     ON

VPort  Status  Unit  Size     Type  Phy  Encl-Slot  Model
------------------------------------------------------------------------------
p0     OK      u0    3.63 TB  SATA  0    -          WDC WD4005FZBX-00K5
p1     OK      u0    3.63 TB  SATA  1    -          WDC WD4005FZBX-00K5
p2     OK      u0    3.63 TB  SATA  2    -          WDC WD4005FZBX-00K5
p3     OK      u0    3.63 TB  SATA  3    -          WDC WD4005FZBX-00K5

[root@daisy dev]#
-----------------------------------
//daisy> /c8/u0
Error: (CLI:039) Invalid unit command syntax.

Unit Info:

//daisy> /c8/u0 show

Unit   UnitType  Status  %RCmpl  %V/I/M  Port  Stripe  Size(GB)
------------------------------------------------------------------------
u0     RAID-5    OK      -       -       -     256K    11175.8
u0-0   DISK      OK      -       -       p0    -       1677.28
u0-1   DISK      OK      -       -       p1    -       1677.28
u0-2   DISK      OK      -       -       p2    -       1677.28
u0-3   DISK      OK      -       -       p3    -       1677.28
u0/v0  Volume    -       -       -       -     -       4096
u0/v1  Volume    -       -       -       -     -       4096
u0/v2  Volume    -       -       -       -     -       2983.84
I'm still trying to find the exact smartctl command syntax to make it work.
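For drives behind a 3ware controller, smartmontools addresses each port with `-d 3ware,N` on the controller's character device rather than on /dev/sdX. A sketch only: it assumes the 9650SE is the first controller and shows up as /dev/twa0 (check which /dev/twa* nodes actually exist), and that ports 0-3 correspond to p0-p3 in the tw_cli output. CTRL_DEV, RUN and check_port are names made up here; by default the script only prints the commands.

```shell
#!/bin/sh
# Assumption: the first 9000-series 3ware controller is /dev/twa0.
CTRL_DEV="${CTRL_DEV:-/dev/twa0}"

# check_port PORT
# Query SMART data for one controller port. With RUN=1 the command is
# executed; otherwise it is only printed so it can be reviewed first.
check_port() {
    if [ "${RUN:-0}" = "1" ]; then
        smartctl -a -d "3ware,$1" "$CTRL_DEV"
    else
        echo "smartctl -a -d 3ware,$1 $CTRL_DEV"
    fi
}

# Ports 0-3 are assumed to match p0-p3 reported by tw_cli.
for port in 0 1 2 3; do
    check_port "$port"
done
```

Setting RUN=1 in the environment (as root) would run the four queries instead of printing them.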
Does this give you what you're looking for?

On 4/22/2020 10:08 AM, Simon Matter via CentOS wrote:
Hi,
You didn't tell us much about your hardware. It seems you're running a 3ware controller in it.
What exactly is /dev/sdb? I don't know 3ware so I can only guess but it looks to me like at least one of your drives is having problems.
Any chance you can check SMART status on the disks or have it do a SMART test? I guess your 3ware controller has to initiate this.
And, what is the status of the BBU?
Looks like "tw_cli /cx/bbu show status" should tell.
Regards, Simon
__switch_to+0x16f/0x470 Apr 22 09:29:49 daisy kernel: [143521.721451] [<ffffffff8107bbfe>] ? finish_task_switch+0xce/0x120 Apr 22 09:29:49 daisy kernel: [143521.721455] [<ffffffff8107c851>] ? update_curr+0xe1/0x1f0 Apr 22 09:29:49 daisy kernel: [143521.721460] [<ffffffff81566c55>] schedule_timeout+0x215/0x2f0 Apr 22 09:29:49 daisy kernel: [143521.721465] [<ffffffff815669b4>] wait_for_completion+0xe4/0x120 Apr 22 09:29:49 daisy kernel: [143521.721469] [<ffffffff81071ce0>] ? default_wake_function+0x0/0x20 Apr 22 09:29:49 daisy kernel: [143521.721473] [<ffffffff815694db>] ? _spin_unlock_bh+0x1b/0x20 Apr 22 09:29:49 daisy kernel: [143521.721479] [<ffffffff811f9773>] writeback_inodes_sb_nr_ub+0x83/0xb0 Apr 22 09:29:49 daisy kernel: [143521.721483] [<ffffffff811f9806>] writeback_inodes_sb_ub+0x46/0x50 Apr 22 09:29:49 daisy kernel: [143521.721487] [<ffffffff81200f38>] __sync_filesystem+0x48/0xa0 Apr 22 09:29:49 daisy kernel: [143521.721491] [<ffffffff8120151d>] sync_filesystems+0x30d/0x350 Apr 22 09:29:49 daisy kernel: [143521.721495] [<ffffffff812016d8>] sys_sync+0x148/0x1a0 Apr 22 09:29:49 daisy kernel: [143521.721499] [<ffffffff81571424>] system_call_fastpath+0x22/0x3a Apr 22 09:30:04 daisy ata_id[22489]: HDIO_GET_IDENTITY failed for '/dev/sdb'
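The repeated hung-task reports above can be tallied per task rather than read one by one. A minimal sketch, assuming bash, grep and awk are available; the function name count_hung_tasks is made up for illustration, and the two sample lines are copied from the log above:

```shell
#!/bin/bash
# Count "blocked for more than N seconds" reports per task:PID.
# Reads syslog-style text on stdin; grep -o also copes with several
# log entries that ended up on one physical line.
count_hung_tasks() {
    grep -o 'INFO: task [^ ]* blocked for more than [0-9]* seconds' \
        | awk '{ count[$3]++ } END { for (t in count) print count[t], t }' \
        | sort -rn
}

# Demo with two entries from the log above. On the host itself this
# would be: count_hung_tasks < /var/log/messages
count_hung_tasks <<'EOF'
Apr 22 09:25:49 daisy kernel: [143281.721066] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:27:49 daisy kernel: [143401.721072] INFO: task hdparm:22246 blocked for more than 120 seconds.
EOF
# prints: 2 hdparm:22246
```

A single task:PID dominating the count, as hdparm:22246 does here, suggests one stuck process being re-reported every two minutes rather than many separate hangs.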
I tried running hdparm -tT /dev/sda, but after waiting 5+ minutes for any command output I cancelled it.
I am rsyncing the data from this system over to another system now. Clearly something is wrong, but I can't tell what.
The system is an older AMD Opteron 180 (dual core) with 4 GB RAM and a RAID controller running RAID 5 across 4x 4 TB Western Digital drives.
I rebooted the system the day before yesterday, and that's when the timeout messages started pouring into the log.
When I run tw_cli /c8 show, all four drives say they are OK:

[root@daisy cron.daily]# tw_cli /c8 show

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
u0    RAID-5    OK      -       -       256K    11175.8   Ri     ON

VPort  Status  Unit  Size     Type  Phy  Encl-Slot  Model
p0     OK      u0    3.63 TB  SATA  0    -          WDC WD4005FZBX-00K5
p1     OK      u0    3.63 TB  SATA  1    -          WDC WD4005FZBX-00K5
p2     OK      u0    3.63 TB  SATA  2    -          WDC WD4005FZBX-00K5
p3     OK      u0    3.63 TB  SATA  3    -          WDC WD4005FZBX-00K5
Logical Volumes appear active:

[root@daisy cron.daily]# lvscan
  ACTIVE            '/dev/vg_daisy/lv_root' [10.89 TiB] inherit
  ACTIVE            '/dev/vg_daisy/lv_swap' [3.88 GiB] inherit
  ACTIVE            '/dev/vg_daisy/lv_home' [20.00 GiB] inherit
[root@daisy cron.daily]#
[root@daisy cron.daily]# lvmdiskscan
  /dev/ram0             [      16.00 MiB]
  /dev/root             [      10.89 TiB]
  /dev/ram1             [      16.00 MiB]
  /dev/sda1             [       2.82 TiB]
  /dev/vg_daisy/lv_swap [       3.88 GiB]
  /dev/ram2             [      16.00 MiB]
  /dev/vg_daisy/lv_home [      20.00 GiB]
  /dev/ram3             [      16.00 MiB]
  /dev/sda3             [     842.87 GiB]
  /dev/ram4             [      16.00 MiB]
  /dev/ram5             [      16.00 MiB]
  /dev/ram6             [      16.00 MiB]
  /dev/ram7             [      16.00 MiB]
  /dev/ram8             [      16.00 MiB]
  /dev/ram9             [      16.00 MiB]
  /dev/ram10            [      16.00 MiB]
  /dev/ram11            [      16.00 MiB]
  /dev/ram12            [      16.00 MiB]
  /dev/ram13            [      16.00 MiB]
  /dev/ram14            [      16.00 MiB]
  /dev/ram15            [      16.00 MiB]
  /dev/sdb1             [       1.82 TiB] LVM physical volume
  /dev/sdc1             [     500.00 MiB]
  /dev/sdc2             [       4.00 TiB] LVM physical volume
  /dev/sdd1             [       4.00 TiB] LVM physical volume
  /dev/sde1             [       2.91 TiB] LVM physical volume
  3 disks
  19 partitions
  0 LVM physical volume whole disks
  4 LVM physical volumes
[root@daisy cron.daily]#
grub.conf:

[root@daisy grub]# cat grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/mapper/vg_daisy-lv_root
# initrd /initrd-[generic-]version.img
#boot=/dev/sdb
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title OpenVZ (2.6.32-042stab142.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab142.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab142.1.img
title OpenVZ (2.6.32-042stab141.3)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab141.3 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab141.3.img
title OpenVZ (2.6.32-042stab140.4)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab140.4 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab140.4.img
title OpenVZ (2.6.32-042stab140.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab140.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab140.1.img
title OpenVZ (2.6.32-042stab139.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab139.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab139.1.img
title CentOS 6 (2.6.32-754.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-754.el6.x86_64 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-754.el6.x86_64.img
Top is not showing anything out of the ordinary:
[root@daisy grub]#
top - 09:41:57 up 1 day, 16:04, 3 users, load average: 5.89, 5.83, 5.43
Tasks: 369 total, 1 running, 368 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 1.2%sy, 0.0%ni, 25.0%id, 73.5%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 3894628k total, 3861280k used, 33348k free, 95608k buffers
Swap: 4063228k total, 34888k used, 4028340k free, 3139272k cached

  PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 1266 root  20  0     0    0    0 D  1.0  0.0 12:27.75 flush-253:0
21041 1153  20  0  3188 1840 1012 D  0.7  0.0  0:00.72 imap
21599 97    20  0  5160 1940 1568 S  0.7  0.0  0:01.06 imap-login
22636 root  20  0 15272 1524  964 R  0.7  0.0  0:00.06 top
 1977 root  20  0  2096  644  360 S  0.3  0.0  0:27.92 dovecot
22528 97    20  0  5160 2044 1672 S  0.3  0.1  0:00.35 imap-login
22578 1155  20  0  2904 1528  940 D  0.3  0.0  0:00.22 imap
    1 root  20  0 19236  268  136 S  0.0  0.0  0:00.68 init
    2 root  20  0     0    0    0 S  0.0  0.0  0:00.00 kthreadd
    3 root  RT  0     0    0    0 S  0.0  0.0  0:00.04 migration/0
    4 root  20  0     0    0    0 S  0.0  0.0  0:01.88 ksoftirqd/0
    5 root  RT  0     0    0    0 S  0.0  0.0  0:00.00 stopper/0
    6 root  RT  0     0    0    0 S  0.0  0.0  0:00.19 watchdog/0
    7 root  RT  0     0    0    0 S  0.0  0.0  0:00.07 migration/1
    8 root  RT  0     0    0    0 S  0.0  0.0  0:00.00 stopper/1
    9 root  20  0     0    0    0 S  0.0  0.0  0:03.17 ksoftirqd/1
   10 root  RT  0     0    0    0 S  0.0  0.0  0:00.20 watchdog/1
   11 root  20  0     0    0    0 S  0.0  0.0  0:07.23 events/0
   12 root  20  0     0    0    0 S  0.0  0.0  0:08.55 events/1
   13 root  20  0     0    0    0 S  0.0  0.0  0:00.00 events/0
   14 root  20  0     0    0    0 S  0.0  0.0  0:00.00 events/1
   15 root  20  0     0    0    0 S  0.0  0.0  0:00.00 events_long/0
   16 root  20  0     0    0    0 S  0.0  0.0  0:00.00 events_long/1
   17 root  20  0     0    0    0 S  0.0  0.0  0:00.00 events_power_ef
   18 root  20  0     0    0    0 S  0.0  0.0  0:00.00 events_power_ef
   19 root  20  0     0    0    0 S  0.0  0.0  0:00.00 cgroup
   20 root  20  0     0    0    0 S  0.0  0.0  0:00.00 khelper
   21 root  20  0     0    0    0 S  0.0  0.0  0:00.01 netns
   22 root  20  0     0    0    0 S  0.0  0.0  0:00.00 async/mgr
   23 root  20  0     0    0    0 S  0.0  0.0  0:00.00 pm
   24 root  20  0     0    0    0 S  0.0  0.0  0:00.29 sync_supers
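The %wa figure in the Cpu(s) line is the share of CPU time spent waiting on I/O, and it can be tracked without an interactive top session. A rough sketch, assuming bash and awk; the function name iowait_percent is hypothetical, and the two sample "cpu" lines are synthetic counters chosen to reproduce the 73.5%wa shown above, not values taken from daisy:

```shell
#!/bin/bash
# Compute the iowait share between two samples of the aggregate "cpu"
# line from /proc/stat. Fields 2-9 are user, nice, system, idle,
# iowait, irq, softirq and steal; guest time is left out because it
# is already included in user.
iowait_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0
          for (i = 2; i <= 9 && i <= NF; i++) total += $i
          t[NR] = total; w[NR] = $6 }
        END { d = t[2] - t[1]
              if (d > 0) printf "%.1f\n", 100 * (w[2] - w[1]) / d
              else print "0.0" }'
}

# Synthetic samples (assumption, for illustration only).
iowait_percent 'cpu 100 0 100 1000 800 0 0 0' 'cpu 102 0 112 1250 1535 0 1 0'
# prints: 73.5

# On a live system the two samples would come from, for example:
#   a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
#   iowait_percent "$a" "$b"
```

Working from counter deltas rather than a single reading matters here because /proc/stat values are cumulative since boot.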
This is a company production mail server and I can't find the solution. I need help as soon as someone is able. Thank you!
_______________________________________________
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos
On 4/22/20 8:53 AM, Christopher Wensink wrote:
I had an 8 TB external USB disk plugged into the system that I had been using for additional backup space. I was under the impression that sda, sdb, sdc, and sdd were the four disks on the RAID controller card,
Not exactly. If you have a RAID5 array, then you have one volume spread across the physical disks. You can then divide that into smaller virtual disks, each of which is still spread across disks. The OS doesn't see the four component disks directly.
but after unplugging the usb drive when running hdparm I am getting this:
[root@daisy dev]# hdparm -tT /dev/sdb
/dev/sdb: read() hit EOF - device too small
sdb was the disk that appeared in the kernel errors, so I'd imagine that you've fixed the problem by removing the USB drive enclosure.
I'm still trying to find the exact smartctl command syntax to make it work.
https://www.cyberciti.biz/faq/unix-linux-freebsd-3w-9xxx-smartctl-check-hard...
Smartctl tests have passed:

[root@daisy dev]# smartctl -H -d 3ware,0 /dev/twa0
smartctl 5.43 2016-09-28 r4347 [x86_64-linux-2.6.32-042stab142.1] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
-------------------
The problem may have been the USB disk the whole time, and not one of the internal disks. It's surprising that the performance of the entire system would suffer that much from a faulting external USB drive.
To monitor the system and confirm that this was the root cause, what performance tests other than hdparm should I run to check the system as a whole? What are everyone's top-rated performance monitoring commands / apps that can be dumped into cron jobs, logwatch, etc.?
Chris
On 4/22/2020 11:00 AM, Gordon Messmer wrote:
On 4/22/20 8:53 AM, Christopher Wensink wrote:
I had an 8 TB external USB disk plugged into the system that I had been using for additional backup space. I was under the impression that sda, sdb, sdc, and sdd were the four disks on the RAID controller card,
Not exactly. If you have a RAID5 array, then you have one volume spread across the physical disks. You can then divide that into smaller virtual disks, each of which is still spread across disks. The OS doesn't see the four component disks directly.
but after unplugging the usb drive when running hdparm I am getting this:
[root@daisy dev]# hdparm -tT /dev/sdb
/dev/sdb: read() hit EOF - device too small
sdb was the disk that appeared in the kernel errors, so I'd imagine that you've fixed the problem by removing the USB drive enclosure.
I'm still trying to find the exact smartctl command syntax to make it work.
https://www.cyberciti.biz/faq/unix-linux-freebsd-3w-9xxx-smartctl-check-hard...
On Wed, 22 Apr 2020 10:53:08 -0500 Christopher Wensink cwensink@five-star-plastics.com wrote:
Unit   UnitType  Status  %RCmpl  %V/I/M  Port  Stripe  Size(GB)
u0     RAID-5    OK      -       -       -     256K    11175.8
u0-0   DISK      OK      -       -       p0    -       1677.28
u0-1   DISK      OK      -       -       p1    -       1677.28
u0-2   DISK      OK      -       -       p2    -       1677.28
u0-3   DISK      OK      -       -       p3    -       1677.28
u0/v0  Volume    -       -       -       -     -       4096
u0/v1  Volume    -       -       -       -     -       4096
u0/v2  Volume    -       -       -       -     -       2983.84
This reads: you have 4 physical drives configured as one RAID5 array. The array has three volumes (v0, v1, v2). These three volumes will be visible as three SCSI devices in Linux.
You can use for example lsblk or lsscsi to list them.
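As a quick cross-check of that reading, the three volume sizes should add up to the size reported for unit u0. A small sketch, assuming bash and awk; the function name sum_volumes is invented for this example, and the input rows are the Volume lines from the tw_cli output quoted above:

```shell
#!/bin/bash
# Sum the Size(GB) column (last field) of every "Volume" row in
# tw_cli unit output read from stdin.
sum_volumes() {
    awk '$2 == "Volume" { sum += $NF } END { printf "%.2f\n", sum }'
}

sum_volumes <<'EOF'
u0/v0 Volume - - - - - 4096
u0/v1 Volume - - - - - 4096
u0/v2 Volume - - - - - 2983.84
EOF
# prints: 11175.84
```

The result agrees with the 11175.8 GB shown for u0, which is consistent with the three volumes together covering the whole array.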
Also, you may want to look physically for a battery backup unit (bbu). The software said there isn't one and that certainly could be a performance problem (as I would assume the controller turns off write-back caching).
/Peter
Christopher,
[root@daisy cron.daily]# tw_cli /c8 show

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
u0    RAID-5    OK      -       -       256K    11175.8   Ri     ON

VPort  Status  Unit  Size     Type  Phy  Encl-Slot  Model
p0     OK      u0    3.63 TB  SATA  0    -          WDC WD4005FZBX-00K5
p1     OK      u0    3.63 TB  SATA  1    -          WDC WD4005FZBX-00K5
p2     OK      u0    3.63 TB  SATA  2    -          WDC WD4005FZBX-00K5
p3     OK      u0    3.63 TB  SATA  3    -          WDC WD4005FZBX-00K5
You are running your RAID controller without a BBU and without write caching (see the Ri entry in the Cache column).
I get

tw_cli /c0 show

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-6    OK      -       -       256K    16763.7   RiW    ON

VPort  Status  Unit  Size     Type  Phy  Encl-Slot  Model
------------------------------------------------------------------------------
p0     OK      u0    2.73 TB  SATA  0    -          WDC WD30EURS-63SPKY0
p1     OK      u0    2.73 TB  SATA  1    -          WDC WD30EURS-63R8UY0
p2     OK      u0    2.73 TB  SATA  2    -          WDC WD30EURS-63SPKY0
p3     OK      u0    2.73 TB  SATA  3    -          WDC WD30EURS-63SPKY0
p4     OK      u0    2.73 TB  SATA  4    -          WDC WD30EURS-63SPKY0
p5     OK      u0    2.73 TB  SATA  5    -          WDC WD30EURS-63SPKY0
p6     OK      u0    2.73 TB  SATA  6    -          WDC WD30EURS-63SPKY0
p7     OK      u0    2.73 TB  SATA  7    -          WDC WD30EURS-63SPKY0

Name  OnlineState  BBUReady  Status  Volt  Temp  Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK      OK    OK    197    31-Dec-2019
with the same type of controller. When I did the initial installation, I found that write caching makes a VERY big difference in my mail server application. However, you might have problems finding a matching BBU these days if your controller is not very recent. And depending on where you are located and how reliable your power supply is, you may not want to activate write caching without a BBU.
cheers, --- Michael Schumacher
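The Ri versus RiW difference in the Cache column of the two tw_cli outputs quoted above can be checked mechanically. A minimal sketch, assuming bash and awk, and assuming (as the message above implies) that a "W" in the Cache field means the unit's write cache is enabled; the function name write_cache_state is hypothetical, and the field position is taken from the column layout shown in this thread:

```shell
#!/bin/bash
# Report per unit whether the Cache column (8th field of a unit row
# in "tw_cli /cN show" output) contains a W, taken here to mean that
# write caching is on. Reads tw_cli output on stdin.
write_cache_state() {
    awk '$1 ~ /^u[0-9]+$/ {
        print $1, ($8 ~ /W/ ? "write-cache-on" : "write-cache-off")
    }'
}

# Unit rows copied from the two outputs in this thread.
echo 'u0 RAID-5 OK - - 256K 11175.8 Ri ON'  | write_cache_state   # u0 write-cache-off
echo 'u0 RAID-6 OK - - 256K 16763.7 RiW ON' | write_cache_state   # u0 write-cache-on
```

On a real host the input would come from tw_cli itself, for example tw_cli /c8 show piped into the function, which makes it easy to drop into a cron job that warns when write caching is off.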