On Tue, Oct 6, 2009 at 11:25 AM, Stewart Williams lists-at-pinkyboots.co.uk wrote:
> I am fairly certain that this disk is failing in my server, and I am
> replacing it straight away anyway.

Good idea. Looks like it's dying.

> Oct 5 08:34:47 server1 kernel: res 41/40:00:40:1f:71/a0:00:14:00:00/00 Emask 0x409 (media error) <F>
> Oct 5 08:34:47 server1 kernel: ata1.00: status: { DRDY ERR }
> Oct 5 08:34:47 server1 kernel: ata1.00: error: { UNC }
> Oct 5 08:34:47 server1 kernel: ata1.00: cmd

I've yet to see a media error that wasn't from a dying drive.

> Oct 5 08:35:13 server1 kernel: SCSI device sda: drive cache: write through

Nice. I didn't realize that Linux would disable the unsafe-but-faster
write back cache for a slower-but-safer write through cache when errors
were detected. Or perhaps your drive is just a bit fancier than mine. :-)

> I am also getting these errors every 30 minutes:
>
> Oct 5 06:22:06 server1 smartd[3118]: Device: /dev/sda, 12 Offline uncorrectable sectors

Not good.

> Below is the smart selftest log:
> SMART overall-health self-assessment test result: PASSED

I always get a kick out of that, especially given...

> SMART Self-test log structure revision number 1
> Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Extended offline  Completed: read failure  90%        5689             526673
> # 2  Extended offline  Completed: read failure  90%        5685             526673

Two self-tests show read failures yet the status is PASSED? Ridiculous...

--
Steve
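
P.S. For anyone who would rather script this check than trust the overall-health line, here is a rough sketch in Python. It assumes the self-test log has been saved as text (for example from `smartctl -l selftest /dev/sda`) and that its rows look like the ones quoted above. The function names `failed_self_tests` and `looks_unhealthy` are made up for illustration and are not part of smartmontools. Other drives or smartctl versions may format rows differently.

```python
import re

# One self-test row: "# <num> <description and status> <NN>% <hours> <LBA or ->".
# Assumption: the description and status are captured together as one field,
# since the quoted log does not separate them reliably.
ROW = re.compile(r"^#\s*(\d+)\s+(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$")

# Rows copied from the log quoted in this thread.
SAMPLE_LOG = """\
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 5689 526673
# 2 Extended offline Completed: read failure 90% 5685 526673
"""


def failed_self_tests(log_text):
    """Return (test number, description + status, first-error LBA) per failed row."""
    failures = []
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header line or anything that is not a test row
        num, desc_status, _remaining, _hours, lba = match.groups()
        if "failure" in desc_status.lower():
            failures.append((int(num), desc_status, lba))
    return failures


def looks_unhealthy(overall_line, log_text):
    """Treat any failed self-test as bad news, even if the overall line says PASSED."""
    return "PASSED" not in overall_line or bool(failed_self_tests(log_text))
```

Calling `looks_unhealthy("SMART overall-health self-assessment test result: PASSED", SAMPLE_LOG)` flags the drive, because both logged self-tests report a read failure even though the overall assessment passed.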