On Wed, 2 Aug 2006, Alfred von Campe wrote:
Maybe the disk is dying? Did you run smartd (it requires -d ata for SATA disks; this option needs to be put in smartd.conf)?
It's a brand new disk (well, less than three months old), and it pretty much did it from the get-go (as did the previous disk). I've replaced the SATA cable and updated the system BIOS (it's a Lenovo ThinkCentre M51, and I am using the on-board SATA controller) as previously suggested. I was hoping that the errors logged with syslog would help uncover the root cause. It just so happens that the first time after I configured syslog to log to another system, the disk became unbootable.
The error messages could also indicate bad cables.
I already replaced the cable, and this is an intermittent error.
I would boot from the CentOS 4.3 Live-CD and take a look at the disk with smartctl. If the disk is indeed dying, I'd try to save its contents to a fresh disk using ddrescue. Unfortunately there are 2 programs with this name (http://www.garloff.de/kurt/linux/ddrescue/ and http://www.gnu.org/software/ddrescue/ddrescue.html); I have had very good results with the latter - don't know if it's on the LiveCD (if not, it should be!).
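For anyone following along, here is a rough sketch of what a two-pass GNU ddrescue run could look like, written as a small Python helper that only builds and prints the command lines (it does not touch any disk). The device names /dev/sdX and /dev/sdY, the mapfile name rescue.map, and the helper name ddrescue_commands are placeholders of mine, not anything from this thread. The flags follow the example in the GNU ddrescue manual; older releases call the third argument a logfile rather than a mapfile, so check the man page of whatever version is on the LiveCD.

```python
import shlex


def ddrescue_commands(source, target, mapfile="rescue.map", retries=3):
    """Return two GNU ddrescue invocations as argument lists.

    source/target are block devices (placeholders here); mapfile lets the
    second pass resume where the first one left off.
    """
    # Pass 1: grab everything that reads cleanly and skip the slow
    # scraping of bad areas (-n). -f is needed to write to a device.
    first = ["ddrescue", "-f", "-n", source, target, mapfile]
    # Pass 2: go back to the bad areas with direct disc access (-d)
    # and retry them up to `retries` times (-rN).
    second = ["ddrescue", "-d", "-f", "-r%d" % retries, source, target, mapfile]
    return [first, second]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for cmd in ddrescue_commands("/dev/sdX", "/dev/sdY"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing rather than executing is deliberate: getting source and target the wrong way round overwrites the failing disk, so it is worth reading the commands before pasting them into a root shell.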
Great idea, booting from the CD as I type this. I had tried booting the install CD in rescue mode, but that resulted in a kernel panic when it tried to mount the disk. Let's hope I have more luck with the LiveCD.
I would also recommend running a long SMART self-test on the drive. If you capture the SMART attributes before and after the test, it is actually pretty easy to locate the source of the problem (e.g., host controller vs. disk vs. bad sector) by comparing the two sets of attributes. If you want additional details, check out the following article:
http://prefetch.net/articles/diskdrives.smart.html
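As a rough illustration of the before/after comparison, the sketch below parses two saved copies of the attribute table (as printed by `smartctl -A`, captured on either side of a `smartctl -t long` run) and reports which raw values changed. The function names and the sample tables are made up for illustration and are not readings from the disk in this thread; attribute names also vary between vendors, so treat this as a starting point only.

```python
def parse_attributes(text):
    """Map attribute name -> integer raw value from `smartctl -A` output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have 10 columns;
        # the header and any other lines are skipped.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                continue  # raw value is not a plain integer
    return attrs


def changed_attributes(before, after):
    """Return {name: (old_raw, new_raw)} for attributes whose raw value changed."""
    old, new = parse_attributes(before), parse_attributes(after)
    return {name: (old[name], new[name])
            for name in old
            if name in new and old[name] != new[name]}


# Illustrative sample data only, not output from a real drive.
HEADER = "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE\n"
BEFORE = HEADER + (
    "  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0\n"
    "197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0\n"
    "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       12\n"
)
AFTER = HEADER + (
    "  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0\n"
    "197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8\n"
    "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       12\n"
)

if __name__ == "__main__":
    for name, (old, new) in sorted(changed_attributes(BEFORE, AFTER).items()):
        print("%s: %d -> %d" % (name, old, new))
```

As a rule of thumb, a rising UDMA_CRC_Error_Count tends to point at the cable or controller, while growth in Reallocated_Sector_Ct or Current_Pending_Sector points at the media itself.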
Thanks,
- Ryan
--
UNIX Administrator
http://prefetch.net