[CentOS] CentOS 4.3 occasionally locking up accessing IDE drive

Sun Apr 2 09:54:44 UTC 2006
Leo Arnts <leo at arnts.org>

Hi Bart,

I'm seeing the same problems here with an Asus P5ND2 SLI motherboard and two
ATA Maxtor 6L200P0 200GB hard drives in RAID 0+1.

The disks also check out OK, even with the Maxtor tools. I'm guessing it's a
Linux problem, maybe in one of the ATA drivers? I have seen the problem with
several kernels; I'm currently running 2.6.9-34.ELsmp-CUSTOM.

-----Original Message-----
From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On Behalf Of
Bart Schaefer
Sent: Saturday, 1 April 2006 21:17
To: CentOS mailing list
Subject: [CentOS] CentOS 4.3 occasionally locking up accessing IDE drive

For those who haven't seen my several previous postings about problems
with this (now not quite so) new PC, I have an ASUS P5N32-SLI Deluxe
motherboard.  The boot drive and primary filesystems are on an SATA
disk and I'm having no problem with that.  However, I recently plugged
in a couple of IDE drives salvaged from my old PCs and I'm running
into trouble with one of those.

The drive in question is a 20GB Maxtor 92049U6.  It had an old RH5.2
ext2 filesystem on it when I first plugged it in, from which I tried
to recover some data to back up to CD.  Mostly this worked, but I
started encountering read errors accessing some files so I unmounted
the partition and started a fsck on it.  At some point during the fsck
(I was off doing something else on another workspace at the time), the
system locked up hard, leaving the disk activity LED lit.  I had to
reset the PC.

So at that point I booted single-user and ran the fsck from there.  It
completed successfully after fixing a number of problems.  I continued
into multi-user mode, finished doing my backups, repartitioned the
drive, and started "mkfs -t ext3 -c" on the larger partition, to check
for bad blocks.  Again at some point part way through the mkfs, the
system locked up.

Back to single user mode, run the "mkfs", everything finishes fine. 
Back to multi-user mode, start to copy some large files onto the
drive.  MD5 sums fail to match for some of the copied files. 
Unmounted and started up "fsck -y".  This succeeded, after fixing a
number of errors, so (at this point just as a test case) I re-copied
the files with bad MD5s.  Some of these came through OK this time,
others still did not.  I decided perhaps this meant there were still
bad blocks on the drive that a read-only test was not finding.
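
For anyone who wants to repeat the MD5 comparison described above over a whole
directory, here is a minimal Python sketch. The function names and the two
directory arguments are hypothetical and not from the original post; it simply
hashes each source file and its copy and lists the ones that differ. Note that
a copy read back immediately may be served from the page cache rather than
from the disk, so it is probably worth unmounting and remounting the suspect
partition before checking.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(src_dir, dst_dir):
    """List relative paths whose copy is missing or differs from the source.

    src_dir and dst_dir are hypothetical names: the directory the files
    were copied from, and the matching directory on the suspect drive.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    bad = []
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_dir)
        dst = dst_dir / rel
        # A missing copy counts as a mismatch, as does a differing digest.
        if not dst.is_file() or md5sum(src) != md5sum(dst):
            bad.append(str(rel))
    return bad
```

Calling find_mismatches() with the original directory and the mount point of
the suspect partition gives the list of files to re-copy or inspect; an empty
list means every copy matched.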

You'd think I'd have learned, but encouraged by the success of the
previous fsck I optimistically started up another "fsck -c -c -y" on
the suspect partition, and this time I waited around to watch it. 
About 1.6GB into the 16GB partition, the system locked up again.

This time I booted into a hard disk diagnostic program instead of into
CentOS.  After running overnight last night, a non-destructive
read-write surface-scan reported no problems with the drive.  This
leads me to suspect that the problem is with Linux, but I don't know
how to proceed with diagnosing it.  Suggestions would be appreciated.