[CentOS] OT - Offline uncorrectable sectors

Mon Aug 25 23:13:46 UTC 2008
William L. Maltby <CentOS4Bill at triad.rr.com>

On Mon, 2008-08-25 at 15:36 -0700, Nifty Cluster Mitch wrote:
> On Mon, Aug 25, 2008 at 03:43:18PM -0400, William L. Maltby wrote:
> > On Mon, 2008-08-25 at 12:03 -0700, Nifty Cluster Mitch wrote:
> > > On Mon, Aug 25, 2008 at 07:24:24AM -0400, William L. Maltby wrote:
> > > > 
> > > ><snip>
> > 
> > > > (potentially) lost on an existing file system. Its best utility is at
> > > > FS creation and check time. It also has use if you can un-mount the FS
> > > > (ignoring the "force" capability provided) but cannot take the system
> > > > down to run manufacturer-specific diagnostic and repair software.
> > > 
> > > It might be interesting to add a "catch 22" story.
> > > 
> > > I once added -c  flags to /fsckoptions and "touch"ed /forcefsck.
> > > I had to take the disk to the lab and fix it on a bench system. 
> > 
> > YOIKS! Any explanation why such a reliable process would cause such a
> > result? Was it a long time ago with a buggy e2fsck maybe? Did you mean
> > to say you added the "-f" flag and the FS was mounted and active at the
> > time? Is it just one of those "Mysteries of the Universe"? I hate those!
> 
> The removal of /forcefsck would never happen when badblocks was run.
> Something wonky perhaps because I did have a disk with defects.
> 
> Might be worth a retry next time I need to clean and reload a machine
> but I do not know how to reproduce the disk hardware issue.
> 
> Gone are the days where disk controllers gave you the ability
> to 'expose' defects.

I don't have an available "smart" drive here at home, but I do have some
older stuff. I think we can "emulate" defects by defining a partition
that runs a few sectors beyond the end of the HD. Then run mke2fs with
-c -c and a manually specified size that includes the phantom sectors.
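
For anyone who wants to try it, here is a rough sketch of the size
arithmetic only. It assumes 512-byte sectors and a 1 KiB filesystem
block, and the function name and the example numbers are made up for
illustration. It works out the block count to pass to mke2fs and which
blocks overlap the phantom sectors, i.e. the ones I would expect to be
flagged if the idea works at all.

```python
SECTOR_BYTES = 512  # assumed sector size


def phantom_layout(disk_sectors, part_start, phantom_sectors, block_bytes=1024):
    """Size a filesystem on a partition that overruns the end of the disk.

    Returns (fs_blocks, suspect_blocks): the block count to hand to mke2fs
    and the filesystem blocks that touch at least one phantom sector.
    """
    if block_bytes % SECTOR_BYTES:
        raise ValueError("block size must be a multiple of the sector size")
    if not 0 <= part_start < disk_sectors:
        raise ValueError("partition must start on the real disk")
    if phantom_sectors < 0:
        raise ValueError("phantom sector count cannot be negative")

    sectors_per_block = block_bytes // SECTOR_BYTES
    real_sectors = disk_sectors - part_start          # sectors that exist
    total_sectors = real_sectors + phantom_sectors    # partition as defined
    fs_blocks = total_sectors // sectors_per_block    # round down to fit
    # First block containing a sector past the real end of the media.
    first_suspect = real_sectors // sectors_per_block
    return fs_blocks, range(first_suspect, fs_blocks)


if __name__ == "__main__":
    # Hypothetical disk: 1,000,000 sectors, partition starts at sector 63,
    # defined to run 8 sectors past the end.
    blocks, suspect = phantom_layout(1_000_000, 63, 8)
    print("block count for mke2fs:", blocks)
    print("blocks expected to test bad:", list(suspect))
```

The block count would go on the end of the mke2fs command line, after
the device name, with -b set to match the block size used above.
Whether the kernel will even accept a partition that overruns the disk
is part of what the test would have to show.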

When I get time (won't be RSN) I'll do both a mke2fs test and then an
e2fsck test. What I don't know is if notification of "beyond media end"
is sent by hardware and caught by drivers or if drivers just catch an
error and a bad block (sector) is presumed, to be logged and avoided.
ISTR (on SCSI anyway) that read past media end was handled. But, this
ain't SCSI! 8-)

If someone has a setup that makes this a quick and easy test to run
sooner than I'll be able to, that would be "peachy".

> <snip>

-- 
Bill