[CentOS] HDD badblocks

Sun Jan 17 17:46:51 UTC 2016
Brandon Vincent <Brandon.Vincent at asu.edu>

On Sun, Jan 17, 2016 at 10:05 AM, Matt Garman <matthew.garman at gmail.com> wrote:
> I'm not sure what's going on with your drive. But if it were mine, I'd want
> to replace it. If there are issues, that long smart check ought to turn up
> something,  and in my experience, that's enough for a manufacturer to do a
> warranty replacement.

I agree with Matt. Go ahead and run a few of the S.M.A.R.T. tests.
Based on your description of the problem, I can almost guarantee that
they will fail.

badblocks(8) is a very antiquated tool. Almost every hard drive has a
few bad sectors from the factory. Very old hard drives used to have a
list of the bad sectors printed on the label. When you first created a
filesystem, you had to enter all of the bad sectors from the label so
that the filesystem wouldn't store data there. Years later, as more
bad sectors formed, you could discover them with a tool like
badblocks(8) and add them to the filesystem's list as well.

Today, drives do all of this work automatically. The manufacturer
scans the entire surface and records the bad sectors in a list stored
on the drive itself, known as the P-list (primary defect list). The
controller on the drive automatically remaps these sectors to a small
area of spare sectors set aside for this very purpose. If more bad
sectors form later, the drive enters each one it detects into a second
list, known as the G-list (grown defect list), and remaps it to
another sector in the same spare area.

Under normal conditions, then, the end user should NEVER see bad
sectors. If badblocks(8) is reporting bad sectors, it is very likely
that enough of them have formed to exhaust the pool of reserved spare
sectors. While in theory you could run badblocks(8) and pass its
output to the filesystem, I can assure you that once bad sectors have
grown to this point, they will keep growing.
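
To make the remapping idea concrete, here is a rough toy model in
Python. It is only a sketch of the bookkeeping described above, not
how any real firmware works; the class name, method names, and sector
numbers are all made up for illustration.

```python
class ToyDrive:
    """Toy model of defect remapping. Hypothetical, not real firmware."""

    def __init__(self, user_sectors, spare_sectors, factory_defects=()):
        # Assumption: spare sectors are numbered right after the user area.
        self.spares = list(range(user_sectors, user_sectors + spare_sectors))
        self.p_list = []          # defects recorded at the factory
        self.g_list = []          # defects that grew later in the field
        self.remap = {}           # bad sector -> spare sector now used for it
        self.visible_bad = set()  # bad sectors the host can actually see
        for sector in factory_defects:
            self._retire(sector, self.p_list)

    def _retire(self, sector, defect_list):
        """Hide a bad sector behind a spare, if any spare is left."""
        if sector in self.remap or sector in self.visible_bad:
            return  # already handled
        if self.spares:
            defect_list.append(sector)
            self.remap[sector] = self.spares.pop(0)
        else:
            # Spare pool exhausted: the defect can no longer be hidden.
            self.visible_bad.add(sector)

    def mark_grown_defect(self, sector):
        """Record a defect that appeared after the drive left the factory."""
        self._retire(sector, self.g_list)

    def scan(self):
        """What a badblocks-style surface scan would report in this model."""
        return sorted(self.visible_bad)


drive = ToyDrive(user_sectors=100, spare_sectors=3, factory_defects=[7])
drive.mark_grown_defect(20)
drive.mark_grown_defect(21)
print(drive.scan())   # [] because every defect so far found a spare
drive.mark_grown_defect(22)
print(drive.scan())   # [22] because the spare pool ran out
```

In this model the scan stays empty for as long as spares remain, which
is why a non-empty result from a surface scan suggests the pool is
already used up.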

I'd stop using that hard drive, copy off any important data, and then
run the S.M.A.R.T. tests so that, if the drive is under warranty, you
can have it replaced.
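
If you want to check the reallocation counters from a script, something
along these lines might work. It is a minimal sketch that assumes
smartmontools is installed and that the drive reports the usual ATA
attribute table from `smartctl -A`. The helper names are hypothetical,
and the sample table uses made-up values, not readings from your drive.

```python
import subprocess

# Attributes that commonly reflect remapped or unreadable sectors.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable")


def parse_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` style output."""
    raw = {}
    for line in text.splitlines():
        fields = line.split(None, 9)
        # Attribute rows start with a numeric ID and have ten columns.
        if len(fields) == 10 and fields[0].isdigit():
            first = fields[9].split()[0]  # raw value may carry extra text
            if first.isdigit():
                raw[fields[1]] = int(first)
    return raw


def worrying(raw):
    """Keep only the watched attributes whose raw value is non-zero."""
    return {name: raw[name] for name in WATCHED if raw.get(name, 0) > 0}


def read_attributes(device):
    """Run smartctl on a device such as /dev/sdX (placeholder); needs root."""
    # smartctl's exit status is a bitmask, so a non-zero code is not
    # treated as an error here.
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_attributes(result.stdout)


# Illustrative sample only: the values below are invented.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   063   063   036    Pre-fail  Always       -       1544
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       21034
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
"""

print(worrying(parse_attributes(SAMPLE)))
```

The long self-test itself is started with `smartctl -t long` on the
device, and its result shows up later under `smartctl -l selftest`.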

Brandon Vincent