[CentOS] Errors on an SSD drive

Fri Aug 11 19:28:00 UTC 2017
Warren Young <warren at etr-usa.com>

On Aug 11, 2017, at 1:07 PM, Robert Nichols <rnicholsNOSPAM at comcast.net> wrote:

>> Yeah he'd want to do an fsck -f and see if repairs are made.
> 
> fsck checks filesystem metadata, not the content of files.

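Chris might have been thinking of e2fsck's -c option, which runs a badblocks scan (read-only by default, non-destructive read-write if given twice). Adding -k keeps the existing bad block list instead of replacing it.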

That’s still a poor alternative to strong data checksumming and Merkle tree structured filesystems, of course.

> LVM certainly makes the procedure harder. Figuring out what filesystem
> block corresponds to that LBA is still possible, but you have to examine
> the LV layout in /etc/lvm/backup/ and learn more than you probably wanted
> to know about LVM.
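
To give a feel for the arithmetic involved, here is a rough sketch in Python. It assumes the simplest case: a linear LV with a single segment on one PV, an LBA counted in 512-byte sectors from the start of the whole disk, and a 4 KiB filesystem block size. The function and its parameter names are made up for illustration; the values would come from the partition table and from the pe_start, extent_size, start_extent, and stripes entries in the /etc/lvm/backup/ file. Striped, mirrored, or multi-segment LVs need more work than this.

```python
SECTOR_SIZE = 512  # bytes; assumes the drive reports LBAs in 512-byte sectors


def lba_to_fs_block(lba, part_start, pe_start, extent_size,
                    seg_start_le, seg_pv_start_pe, seg_extent_count,
                    fs_block_size=4096):
    """Map a whole-disk LBA to a filesystem block number inside a linear LV.

    All sizes except fs_block_size are in sectors. Returns None when the
    LBA does not fall inside the given LV segment.
    """
    # Sector offset within the PV's data area (past the LVM metadata).
    data_sector = lba - part_start - pe_start
    if data_sector < 0:
        return None

    # Which physical extent holds the sector, and where inside it.
    pe, offset = divmod(data_sector, extent_size)
    if not (seg_pv_start_pe <= pe < seg_pv_start_pe + seg_extent_count):
        return None

    # Translate the physical extent to the LV's logical extent.
    le = seg_start_le + (pe - seg_pv_start_pe)
    lv_sector = le * extent_size + offset

    # Convert the sector offset within the LV to a filesystem block number.
    return lv_sector * SECTOR_SIZE // fs_block_size


if __name__ == "__main__":
    # Made-up numbers: partition at sector 2048, pe_start 2048,
    # 4 MiB extents (8192 sectors), segment starting at PE 0 and LE 0.
    print(lba_to_fs_block(10_000_000, 2048, 2048, 8192, 0, 0, 25600))
```

On an ext2/3/4 filesystem, the resulting block number is what you would hand to debugfs's icheck command to find the owning inode, then to ncheck to get a path name.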

Linux kernel 4.8 added a feature to XFS called reverse mapping, which is intended to solve this problem.

In principle, this will let you get a list of files that are known to be corrupted due to errors at the block layer, then fix the problem by removing or overwriting those files. The block layer, DM, LVM2, and filesystem layers will then be able to understand that those blocks are no longer corrupt, so the filesystem is fine, as are all the layers in between.

This understanding is based on a question I asked and had answered on the Stratis-Docs GitHub issue tracker:

    https://github.com/stratis-storage/stratis-docs/issues/53

We’ll see how well it works in practice.  It is certainly possible in principle: ZFS does this today.