[CentOS] Disaster recovery recommendations

Sat Oct 31 20:30:54 UTC 2015
Mark LaPierre <marklapier at gmail.com>

On 10/31/15 15:17, Valeri Galtsev wrote:
> 
> On Fri, October 30, 2015 9:31 pm, Mark LaPierre wrote:
>> On 10/30/15 17:30, Max Pyziur wrote:
>>>
>>> Greetings,
>>>
>>> I have three drives; they are all SATA Seagate Barracudas; two are
>>> 500GB; the third is a 2TB.
>>>
>>> I don't have a clear reason why they have failed (possibly due to a
>>> deep, off-brand, flakey mobo, but that's still inconclusive), but I would
>>> like to find a disaster recovery service that can hopefully recover the
>>> data.
>>>
>>> Much thanks for any and all suggestions,
>>>
>>> Max Pyziur
>>> pyz at brama.com
>>
>> If you can get them mounted on a different machine, other than the one
>> with the problem motherboard, then I suggest giving SpinRite a try.
>>
>> https://www.grc.com/sr/spinrite.htm
> 
> I listened to the guy's video. It pretty much sounds like what the command
> line utility
> 
> badblocks
> 
> does. The only valuable addition I hear is its latest one, where the
> utility flips all the bits and writes them back to the same location. In
> fact anything (containing both 0's and 1's) can be written to the sector;
> on write, the drive firmware kicks in: the drive itself reads back the
> sector it has just written, compares it to what was sent to it, and if it
> differs it labels the sector as bad (or rather the block; I used the wrong
> term, following this guy, as I was listening while typing). Anyway, this
> forces discovery and re-allocation of bad blocks. Otherwise bad blocks are
> only discovered on some read operation: if the CRC (cyclic redundancy
> check) on read doesn't match, the firmware reads the block many times and
> superimposes the read results. If it finally gets a CRC match, it happily
> writes what it came up with to the bad block relocation area and adds the
> block to the bad block re-allocation table. If after some number of reads
> the firmware doesn't come up with a CRC match, it gives up and writes
> whatever the superimposed data is. So these data are under suspicion, as
> even a CRC match doesn't mean the data is correct. This is why there are
> filesystems (ZFS, to name one) that store really sophisticated checksums
> for each of the files.
> 
> Two things can be mentioned here.
> 
> 1. If you notice that sometimes the machine (I/O actually) freezes on
> access of some file(s), it most likely means the drive firmware is
> struggling to do its magic on recovery of content and re-allocation of
> newly discovered bad blocks. Time to check and maybe replace the drive.
> 
> 2. Hardware RAIDs (and probably software RAIDs - someone chime in, I'm
> staying away from software RAIDs) have the ability to schedule a "verify"
> task. This basically goes over all sectors (or blocks) of all drives, thus:
> a. forcing the drive firmware to discover newly developed bad blocks; b. as
> a drive working on a bad block will often time out, the RAID firmware
> will kick this drive out and start rebuilding the RAID, thus re-writing
> the content of the bad block on the drive that developed it. In this case
> the information comes from the good drives, and is thus less likely to be
> corrupted. What I described is the best case scenario; a drive will not
> always time out, so even hardware RAIDs are prone to actual data
> corruption. Bottom line: it is good to migrate to something like ZFS.
> 
> Thanks.
> Valeri
> 
>>
>> It's inexpensive which makes it a low risk and not much of a loss if it
>> doesn't work.
>>
>> Also consider this a lesson learned.  The cost of a second low capacity
>> machine, including the electric bill to run it, is insignificant
>> compared to paying for data recovery.
>>
>> http://www.tigerdirect.com/applications/SearchTools/item-details.asp?EdpNo=7841915&Sku=J001-10169
>>
>> If you insist on keeping personal control of your data, like I do, then
>> that is the best way to go about it.  Use the second machine as your
>> backup.  Set it up as a NAS device and use rsync to keep your data
>> backed up.  If you're paranoid you could even locate the old clunker off
>> site at a family/friend's home and connect to it using ssh over the
>> internet.
>>
>> Your other option is to use a cloud storage service of some kind.  Be
>> sure to encrypt anything you store on the cloud on your machine first,
>> before you send it to the cloud, so that your data will be secure even
>> if someone hacks your cloud service.  There's another drawback to using
>> a cloud as your backup.  The risk is small, but you do have to realize
>> that the cloud could blow away along with your data.  It's happened
>> before.
>>
>> --
>>     _
>>    °v°
>>   /(_)\
>>    ^ ^  Mark LaPierre
>> Registered Linux user No #267004
>> https://linuxcounter.net/
>> ****
>> _______________________________________________
>> CentOS mailing list
>> CentOS at centos.org
>> https://lists.centos.org/mailman/listinfo/centos
>>
> 
> 
> ++++++++++++++++++++++++++++++++++++++++
> Valeri Galtsev
> Sr System Administrator
> Department of Astronomy and Astrophysics
> Kavli Institute for Cosmological Physics
> University of Chicago
> Phone: 773-702-4247
> ++++++++++++++++++++++++++++++++++++++++
> 

Hey Valeri,

What you say is true and should be considered when he rebuilds his system.

The point of my post was to suggest a way for the OP to recover his data
at a reasonable cost using Spinrite.

One point you may be confused about is that Spinrite does not care what
file system you have on your disk.  Spinrite does not mount the file
system.  It accesses the disk storage media one sector at a time using the
actual drive hardware/firmware to read the data from each sector.  If it
does not succeed in reading the sector it keeps trying using various
methods until it gets a read or until it is satisfied that the sector is
unreadable.
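
To make the sector-by-sector idea concrete, here is a minimal Python
sketch.  It is not how Spinrite is implemented; it only illustrates
reading a device or image file one sector at a time, without mounting
anything, and retrying a sector that fails.  The function name
scan_sectors, the 512-byte sector size and the retry count are
assumptions made for illustration.  It also reads through the operating
system's cache (Unix only, since it relies on os.pread), which a real
recovery tool would bypass.

```python
import os

SECTOR_SIZE = 512  # assumed logical sector size; many newer drives use 4096


def scan_sectors(path, sector_size=SECTOR_SIZE, max_retries=5):
    """Read a device or image file one sector at a time, ignoring any file system.

    Returns (sectors_read, bad_sectors), where bad_sectors lists the sector
    numbers that still failed after max_retries attempts.
    """
    bad_sectors = []
    sectors_read = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        total = (size + sector_size - 1) // sector_size  # round up
        for sector in range(total):
            offset = sector * sector_size
            for _attempt in range(max_retries):
                try:
                    os.pread(fd, sector_size, offset)
                    sectors_read += 1
                    break
                except OSError:
                    continue  # read error: try the same sector again
            else:
                # every attempt raised an error, so give up on this sector
                bad_sectors.append(sector)
    finally:
        os.close(fd)
    return sectors_read, bad_sectors
```

Pointed at a disk image, or at a raw device such as /dev/sdX with root
privileges, this returns how many sectors were read and which ones never
gave a successful read.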

When it gets a read it writes it back to the center of the track where
it's supposed to be and checks to be sure that it worked by reading it
back again.
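
The read, write back, and re-read cycle can be sketched in Python as
well.  Again this is only an illustration under stated assumptions: the
name rewrite_and_verify and the 512-byte sector size are made up for the
example, user-space code has no control over where on the track the
drive places the data, and the re-read may be served from the operating
system's cache rather than from the platter.  Do not run anything like
this against a failing drive that holds your only copy of the data.

```python
import os


def rewrite_and_verify(path, sector, sector_size=512):
    """Read one sector, write the same bytes back, then re-read and compare.

    Returns True if the bytes read back match what was written, and False
    if the sector lies past the end of the device or file.
    """
    offset = sector * sector_size
    fd = os.open(path, os.O_RDWR)
    try:
        data = os.pread(fd, sector_size, offset)
        if not data:
            return False  # nothing to read at this offset
        os.pwrite(fd, data, offset)  # write the same bytes to the same place
        os.fsync(fd)  # ask the OS to push the write out to the device
        return os.pread(fd, len(data), offset) == data
    finally:
        os.close(fd)
```

The contents of the sector are unchanged by a successful call; the point
is only that the write gives the drive firmware a chance to notice a
sector it cannot store reliably.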

As Spinrite progresses across the storage media the drive firmware
manages the marking of truly unrecoverable sectors as bad and the other
sectors as good.

-- 
    _
   °v°
  /(_)\
   ^ ^  Mark LaPierre
Registered Linux user No #267004
https://linuxcounter.net/
****