Using CentOS 7 to attempt recovery of failed disk


Jerry Geis

26 Sep 2020

1:05 p.m.

I have a disk that is flagging errors, and I am attempting to rescue the data.

I tried dd first. It gets through about 117G of the 320G disk and then the saved image stops growing.

Now I'm trying ddrescue and it also stops at about the same point.

Thoughts on how to continue past that point? Thanks,

Jerry


Valeri Galtsev

26 Sep

1:15 p.m.

On Sep 26, 2020, at 8:05 AM, Jerry Geis jerry.geis@gmail.com wrote:

> I have a disk that is flagging errors, attempting to rescue the data.
>
> I tried dd first - it gets about 117G of 320G disk and stops incrementing the save image any more.

Did you try

dd conv=noerror …

This flag makes dd not stop on an input error. Whatever is irrecoverable is irrecoverable, but this way you will get the data beyond the failure point.

Valeri

CentOS mailing list CentOS@centos.org https://lists.centos.org/mailman/listinfo/centos

Fred

2:01 p.m.

Also, does ddrescue "stop" (as in quit), or is it just stuck there spending a lot of time trying to read one or more bad spots? It is intended to keep trying until it gets something, or gives up and skips to the next track/sector/whatever. If you let it run for a long time (overnight?), does it proceed?


tony＠softins.co.uk

27 Sep

12:21 p.m.

In article E02FA554-9D6D-4E7D-8A78-5FBDE1DE939D@kicp.uchicago.edu, Valeri Galtsev galtsev@kicp.uchicago.edu wrote:

> did you try
>
> dd conv=noerror …
>
> this flag makes dd not stop on input error. Whatever is irrecoverable is irrecoverable, but this way you will get stuff beyond failure point.

You need conv=noerror,sync so that unreadable sectors get replaced by zeros instead of not being written out at all. Without sync, everything after the first error is written at the wrong offset, so the filesystem layout in the destination image will be wrong.

You also need bs=4096 so that ONLY the bad sector(s) get zeroed, and not the surrounding ones. If you have, say, bs=1M, then you will get a megabyte of zeros if any block within that megabyte is bad.

I'm speaking from recent experience!

Cheers Tony

-- Tony Mountifield Work: tony@softins.co.uk - http://www.softins.co.uk Play: tony@mountifield.org - http://tony.mountifield.org
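
The padding behaviour described above can be observed without a failing disk. The following is only an illustrative sketch: it feeds dd a short input from a pipe rather than a bad sector, and the helper name padded_size is made up for this example. With conv=sync, dd pads every short input block with NUL bytes up to the block size. Combined with noerror, that is what keeps later data at the right offset after a failed read, and it is also why a large bs= zeroes far more than the bad sector itself.

```shell
#!/bin/sh
# padded_size BS
# Feed 3 bytes to dd with conv=sync and print how many bytes come out.
# (padded_size is a hypothetical helper, not part of dd.)
padded_size() {
    printf 'abc' | dd bs="$1" conv=sync 2>/dev/null | wc -c | tr -d ' '
}

padded_size 4096      # one 4096-byte block: 3 data bytes + 4093 NULs
padded_size 1048576   # one 1 MiB block: almost all of it is padding
```

On a real rescue the corresponding options are the ones quoted in this thread, conv=noerror,sync bs=4096, with if= pointing at the failing device.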

Erick Perez - Quadrian Enterprises

4:27 p.m.

@tonymountifield Does this still hold true? https://superuser.com/a/1075837


-- Erick Perez Quadrian Enterprises S.A. - Panama, Republica de Panama Skype chat: eaperezh WhatsApp IM: +507-6675-5083

tony＠softins.co.uk

28 Sep

4:37 p.m.

In article CACXMG+s-KcfMV_eFRqm12asaQ8rDABL123bD+ur8jwNVLyQXjw@mail.gmail.com, Erick Perez - Quadrian Enterprises eperez@quadrianweb.com wrote:

> @tonymountifield Does this still hold true? https://superuser.com/a/1075837

It wouldn't surprise me. What I take away from those tests is that it is indeed important to use a bs= setting that corresponds to the disk physical block size, which is why I said to use bs=4096.

When I used "conv=noerror,sync bs=4096" I got an image of the correct size. That seems to correspond with what is said in the comment you linked to.

Cheers Tony
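
One way to check both points, as a rough sketch: look up the physical block size before choosing bs=, and compare the finished image size with the size the kernel reports for the source device. blockdev --getpbsz and blockdev --getsize64 are util-linux options and need root. /dev/sdX and the helper name same_size are placeholders invented for this example.

```shell
#!/bin/sh
# same_size EXPECTED_BYTES IMAGE
# Succeeds only if IMAGE is exactly EXPECTED_BYTES long.
# (same_size is a hypothetical helper for this example.)
same_size() {
    actual=$(wc -c < "$2" | tr -d ' ')
    [ "$actual" -eq "$1" ]
}

# Self-contained demonstration on a scratch file (no disk needed):
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=4096 count=8 2>/dev/null
same_size 32768 "$tmp" && echo "image size matches"
rm -f "$tmp"

# On a real system (as root; /dev/sdX is a placeholder):
#   blockdev --getpbsz /dev/sdX        # physical block size, a candidate for bs=
#   same_size "$(blockdev --getsize64 /dev/sdX)" disk.img
```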


Jerry Geis

26 Sep

5:40 p.m.

Hello

I did try "dd conv=noerror …". ddrescue doesn't stop; it just doesn't continue past a certain point, somewhere around the 117G mark. The same happens with dd: it gets to 117G and doesn't continue. I let dd run all night and it did not go past 117G.

Thanks for any suggestions.

Jerry

Fred

7:26 p.m.

Well, I'm not a noted expert on ddrescue, but my limited experience tells me that when it hits bad spots (or a big cluster of them) it can go very slowly, as it tries multiple times to read each sector (or track, I'm not sure which in this case). It keeps a list of bad spots and goes back at the end to try again to read something from them. Of course, if you've had, e.g., a head crash, there's probably nothing there to read.


Erick Perez - Quadrian Enterprises

8:30 p.m.

I suggest running dmesg -w while dd is running, to see which sector numbers fail so that you can skip them.

Also, perhaps the timeout on each read error is killing you (the default is 30 seconds) and you may have thousands of them.

On Linux, /sys/block/<deviceName>/device/timeout (such as /sys/block/sda/device/timeout) is the timeout setting in seconds, which currently defaults to 30.

As root, echo 1 > /sys/block/<deviceName>/device/timeout will change the timeout to 1 second.

Perhaps this will help you complete a dd run without waiting out the read timeouts.

Erick.
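
As a rough sketch of the timeout change described above: the function below prints the current value and then writes the new one, so the old value can be restored afterwards. The sysfs path is the one given in this message. The function name set_disk_timeout and the optional SYSFS_ROOT override (there only so the function can be tried against a scratch directory) are assumptions of this example. Writing to the real file requires root.

```shell
#!/bin/sh
# set_disk_timeout DEVICE SECONDS
# Prints the previous timeout, then writes the new one.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
set_disk_timeout() {
    f="${SYSFS_ROOT:-/sys}/block/$1/device/timeout"
    [ -w "$f" ] || { echo "cannot write $f" >&2; return 1; }
    cat "$f"
    echo "$2" > "$f"
}

# Example (as root; sdb is the failing disk named later in this thread):
#   old=$(set_disk_timeout sdb 1)
#   dmesg -w | grep -i 'I/O error' &   # watch which sectors fail while dd runs
#   ... run dd ...
#   set_disk_timeout sdb "$old" >/dev/null
```

The exact wording of the kernel's error lines varies between kernel versions, so the grep pattern may need adjusting.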


Robert Nichols

27 Sep

3:47 a.m.

On 9/26/20 12:40 PM, Jerry Geis wrote:

> Hello
>
> I did try the "dd conv=noerror …" The ddrescue - doesnt stop - it just doesnt "continue" past a certain point. Somewhere around the 117G mark - it just doesnt go past that . (same with dd, gets to 117G and just doesnt continue. I have let the dd run all night - did not go past the 117G.

You can interrupt ddrescue and then resume with "-R" (--reverse) option. That will make it start from the end of the device and read backward toward the trouble area.

-- Bob Nichols "NOSPAM" is really part of my email address. Do NOT delete it.
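
A possible sequence, sketched with placeholder names (/dev/sdX, disk.img and disk.map do not come from this thread): ddrescue can only resume a run if it was given a mapfile as its third argument, because that is where it records which areas are already rescued. The helper run_or_print is made up for this example and only prints the commands unless RUN=1 is set, so it is safe to try. -n, -R and -r are GNU ddrescue options (skip the slow phase, reverse direction, number of retry passes). Check ddrescue --help on your version, since older releases describe -n slightly differently.

```shell
#!/bin/sh
SRC=${SRC:-/dev/sdX}   # failing disk (placeholder)
IMG=${IMG:-disk.img}   # output image
MAP=${MAP:-disk.map}   # mapfile: records rescued and bad areas, needed to resume

# run_or_print CMD...
# Executes CMD only when RUN=1; otherwise just prints it (hypothetical helper).
run_or_print() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

run_or_print ddrescue -n    "$SRC" "$IMG" "$MAP"   # forward pass, skip slow areas
run_or_print ddrescue -n -R "$SRC" "$IMG" "$MAP"   # approach the bad area from the end
run_or_print ddrescue -r3   "$SRC" "$IMG" "$MAP"   # finally retry what is left
```

Because all three runs share the same mapfile, each one continues from where the previous one stopped instead of starting again.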

Jerry Geis

28 Sep

11:43 a.m.

Thanks everyone for the suggestions. I finally got a completion with this command:

dd conv=noerror,sync iflag=direct bs=4096 if=/dev/sdb of=disk.img

Copying it now to see if it worked.

Jerry

Jerry Geis

2:26 p.m.

"It is alive"! Fantastic.

So I got a new SSD (500G) to replace the OLD rotating disk (320G) and spent days trying to copy off the data... Finally got that with everyone's help. Today I copied the data to the new 500G disk and it BOOTED and is running.

Monday is way better than Friday was! Thanks and have a great day.

Jerry
