[CentOS] Buffer I/O error when booting

On Wed, 22 Jun 2005, Nikos Zaharioudakis wrote:

> try something like this.
> fdisk -l /dev/hda    (no 1,2 etc just the drive)
> if it gives you back some partition table then you have hope.

Even when it fails to return a partition table there is still hope.

I actually had this problem: fdisk -l didn't work and the partitions
couldn't be mounted read-only. But after ddrescueing the complete disk I
still had 99.983% of my disk (about 30MB of the 160GB was lost, most of it
on the recently formatted and therefore empty partition; only 2.2MB of real
data was impacted).

Strangely, fdisk -l on the copied disk worked without a problem. According
to ddrescue's output, the bad regions were right at the start of the disk:

	Bad region size 32768 at position 4096
	Bad region size 4096 at position 49152
	...

I verified fdisk's behaviour with strace: it first reads the first 512
bytes, but then strangely reads the first 8k. That second read runs into
the bad region at position 4096 and fails, so I was out of luck.
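
To make that concrete, here is a small Python sketch (mine, not anything
fdisk or ddrescue provides) that checks whether a read overlaps one of the
bad regions listed in the ddrescue output above. The function name is
made up for illustration, and the two reads are the ones seen in strace.

```python
# Byte ranges ddrescue reported as bad, as (position, size) pairs.
BAD_REGIONS = [(4096, 32768), (49152, 4096)]


def read_hits_bad_region(offset, length, bad_regions=BAD_REGIONS):
    """Return True if reading `length` bytes at `offset` touches a bad region."""
    end = offset + length
    return any(offset < pos + size and pos < end for pos, size in bad_regions)


# The two reads fdisk makes: the first 512 bytes, then the first 8k.
print(read_hits_bad_region(0, 512))    # False
print(read_hits_bad_region(0, 8192))   # True
```

The 512-byte read stays clear of the damage, but the 8k read covers bytes
4096 to 8191, which are inside the first bad region.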

> try to mount the drive with ro (read only option we do not need to
> change anything on it, we just want the data)

Don't do this after a reboot. Chances are the disk is dying: it might go
bad while you are using it, and the filesystem might not recover from the
problems anyway. If the system is still running, you might want to copy
files off the filesystem (since the disk might not come back after a
reboot). If you have already rebooted the system, use a rescue CD (Knoppix)
that has dd_rescue on it and use that to copy the complete block device.

You can also download GNU ddrescue to your Knoppix ramdisk, compile it and
use that. It is more sophisticated: you can first recover the major
working areas with a big block size and later refine the bad areas (when
the disk might be even less reliable).
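
The idea of "big blocks first, refine later" can be sketched in Python
like this. It is only an illustration of the strategy, not how ddrescue
is actually implemented; rescue(), read_at() and make_fake_disk() are
hypothetical names, and the fake disk just simulates read errors so the
sketch can run without real hardware.

```python
def rescue(read_at, total_size, big=64 * 1024, small=512):
    """Two-pass copy: big blocks first, then retry failed areas in small blocks.

    `read_at(offset, length)` is a hypothetical callback that returns exactly
    `length` bytes or raises OSError on a read error.
    Returns (image, bad), where `bad` lists the (offset, length) still unread.
    """
    image = bytearray(total_size)  # areas never read stay zero-filled
    retry = []

    # Pass 1: copy the major working areas quickly, skipping whatever fails.
    for off in range(0, total_size, big):
        length = min(big, total_size - off)
        try:
            image[off:off + length] = read_at(off, length)
        except OSError:
            retry.append((off, length))

    # Pass 2: go back over the failed areas one small block at a time.
    bad = []
    for start, length in retry:
        for off in range(start, start + length, small):
            n = min(small, start + length - off)
            try:
                image[off:off + n] = read_at(off, n)
            except OSError:
                bad.append((off, n))
    return image, bad


def make_fake_disk(data, bad_sectors, sector=512):
    """Simulated disk: any read touching a sector in `bad_sectors` fails."""
    def read_at(offset, length):
        first = offset // sector
        last = (offset + length - 1) // sector
        if any(s in bad_sectors for s in range(first, last + 1)):
            raise OSError("simulated read error")
        return data[offset:offset + length]
    return read_at


data = bytes(range(256)) * 1024          # 256 KiB of test data
image, bad = rescue(make_fake_disk(data, {10}), len(data))
print(bad)                               # [(5120, 512)]
```

Only the one unreadable 512-byte block is lost; the rest of the 64k block
that failed in the first pass is recovered in the second.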

> If this fails for any reason, try to dd the whole disk to another
> (good one) and then mounting and perhaps fsck might give you your data
> back.

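Plain dd will not work if you have bad blocks: by default it stops at the
first read error. (conv=noerror,sync makes it carry on, but it cannot go
back and refine the bad areas later.)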

Also, before running fsck you might want to make a second copy, in case
the fsck fails. You don't need to copy everything from the broken disk
again; instead, try to refine the copy of the bad areas (some reports say
putting the disk in the freezer helps?).
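
If the first copy is an image file, the second copy can be made from that
image rather than from the dying disk. A minimal Python sketch, assuming
the image is an ordinary file; backup_image() and sha256_of() are names I
made up, and the checksum comparison is just one way to confirm the copy
is identical before letting fsck loose on it.

```python
import hashlib
import shutil


def sha256_of(path, chunk=1024 * 1024):
    """Hash a file in chunks so a large disk image need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def backup_image(src, dst):
    """Copy the rescued image and verify that the copy matches the original."""
    shutil.copyfile(src, dst)
    if sha256_of(src) != sha256_of(dst):
        raise OSError("backup does not match the original image")
    return dst
```

Run fsck against one copy and keep the other untouched.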

In my case it took 18h to copy the good parts, but refining the bad areas
took about 4 seconds per bad block (512 bytes). So you might not want to
refine every bad area at first, or it could take a week (in my case 3 days
for 30MB).
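
The 3 days follow from the numbers above. A quick check in Python, where
treating 30MB as 30 * 1024 * 1024 bytes is my assumption (with 30 * 10^6
bytes it comes out slightly lower):

```python
SECONDS_PER_BLOCK = 4         # observed time to retry one bad block
BLOCK_SIZE = 512              # bytes per block
bad_bytes = 30 * 1024 ** 2    # about 30MB, taken here as 30 MiB (assumption)

blocks = bad_bytes // BLOCK_SIZE
seconds = blocks * SECONDS_PER_BLOCK
days = seconds / 86400
print(blocks, seconds, round(days, 2))   # 61440 245760 2.84
```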

--   dag wieers,  dag at wieers.com,  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]