On Mon, Aug 25, 2008 at 10:43:01AM +0200, Lorenzo Quatrini wrote:
William L. Maltby wrote:
Yep. Only a few copies of the superblock and the i-node tables are written when the file system is created. That's why it's important for file systems in critical applications to be created with the check forced. Folks should also keep in mind that the default check, read-only, is really not sufficient for critical situations. The full write/read check should be forced on *new* partitions/disks.
So again my question is: can I use dd to "test" the disk? What about
dd if=/dev/sda of=/dev/sda bs=512
Is this safe on a fully running system? Does it have to be done at runlevel 1 or with a live CD? I think this is "better" than the manufacturer's way, as dd is always present and works with any brand.
It is not safe on a mounted filesystem or devices with mounted filesystems.
The file system code working on a partition does not keep its cached data coherent with I/O done through the whole raw device.
See the -f flag in the "badblocks" man page: "-f Normally, badblocks will refuse to do a read/write or a non-destructive test on a device which is mounted, since either can cause the system to potentially crash and/or damage the filesystem even if ....."
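As a rough illustration of the "is anything on this disk mounted?" check that should come before any read/write test, here is a minimal sketch. The function name device_in_use is hypothetical, and the prefix match is an assumption that suits names like /dev/sda1; it does not detect swap, LVM or md members, so treat a "not in use" answer with suspicion.

```shell
# Report whether a disk, or any partition on it, appears in a mounts table.
# Usage: device_in_use /dev/sda [mounts_file]
# mounts_file defaults to /proc/mounts; the second argument exists only
# so the function can be tried against a sample file.
device_in_use() {
    _dev=$1
    _table=${2:-/proc/mounts}
    # The first field of each mounts line is the source device.
    # A prefix match catches the disk itself and its partitions
    # (/dev/sda matches /dev/sda and /dev/sda1).
    awk -v dev="$_dev" '
        index($1, dev) == 1 { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$_table"
}

# Example (not executed here):
#   if device_in_use /dev/sda; then
#       echo "/dev/sda has mounted filesystems; do not write-test it" >&2
#   fi
```

The function returns success (0) when the device is found in the table, so it can be used directly in an "if".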
It is also not 100% clear to me that the kernel buffer code will not treat a paired "dd" read and write of the same blocks as a no-op and skip the write.
Vendor tools on an unmounted disk operate at a raw level and also have access to the vendor-specific embedded controller commands, bypassing buffering and directly interacting with error codes, retry counts and more.
In normal operation the best opportunity to spare a sector or track is on a write. At that time the OS and the disk both have known good data, so a read after write can detect the defect/error and take the necessary action without loss of data. Some disks have read heads that follow the write heads to this end. Other disks require an additional revolution....
When "mke2fs -c -c" is invoked the second -c flag is important because the paired read/write can let the firmware on the disk map detected defects to spares. With a single "-c" flag the Linux filesystem code can assign the error blocks to a non-file (the bad block list) so that no file uses them.

A system admin who does a dd read of a problem disk may find that the OS chokes on the errors and takes the device offline. That is, this command:
dd if=/dev/sda of=/dev/sda bs=512
might not do what is expected, because a failed read can take the device offline, negating the follow-up write intended to fix things.
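To make the difference between the two mke2fs check levels concrete, here is a small sketch that only prints the command for each level and never runs it. The helper name mkfs_check_cmd and the device /dev/sdb1 are hypothetical; the real command creates a new file system, so it destroys existing data and must only be pointed at an unmounted partition.

```shell
# Print (do not run) the mke2fs command for a given level of surface check.
#   ro = single -c: fast read-only scan
#   rw = -c given twice: slower read-write scan
# Usage: mkfs_check_cmd ro|rw DEVICE
mkfs_check_cmd() {
    case $1 in
        ro) echo "mke2fs -c $2" ;;
        rw) echo "mke2fs -c -c $2" ;;
        *)  echo "usage: mkfs_check_cmd ro|rw DEVICE" >&2; return 2 ;;
    esac
}

# Example: show what would be run on a new, unmounted partition.
mkfs_check_cmd rw /dev/sdb1
```

Printing the command first gives a chance to double-check the device name before anything touches the disk.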
The tool "hdparm" is rich in info -- some flags are dangerous.
Bottom line... use vendor tools.... Vendors like error reports from their tools for RMA processing and warranty...
BTW: smartd is a good thing. For me, any disk that smartd has made noise about has failed... often with weeks or months of warning...
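For a quick manual look at what smartd is watching, "smartctl -H" from the same smartmontools package prints an overall health verdict. The sketch below pulls that verdict out of the output; the function name smart_health is hypothetical, and it assumes the ATA-style result line (SCSI disks report their status with different wording).

```shell
# Extract the overall health verdict from "smartctl -H" output on stdin.
# Assumes the ATA-style line:
#   SMART overall-health self-assessment test result: PASSED
# Prints nothing if that line is absent.
smart_health() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# Example (needs root and smartmontools; not executed here):
#   smartctl -H /dev/sda | smart_health
```

A passing verdict is not a guarantee of a healthy disk; the attribute and error logs from "smartctl -a" are worth reading as well.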