Out of the blue, dmesg on my HP ProLiant with a SCSI disk shows loads of messages like this one:
EXT3-fs error (device dm-0) in start_transaction: Journal has aborted
Then the root fs goes read-only, so little else can be done on the
machine. LVM locks up. On restart, the fs needs another reboot to
recover after fsck. The host then starts up OK, and I get a few more
minutes before the problem reappears. This is stock CentOS 4.4; I have
never managed to update it because of this very same problem.
The system logs report SCSI I/O errors, but SMART finds no problem,
and neither does badblocks (run from a rescue CD boot). SCSI cabling,
terminator, etc. have been checked.
What should I investigate next? Is the disk doomed?
TIA
--
Eduardo Grosclaude
Universidad Nacional del Comahue
Neuquen, Argentina