[CentOS] File system goes read-only once in a while

Fri Aug 1 21:29:24 UTC 2008
NiftyClusters Mitch <niftycluster at niftyegg.com>

On Fri, Aug 1, 2008 at 1:43 PM, William L. Maltby
<CentOS4Bill at triad.rr.com> wrote:
> On Fri, 2008-08-01 at 16:13 -0400, Toby Bluhm wrote:
>> Mufit Eribol wrote:
>> ><snip>
> ..... that you would correctly try to
> fsck the *device*.

First, back up your data.
It is possible to run "fsck" with a media test flag.  Bad blocks that it
finds are assigned to dummy files.  Inadvertently reading one of these
files can take a drive off line.
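
As a rough illustration of the advice above to fsck the *device*, here is
a minimal Python sketch. It assumes an ext2/ext3 filesystem, where the
media test flag is "-c" of e2fsck (a read-only badblocks scan). The
function names and the device names in the usage note are made up for the
example, and the mount check is a plain string comparison against
/proc/mounts, so it will not see through symlinks or LVM aliases.

```python
def is_mounted(device, mounts_text):
    """Return True if `device` is listed as a source in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each row reads: device mountpoint fstype options dump pass
        if len(fields) >= 2 and fields[0] == device:
            return True
    return False


def media_test_command(device, mounts_text):
    """Build an e2fsck command line with a bad-block scan for an unmounted device."""
    if is_mounted(device, mounts_text):
        raise ValueError(device + " is mounted; unmount it before running fsck")
    # -f forces a check even if the filesystem is marked clean.
    # -c makes e2fsck run badblocks(8) in read-only mode (assumes ext2/ext3).
    return ["e2fsck", "-f", "-c", device]
```

To use it, read the text of /proc/mounts, call
media_test_command("/dev/sdb1", text), and hand the returned list to
subprocess.run() as root, after the backup is done.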

One reason a device will go off line is a media error, or a condition
that "smartd" treats as a pending risk to data.  The root cause of the
error should be understood.  Smartd tends to be cautious, but it does
identify pending problems.
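
One way to look for the kind of pending problem smartd worries about is
to scan the attribute table printed by "smartctl -A".  The sketch below
is only an illustration: the shortlist of attribute names is my own
assumption about what commonly points at failing media, and the sample
rows are invented to show the column layout, not taken from a real drive.

```python
# Assumed shortlist of attributes that commonly indicate media trouble.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

# Invented sample rows in the layout of "smartctl -A" output.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""


def media_warnings(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes with a nonzero raw value."""
    warnings = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Rows read: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit() and int(raw) > 0:
                warnings[fields[1]] = int(raw)
    return warnings
```

With the invented sample, media_warnings(SAMPLE) reports only
Current_Pending_Sector.  A nonzero count there is a hint to investigate,
not proof that the drive is failing.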

One puzzle can be the loss of log file data.  It is sometimes possible
to see events on a live system that vanish after a reboot, because the
log buffers were live in memory but never written to the disk.  Sending
logs to another 'log system' can be helpful and is a good idea on
production systems for exactly this reason.
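
On a real system the usual route is to have the syslog daemon itself
forward to the log host.  Purely to illustrate the idea of getting
messages off the box before the local disk goes read-only, here is a
small Python sketch using the standard library's SysLogHandler over UDP.
The function name, the logger name, and the host name in the usage note
are placeholders, and port 514 is assumed to be where the remote syslog
listens.

```python
import logging
import logging.handlers


def make_remote_logger(name, host, port=514):
    """Return a logger that sends records to a syslog listener on host:port over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # A (host, port) tuple selects a network socket; UDP is the default transport.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

For example, make_remote_logger("disk-watch", "loghost.example.com")
followed by .warning("filesystem remounted read-only") would send that
line to the placeholder log host.  UDP delivery is not guaranteed, so
this is a convenience, not an audit trail.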

 T o m M i t c h e l l