[CentOS] Emergency rescue help needed

Thu Jan 29 17:18:08 UTC 2009
Scott Silva <ssilva at sgvwater.com>

on 1-29-2009 9:02 AM Anne Wilson spake the following:
> 2009/1/29 Scott Silva <ssilva at sgvwater.com>:
>> on 1-29-2009 8:30 AM Anne Wilson spake the following:
>>> 2009/1/29 Alex H. Vandenham <alex-qMVNeVs1MAKw5LPnMra/2Q-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org>:
>>>> On Thursday 29 January 2009 10:15:38 am Anne Wilson wrote:
>>>>> I assume that the hdd is failing - but I haven't seen any messages
>>>>> from smartmontools.  Is there any way I can check that?  If it is, I
>>>>> don't want to waste time trying to repair it.
>>>> Try smartctl to see what the drive's SMART monitoring has recorded.
>>>>
>>>> man smartctl
>>>>
>>> Thanks.  I'd been trying to remember what command I needed for that :-)
>>>
>>> The short test has completed without errors.  I'll run the long test
>>> during dinner.  Assuming that that also runs without errors, I guess
>>> that the next thing is memtest?
>>>
>>> More suggestions?
>>>
>>> Thanks
>>>
>>> Anne
>> If you had many power failures, the filesystem might just be severely trashed:
>> journals and files out of sync, etc. If a good fsck didn't fix it, you might
>> just be in for a wipe and reinstall, or many hours of finding and fixing
>> corrupted files. I would install to a new drive, and then you can take your
>> time recovering from the old drive as you find things missing. That way you
>> will still have the old system for whatever might come up. I always seem to
>> find something that didn't get backed up properly.
>>
> Two days ago I discovered that the failures had indeed totally trashed
> the system.  I did re-install, formatting only / and /boot, but I've
> had a couple of these spontaneous shutdowns since then, which is why I
> suspected hardware failure.
> 
> I've got copies of just about everything, I think, on an external
> drive, and I could try another drive as you suggest, mounting the old
> one in an external case, which I have.  I can cope with this, but I'm
> deeply unhappy about not knowing what happened, and whether it is
> likely to happen again.
> 
> Anne
Are the failures power related, or is the system just shutting down on its own?

If the latter, I would suspect either a failing power supply or a dead
processor fan letting the CPU overheat. If the former, maybe you need to
invest in an inexpensive UPS.
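
One rough way to tell the two cases apart is to look at what the login
records say. This is only a sketch: unclean_boots is a made-up helper name,
not something shipped with CentOS, and it assumes the usual `last -x` output
where boot lines start with "reboot" and orderly halts start with "shutdown".
A boot with no shutdown recorded before it means the box lost power or died
without going through the normal shutdown scripts.

```shell
#!/bin/sh
# unclean_boots: hypothetical helper, not part of any package.
# Reads `last -x` output on stdin (newest entry first) and prints how many
# boots were not preceded by a recorded shutdown.
unclean_boots() {
    awk '
        $1 == "reboot" || $1 == "shutdown" {
            # Lines arrive newest first, so "newer" holds the event that
            # happened right after the one on the current line.
            if ($1 == "reboot" && newer == "reboot") count++
            newer = $1
        }
        END { print count + 0 }
    '
}

# Typical use (reads /var/log/wtmp, so it only covers the current log period):
#   last -x shutdown reboot | unclean_boots
```

If that prints anything other than 0, the machine went down hard at least
once. It cannot say why, so it is also worth watching temperatures and fan
speeds with `sensors` from lm_sensors, assuming the board's sensor chip is
supported.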



-- 
MailScanner is like deodorant...
You hope everybody uses it, and
you notice quickly if they don't!!!!
