[CentOS] Re: reboot long uptimes?

Drew Weaver spake the following on 2/13/2007 7:03 AM:
>  
> 
> -----Original Message-----
> From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
> Behalf Of Johnny Hughes
> Sent: Tuesday, February 13, 2007 6:30 AM
> To: CentOS ML
> Subject: Re: [CentOS] reboot long uptimes?
> 
> On Tue, 2007-02-13 at 12:06 +0100, D Ivago wrote:
>> Hi,
>>
>> I was just wondering if I should reboot some servers that are running 
>> over 180 days?
>>
>> They are still stable and have no problems, and top shows no zombie 
>> processes or such, but maybe it's better for the hardware (ext3 disk 
>> checks, for example) to reboot every six months...
> 
> I only reboot on kernel upgrades, which usually happens more often than
> every 6 months.  But if you don't need to reboot for that reason, I would
> not reboot at all.
> 
>> btw, this uptime really confirms for me how stable CentOS 4.x is, 
>> so I wonder how long some people's uptimes on the list are ;)
>>
>> rmc
> 
> You should consider upgrading your kernels when security updates come
> out ... just to be safe.  Especially for machines touching the internet.
> 
> I usually upgrade my kernels because I like to use LVM snapshots for
> backups and that has only really started working semi-well since 4.3 and
> even better in 4.4 ... so most of my machines get rebooted every new
> kernel, which is at least 2-3 times a year (sometimes more often).
> 
> That being said, I do have a non-internet-facing machine that has not
> been rebooted since it was installed with CentOS-4.0 on March 1,
> 2005.  It is an internal router on my employer's infrastructure, and has
> been up for almost 2 years (and was installed on the day before CentOS-4
> was officially released).
> 
> Thanks,
> Johnny Hughes
> -------------
> 
> 	My uptimes on some of our boxes are pretty bad. We have roughly
> 250 CentOS 4.x boxes here, and I'd say probably 25% of them initially
> suffer from some sort of bug with cpuspeed which causes kernel panics
> (until we disable cpuspeed). Then we have this other curious thing that
> happens with the filesystem, where they will occasionally start spamming
> this ext3-fs "Journal Has aborted" message until we reboot the boxes
> (nothing is wrong with the hardware in any of the cases).
> 
> Other than those 75 or so issues, no problems at all.
> 
> -Drew
I have been seeing the ext3 errors also. I think it has something to do with
the crappy Adaptec RAID cards in the servers. I'm going to replace them with
3ware 9550s as soon as I can work out the migration.
