[CentOS] How do we handle panics? Bug it here or RH or ignore it (been around long time apparently)?

On Mon, 2006-09-04 at 18:57 +0200, J.J. Garcia wrote:
> On Mon, 04-09-2006 at 10:29 -0400, William L. Maltby wrote:
> > On Sat, 2006-09-02 at 14:31 -0400, William L. Maltby wrote:
> > > "Unable to handle kernel paging request". I've saved the OOPS data from
> > > the logs for 6 panics since the 4.4 update.
> > 
> > s/6/10/  # Now

s/10/11/   # Now, and my first double

> > 
> > > <snip><snip>

> > Have tried several combinations without success: swapoff, 0 in
> > /proc/sys/vm/swappiness, each of those two without the other, disabling
> > the readahead and readahead_early stuff, running in run level 3 only.
> > Nothing's worked to keep it up. Need Robo-Viagra here.
> > 
> > Am currently running with swappiness at 10, swap enabled, and both
> > readaheads disabled. This is all the stuff that was gleaned from the
> > prior CentOS list discussions.

This last configuration looked about the same, but for the double panic
now.
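
A minimal Python sketch for confirming those settings after each reboot,
assuming the usual /proc layout: /proc/sys/vm/swappiness holds a single
integer, and /proc/swaps has one header line followed by one line per
active swap area. The function names are made up for illustration.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an int."""
    return int(Path(path).read_text().strip())


def active_swap_areas(path="/proc/swaps"):
    """Return the names of the active swap areas (empty list if swap is off).

    Assumes the first line of the file is a column header.
    """
    lines = Path(path).read_text().splitlines()
    return [line.split()[0] for line in lines[1:] if line.strip()]
```

Calling read_swappiness() and active_swap_areas() with no arguments reads
the live values; passing a path lets the same code run against a saved copy.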

> > 
> > Anyone got any other things I might try? I know a fix is not yet
> > <snip>

> > One new piece of info: a lot of the OOPS, but not all, have started
> > after the machine was idle and I touch the keyboard to bring the things
> > back to life (AFAIK, just a screen-saver, blanked, going). So I turned
> > off the BIOS ACPI stuff. Since I know zilch about the ACPI stuff, I've
> > begun reading /usr/share/doc stuff (kernel params and pm) to see what I
> > might disable in there. Maybe that will help some.
> > 
> > But if anyone has a couple suggestions regarding that, I might get the
> > docs read faster (fewer boots). I would appreciate it.
> > 
> > > Enough griping for today. Do I bugzilla CentOS, RH or ignore it?
> > 
> Bill,
> 
> Maybe it is not the point..., but have you considered the option of
> running 'badblocks' (non-destructive, or destructive after dd'ing to a
> backup)? Even on the swap space?

OUTSTANDING! I had not even considered that. I was wrapped up in my own
loop: the previous mobo was bad, and it took a long time to ID that (I was
led to believe it was memory, after being led to believe it started with
seamonkey, after... you can see that by now I am tightly focused).

Anyway, both are relatively new 100GB commodity HDs. Had not thought of
that. They are S.M.A.R.T. capable. I'll do what you suggest and look at
the smartmontools output.
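
A rough Python sketch of that SMART check, assuming smartmontools is
installed and that `smartctl -A` prints its usual ten-column attribute
table (ID#, ATTRIBUTE_NAME, ..., RAW_VALUE). The helper names and the
short list of watched attributes are my own choices for illustration,
not anything smartmontools itself defines as a rule.

```python
import subprocess

# Attributes whose nonzero raw value commonly points at surface trouble.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable")


def parse_smart_attributes(text):
    """Map attribute name -> first token of RAW_VALUE from `smartctl -A` text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[fields[1]] = fields[9]
    return attrs


def suspicious(attrs, watched=WATCHED):
    """Return the watched attributes whose raw value is a nonzero integer."""
    return {name: raw for name, raw in attrs.items()
            if name in watched and raw.isdigit() and int(raw) > 0}


def check_disk(device):
    """Run `smartctl -A` on a device (needs root) and report suspicious values."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return suspicious(parse_smart_attributes(result.stdout))
```

Something like check_disk("/dev/hda") (the device name is a guess) returns
an empty dict when none of the watched counters has moved. It says nothing
about sectors the drive has not yet noticed, so a badblocks pass is still
worth doing.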

> 
> As you said, it was after the 4.4 update; new space on disk was used to
> store the new updates, maybe...

It's even more than that. It is a new install of 4.3 on a unit built
with a new mobo to replace the one that I RMA'd. Memtest86 makes it seem
that memory is still OK. CPU temps are good, etc. So a lot of these disk
surfaces have not been used before. Since I have no history, I'll check
it.

But I'm still betting on the findings from my googling. Through 2.6.15
(IIRC) the OOPS has been identified but not solved. I even saw one entry
where Linus was involved, discussing some of the options.

> 
> I'm surely wrong... but btw, sometimes when I have disk issues I also
> check temperatures for transient failures due to the mobo/processor...

I took a look at the BIOS displayed temps last reboot (no cool-down
time, I ribbitted right away). Well below temps of concern. I need to
install gkrellm for the graphical monitoring. I need to look and see
if there's a text version too.

> 
> 
> > Since I'm somewhat new here, and this is a recognized problem in the
> > community (if my googling is correct), I am still uncertain how I should
> > deal with this, other than the workaround. Can someone please respond to
> > my simple question: "Do I bugzilla CentOS, RH or ignore it?"
> > 
> > <snip>
> > 
> 
> What I'm doing (until somebody tells me otherwise) is posting bugs to the
> CentOS Bug Tracker. That is what we are using, and it is where I think we
> should file the bugs. Well, that is only my way of interpreting things...

Thanks. At least someone answered.

> 
> Good luck, Bill

Thanks. And for taking the time too.

> <snip sig stuff>

--
Bill