[CentOS] Server spontaneously rebooting under RHEL-4

Tue Mar 28 05:45:57 UTC 2006
Benjamin J. Weiss <benjamin at birdvet.org>

Hey, y'all! :)

I've got an RHEL-4 server (yep, I know it's not CentOS, but hey, we gotta
send some money RH's way to keep CentOS up and going!) that's running
Oracle 10g.  This same hardware worked just fine for over a year running
RHEL-AS-2.1 and Oracle 9i.  Now we're getting spontaneous reboots when
running Oracle processes that eat up a bunch of resources.  I don't know
where to go from here.

It's got dual Hyper-Threading processors with hyperthreading enabled, and
I understand that the 2.6 kernel used to have HT issues, but I thought
those had been solved.  The kernel we're running is 2.6.9-22.0.2.ELsmp
(yeah, not the latest; I haven't had a chance lately to test and apply
the updates).

I think the kernel settings are correct for a box with 4 GB of RAM:

[root@sibrsdbs etc]# cat sysctl.conf
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# oracle settings
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
#fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144
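
A quick sanity check on the shared memory numbers above: as far as I know,
kernel.shmall is counted in pages while kernel.shmmax is counted in bytes.
Here's a minimal sketch of the arithmetic, assuming a 4096-byte page size
(I haven't confirmed that on this box; `getconf PAGE_SIZE` would say for
sure):

```shell
#!/bin/sh
# Sketch: convert kernel.shmall (pages) to bytes so it can be compared
# with kernel.shmmax (bytes). Values are copied from the sysctl.conf above.
shmall_pages=2097152
shmmax_bytes=2147483648
page_size=4096   # ASSUMPTION: typical x86 page size, not verified here

shmall_bytes=$((shmall_pages * page_size))
shmall_mb=$((shmall_bytes / 1024 / 1024))
shmmax_mb=$((shmmax_bytes / 1024 / 1024))

echo "shmall: ${shmall_mb} MB of shared memory system-wide"
echo "shmmax: ${shmmax_mb} MB per segment"
```

If the page-size assumption holds, that's 8192 MB system-wide and 2048 MB
per segment, so the system-wide limit is above the 4 GB of physical RAM.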


I don't know how to look for the core dump, if there was one.  I don't 
see anything named 'core' in the /root directory.
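
One thing I noticed: with kernel.core_uses_pid = 1 set above, a process
core file should be named core.<PID> rather than plain 'core', and as far
as I understand it normally lands in the crashing process's working
directory, not in /root. Here's a rough sketch of a wider search. The
helper name find_cores and the example path are just my own placeholders,
and a spontaneous reboot may not leave a process core at all:

```shell
#!/bin/sh
# Sketch: list files that look like process core dumps under a directory.
# find_cores is a hypothetical helper name, not a standard tool.
find_cores() {
    # Match both "core" and "core.<PID>"; -xdev stays on one filesystem,
    # and errors from unreadable directories are discarded.
    find "$1" -xdev -type f \( -name 'core' -o -name 'core.[0-9]*' \) 2>/dev/null
}

# Example (the path is an assumption about where Oracle runs from):
# find_cores /home/oracle
```

It's probably also worth checking `ulimit -c` in the shell that starts
Oracle, since a limit of 0 means no core files get written.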

I'm sucking wind here.  Any suggestions?

Thanks!

Ben