I've had problems with nscd crashing every few days on my CentOS 4
mail server for a while now. The problems started maybe around CentOS
4.2, although I don't remember for sure. My debugging efforts led me
to disable nscd's persistent cache, and that seemed to work for a
while, but since upgrading to CentOS 4.4, the crashes have started
again.
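
For anyone comparing configurations: the persistent cache is switched
off per service in /etc/nscd.conf. A minimal sketch, assuming the
passwd, group, and hosts tables from the stock file; the exact lines
on this server may differ:

```
# /etc/nscd.conf (excerpt; the service list here is an assumption)
persistent  passwd  no
persistent  group   no
persistent  hosts   no
```

nscd has to be restarted for the change to take effect.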

When nscd crashes, the rest of the system becomes unreliable. Dovecot
hangs when new users attempt to log in (fixed or worked around by
switching Dovecot versions), and MIMEDefang (which we use for mail
filtering) starts failing when it tries to process mail.

A bit more information about our configuration: We use nss_ldap and
pam_ldap against a failover pair of Fedora Directory Servers. nscd is
also crashing on another of our servers (a fileserver) but with less
regularity; on our other servers, it works fine, even without
disabling the persistent cache.

Has anyone seen similar problems with nscd? Any tips for debugging
it? I can use gdb, strace, etc. if I have to, but that can be
time-consuming, and it's hard to take the time to do that when people are
yelling about the mail server being down. (Does CentOS provide
debuginfo RPMs?)

Thanks.

Josh Kelley