I've had problems with nscd crashing every few days on my CentOS 4 mail server for a while now. The problems started maybe around CentOS 4.2, although I don't remember for sure. My debugging efforts led me to disable nscd's persistent cache, and that seemed to work for a while, but since upgrading to CentOS 4.4, the crashes have started again.
When nscd crashes, the rest of the system becomes unreliable. Dovecot hangs when new users attempt to log in (fixed or worked around by switching Dovecot versions), and MIMEDefang (which we use for mail filtering) starts failing when it tries to process mail.
A bit more information about our configuration: We use nss_ldap and pam_ldap against a failover pair of Fedora Directory Servers. nscd is also crashing on another of our servers (a fileserver) but with less regularity; on our other servers, it works fine, even without disabling the persistent cache.
Has anyone seen similar problems with nscd? Any tips for debugging it? I can use gdb, strace, etc. if I have to, but that can be time consuming, and it's hard to take the time to do that when people are yelling about the mail server being down. (Does CentOS provide debuginfo RPMs?)
Thanks.
Josh Kelley
On Saturday 07 October 2006 13:34, Josh Kelley wrote:
A bit more information about our configuration: We use nss_ldap and pam_ldap against a failover pair of Fedora Directory Servers. nscd is also crashing on another of our servers (a fileserver) but with less regularity; on our other servers, it works fine, even without disabling the persistent cache.
Has anyone seen similar problems with nscd?
yes, me ^_^
I stopped running nscd on ALL the servers I'm in charge of. I use LDAP too, for several services (Samba, mail, DNS, proxy, etc.), and I prefer to install slave LDAP servers throughout my network instead of depending on nscd. nscd was always a pain to use. For example, when users change their password in Samba, some services continue to use the old password cached in nscd. When I move users from one LDAP group to another to grant permissions (Samba shares, mail properties, or proxy access rights), the stupid nscd doesn't refresh the changes (I need to restart the nscd service).
I prefer slave LDAP servers, and using systems that are LDAP-aware.
-- Black Hand Amiga Addicts
Hi Black Hand,
Just a comment: doesn't decreasing the cache time (time-to-live) in nscd solve this problem?
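To make the trade-off behind this question concrete, here is a minimal sketch of a time-to-live cache. It is not nscd's implementation; `TTLCache`, its `lookup` callback, and the fake clock in the demo are all hypothetical names chosen for illustration. It only shows why a changed group or password stays invisible until the cached entry expires, and why a shorter TTL (nscd.conf exposes this as `positive-time-to-live`) shrinks that window at the cost of more lookups.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Toy positive cache illustrating expiry only; NOT how nscd is written."""

    def __init__(self, lookup: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup          # hypothetical backend query (e.g. LDAP)
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now < hit[0]:
            return hit[1]              # still fresh: backend is not consulted
        value = self._lookup(key)      # expired or missing: ask the backend
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self) -> None:
        """Drop everything, comparable in effect to restarting the daemon."""
        self._entries.clear()


if __name__ == "__main__":
    directory = {"alice": "staff"}     # stand-in for the LDAP directory
    fake_now = [0.0]
    cache = TTLCache(directory.__getitem__, ttl_seconds=60,
                     clock=lambda: fake_now[0])
    print(cache.get("alice"))          # first lookup fills the cache
    directory["alice"] = "admins"      # group changed in the directory
    fake_now[0] = 30.0
    print(cache.get("alice"))          # within the TTL: old value is served
    fake_now[0] = 60.0
    print(cache.get("alice"))          # TTL elapsed: new value is fetched
```

A shorter TTL narrows the stale window but does not remove it, which matches the complaint above that only a restart picked up the change immediately.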
_______________________________________________ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Josh Kelley wrote:
I've had problems with nscd crashing every few days on my CentOS 4 mail server for a while now. The problems started maybe around CentOS 4.2, although I don't remember for sure. My debugging efforts led me to disable nscd's persistent cache, and that seemed to work for a while, but since upgrading to CentOS 4.4, the crashes have started again.
...
Has anyone seen similar problems with nscd?
Yup, another me-too. Disabling nscd was the only reliable workaround/fix. :(
-- rex
On 10/8/06, Rex Dieter rdieter@math.unl.edu wrote:
Yup, another me-too. Disabling nscd was the only reliable workaround/fix. :(
Thanks for the feedback.
Any suggestions/experience with the system unreliability I'd seen after nscd crashed? Or does that improve if nscd is disabled entirely (instead of enabled but crashed, leaving a socket behind)?
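One way to tell "disabled entirely" apart from "crashed, leaving a socket behind" is to probe the socket file directly. The sketch below is only an illustration: the function name `socket_state` is hypothetical, and the default path `/var/run/nscd/socket` is an assumption that should be checked against the local system.

```python
import os
import socket

# Assumed default location of the nscd socket; verify on your own system.
NSCD_SOCKET = "/var/run/nscd/socket"


def socket_state(path: str = NSCD_SOCKET) -> str:
    """Classify a Unix socket path as 'absent', 'stale', or 'listening'."""
    if not os.path.exists(path):
        return "absent"                # daemon never started or cleaned up
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    probe.settimeout(2.0)
    try:
        probe.connect(path)
    except OSError:
        # The file exists but the connection failed (refused, timed out,
        # or not permitted). A leftover socket from a dead daemon lands here.
        return "stale"
    finally:
        probe.close()
    return "listening"


if __name__ == "__main__":
    print(socket_state())
```

Note that "listening" only means something accepted the connection; a daemon that accepts but never answers would still be reported that way, so this is a first check rather than a health test.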
Josh Kelley
I'll add to the list: nscd on a fully updated CentOS 4.4 crashes every hour or so (this is a mail server handling 5000+ emails per hour, with RBLs, antispam, and the whole enchilada).
The only log entry I get is: nscd: 2472 invalid persistent database file "/var/db/nscd/hosts": file size does not match
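For anyone collecting these messages across several servers, a small parser can pull out which persistent database file is being rejected. This is a sketch based solely on the single log line quoted above; the names `NscdDbError` and `parse_db_error` are hypothetical, and treating the number after `nscd:` as the process ID is an assumption.

```python
import re
from typing import NamedTuple, Optional


class NscdDbError(NamedTuple):
    pid: int      # assumed to be the nscd process ID
    path: str     # persistent database file named in the message
    reason: str   # text after the colon


# Pattern derived only from the one log line quoted in this thread.
_PATTERN = re.compile(
    r'nscd: (?P<pid>\d+) invalid persistent database file '
    r'"(?P<path>[^"]+)": (?P<reason>.+)$'
)


def parse_db_error(line: str) -> Optional[NscdDbError]:
    """Return the parsed fields, or None if the line is a different message."""
    match = _PATTERN.search(line.rstrip())
    if match is None:
        return None
    return NscdDbError(int(match.group("pid")),
                       match.group("path"),
                       match.group("reason"))


if __name__ == "__main__":
    sample = ('nscd: 2472 invalid persistent database file '
              '"/var/db/nscd/hosts": file size does not match')
    print(parse_db_error(sample))
```

Counting results by `path` would show whether it is always the hosts database that is affected or whether other persistent databases are rejected too.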
So I disabled nscd. Problem fixed: no more horrible crashes from sendmail + RBL lists. I won't even bother to dig deeper; I can't do it on my production server.
When a new release of nscd arrives, or if I can test it on a lightly loaded mail server, I'll try it again.
nscd: you served me well, my friend (on 2.4 kernels), but not anymore. R.I.P.
On Sat, 2006-10-07 at 14:34 -0400, Josh Kelley wrote:
<snip>
(Does CentOS provide debuginfo RPMs?)
http://vault.centos.org/debuginfo/4/i386/
<snip sig stuff>
-- Bill