[CentOS] nscd segfaulting on centos 4.5

Wed Oct 10 14:47:03 UTC 2007
Craig White <craigwhite at azapple.com>

On Wed, 2007-10-10 at 08:16 -0400, Andy Harrison wrote:
> On 10/9/07, jlee wrote:
> > output from /var/log/messages
> > Oct  9 12:56:38 lyra kernel: nscd[11660]: segfault at 0000002b401fee8b rip 000000552aab7966 rsp 00000000408029e0 error 4
> > Oct  9 13:16:38 lyra kernel: nscd[12540]: segfault at 0000002b401fee8b rip 000000552aab7966 rsp 00000000408029e0 error 4
> 
> 
> I'm starting to have this problem as well.  I have two mail servers
> running courier and postfix.  They've been up for a couple of weeks,
> but I only put them into production on Monday this week, two days ago.
> 
> Oct  9 07:34:49 ash kernel: nscd[3455]: segfault at 0000000040201000 rip 0000555555563274 rsp 00000000401a1df0 error 6
> Oct  9 07:35:20 ash nscd: 27206 invalid persistent database file "/var/db/nscd/passwd": verification failed
> 
> 
> Oct 10 07:33:37 oak kernel: nscd[25051]: segfault at 0000000040201000 rip 0000555555563274 rsp 00000000401a73a0 error 6
> Oct 10 07:33:48 oak nscd: 29526 invalid persistent database file "/var/db/nscd/passwd": verification failed
> 
> The first time it happened, I was using the stock /etc/nscd.conf
> file.  The second time, on the other server, I had doubled the
> max-db-size passwd value to 67108864.
> 
> Both servers are running CentOS 5, firewall disabled and no SELinux.
> 
> Linux ash 2.6.18-8.el5 #1 SMP Thu Mar 15 19:46:53 EDT 2007 x86_64
> x86_64 x86_64 GNU/Linux
> 
> (24)[11:58am] # yum list nscd
> nscd.x86_64                              2.5-12                 installed
> 
> 
> 
> # ls -la /etc/ldap*
> lrwxrwxrwx 1 root root   18 Sep 27 15:14 /etc/ldap.conf -> openldap/ldap.conf
> lrwxrwxrwx 1 root root   20 Sep 27 15:14 /etc/ldap.secret -> openldap/ldap.secret
> # ls -la /etc/openldap/ldap.*
> -rw-r--r-- 1 root root 8974 Sep 27 13:55 /etc/openldap/ldap.conf
> -rw------- 1 root root   10 Sep 27 13:56 /etc/openldap/ldap.secret
> 
> 
> My ldap.conf
> # grep '^[^#]' /etc/ldap.conf
> base dc=xxxxxxx,dc=xxx
> uri ldap://ldap-1.xxxxxxx.xxx
> binddn cn=foo,ou=bar,dc=xxxxxxx,dc=xxx
> bindpw xxxxxxxx
> rootbinddn cn=foo,ou=bar,dc=xxxxxxx,dc=xxx
> scope sub
> timelimit 30
> bind_timelimit 30
> bind_policy soft
> idle_timelimit 3600
> pam_check_host_attr yes
> nss_base_passwd dc=xxxxxxx,dc=net?sub
> nss_base_shadow dc=xxxxxxx,dc=net?sub
> pam_password clear
> nss_base_group          ou=Group,dc=xxxxxxx,dc=xxx?one
> TLS_REQCERT request
> TLS_CACERT /usr/local/etc/openldap/certs/cacert.pem
> 
> The two previous servers did not have this particular problem.  They
> were not identical hardware, but had an identical OS install and config.
> 
> Any clues?
---
I don't generally use nscd any longer, but since its cache is rebuilt
dynamically, why not just stop nscd, delete the db, and then restart
the nscd service? It is certain to recreate the db on startup. (Or
perhaps move the db out of the way rather than deleting it, to be safe.)

/sbin/service nscd stop
mv /var/db/nscd/* /tmp
/sbin/service nscd start
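
If you want to keep the old files around for inspection instead of
dumping them loose into /tmp, something like the following might do. It
is only a rough sketch: the function names and the timestamped backup
directory under /var/tmp are my own assumptions, not anything nscd
requires, and /var/db/nscd is simply the path from the log messages
above.

```shell
#!/bin/sh
# Sketch: set aside nscd's persistent cache files so nscd rebuilds them.
# Function names and the backup location are hypothetical.

# Move every file in db_dir into backup_dir, creating backup_dir if needed.
backup_nscd_db() {
    db_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for f in "$db_dir"/*; do
        # Skip the unexpanded glob when the directory is empty.
        [ -e "$f" ] || continue
        mv "$f" "$backup_dir"/ || return 1
    done
}

# Stop nscd, move its db files aside, then start it again (run as root).
reset_nscd_cache() {
    /sbin/service nscd stop
    backup_nscd_db /var/db/nscd "/var/tmp/nscd-db.$(date +%Y%m%d%H%M%S)"
    /sbin/service nscd start
}
```

Source the file as root and call reset_nscd_cache; once nscd has been
running cleanly for a while, the backup directory can be removed.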

Craig