Does anyone know if there is a fix for nscd segfaulting after a short period of time? Googling for it came up with one result that suggested deleting the files in /var/db/nscd, but that didn't help. Another result was about runaway processes, which is not the problem I'm having.
They are x86_64 boxes.
output from /var/log/messages
Oct 9 12:56:38 lyra kernel: nscd[11660]: segfault at 0000002b401fee8b rip 000000552aab7966 rsp 00000000408029e0 error 4
Oct 9 13:16:38 lyra kernel: nscd[12540]: segfault at 0000002b401fee8b rip 000000552aab7966 rsp 00000000408029e0 error 4
output from dmesg
nscd[12540]: segfault at 0000002b401fee8b rip 000000552aab7966 rsp 00000000408029e0 error 4
nscd[13640]: segfault at 0000002b401fee8b rip 000000552aab7946 rsp 0000000040a039e0 error 4
output from uname
2.6.9-55.0.9.ELsmp #1 SMP Thu Sep 27 18:28:00 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
On 10/9/07, jlee wrote:
output from /var/log/messages
Oct 9 12:56:38 lyra kernel: nscd[11660]: segfault at 0000002b401fee8b rip 000000552aab7966 rsp 00000000408029e0 error 4
Oct 9 13:16:38 lyra kernel: nscd[12540]: segfault at 0000002b401fee8b rip 000000552aab7966 rsp 00000000408029e0 error 4
I'm starting to have this problem as well. I have two mail servers running courier and postfix. They've been up for a couple of weeks, but I only put them into production on Monday this week, two days ago.
Oct 9 07:34:49 ash kernel: nscd[3455]: segfault at 0000000040201000 rip 0000555555563274 rsp 00000000401a1df0 error 6
Oct 9 07:35:20 ash nscd: 27206 invalid persistent database file "/var/db/nscd/passwd": verification failed
Oct 10 07:33:37 oak kernel: nscd[25051]: segfault at 0000000040201000 rip 0000555555563274 rsp 00000000401a73a0 error 6
Oct 10 07:33:48 oak nscd: 29526 invalid persistent database file "/var/db/nscd/passwd": verification failed
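For anyone comparing the "error 4" lines from lyra with the "error 6" lines here: that number is the x86 page-fault error code. A minimal sketch for decoding it, assuming the usual bit layout (bit 0 = protection violation, bit 1 = write, bit 2 = user mode). The function name decode_segv_error is made up for illustration and is not part of any tool:

```shell
#!/bin/sh
# Decode the "error N" value from a kernel segfault line.
# Assumes the x86 page-fault error code layout:
#   bit 0: 0 = page not present, 1 = protection violation
#   bit 1: 0 = read access,      1 = write access
#   bit 2: 0 = kernel mode,      1 = user mode
# decode_segv_error is a hypothetical helper name.
decode_segv_error() {
    code=$1
    if [ $((code & 1)) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    if [ $((code & 2)) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $((code & 4)) -ne 0 ]; then mode="user"; else mode="kernel"; fi
    echo "$mode-mode $access, $cause"
}

decode_segv_error 4   # the value logged on lyra
decode_segv_error 6   # the value logged on ash and oak
```

Read this way, both reports are user-space accesses to an unmapped page (a read on lyra, a write on ash and oak). That does not say why nscd touched those addresses.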
The first time it had happened, I was using the stock /etc/nscd.conf file. The second time it happened on the other server, I had doubled the max-db-size passwd value to 67108864.
Both servers are running CentOS 5, with the firewall disabled and no SELinux.
Linux ash 2.6.18-8.el5 #1 SMP Thu Mar 15 19:46:53 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
(24)[11:58am] # yum list nscd
nscd.x86_64 2.5-12 installed
# ls -la /etc/ldap*
lrwxrwxrwx 1 root root 18 Sep 27 15:14 /etc/ldap.conf -> openldap/ldap.conf
lrwxrwxrwx 1 root root 20 Sep 27 15:14 /etc/ldap.secret -> openldap/ldap.secret
# ls -la /etc/openldap/ldap.*
-rw-r--r-- 1 root root 8974 Sep 27 13:55 /etc/openldap/ldap.conf
-rw------- 1 root root 10 Sep 27 13:56 /etc/openldap/ldap.secret
My ldap.conf
# grep '^[^#]' /etc/ldap.conf
base dc=xxxxxxx,dc=xxx
uri ldap://ldap-1.xxxxxxx.xxx
binddn cn=foo,ou=bar,dc=xxxxxxx,dc=xxx
bindpw xxxxxxxx
rootbinddn cn=foo,ou=bar,dc=xxxxxxx,dc=xxx
scope sub
timelimit 30
bind_timelimit 30
bind_policy soft
idle_timelimit 3600
pam_check_host_attr yes
nss_base_passwd dc=xxxxxxx,dc=net?sub
nss_base_shadow dc=xxxxxxx,dc=net?sub
pam_password clear
nss_base_group ou=Group,dc=xxxxxxx,dc=xxx?one
TLS_REQCERT request
TLS_CACERT /usr/local/etc/openldap/certs/cacert.pem
The two previous servers did not have this particular problem. They were not identical hardware, but they had an identical OS install and config.
Any clues?
-- Andy Harrison public key: 0x67518262
On Wed, 2007-10-10 at 08:16 -0400, Andy Harrison wrote:
Any clues?
--- I don't generally use nscd any longer but since it is a dynamic system, why not just stop nscd and delete the db and then restart nscd service since it is certain to recreate it? (or perhaps move it out of the way to be safe)...
/sbin/service nscd stop
mv /var/db/nscd/* /tmp
/sbin/service nscd start
Craig
Craig White wrote:
I don't generally use nscd any longer but since it is a dynamic system, why not just stop nscd and delete the db and then restart nscd service since it is certain to recreate it? (or perhaps move it out of the way to be safe)...
/sbin/service nscd stop
mv /var/db/nscd/* /tmp
/sbin/service nscd start
Craig
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
I tried deleting the db files on one of the boxes after seeing this on the web, but nscd segfaulted less than half an hour later. This problem seems to happen only with x86_64 boxes. Another box here is 32-bit x86 and has no issues with nscd.
I would like to drop this service but there are critical apps that require it since authentication comes through openldap. It does not seem to be hardware specific since the two x86_64 boxes have different mobo, one abit and one asus.
Logging is turned on for nscd, but nothing in the logs looks unusual, and it has been difficult to find which pid precedes the segfault.
Can malformed addresses cause nscd to segfault?
On Wed, 2007-10-10 at 10:19 -0500, jlee wrote:
Can malformed addresses cause nscd to segfault?
---- I don't know the answer to that, but it would seem that if that were the case, the problem would also exist with the i386 version.
I suppose you will have to attach an strace to the pid and then create a bugzilla entry with attached strace - probably on the upstream provider.
As for 'critical apps that require' nscd...I don't personally know of any and if we are talking about CentOS-5 which has 2.3.27 version of openldap...the 2.3.x versions are very fast and I'm not certain that nscd is of all that much benefit (but I don't know because I have never tested it out).
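The strace suggestion above could look roughly like the sketch below. It only builds and prints the command so it can be reviewed before running it as root. The helper name strace_cmd_for and the default output path /tmp/nscd.strace are made up for illustration; the strace options used (-f to follow children and threads, -tt for timestamps, -o for an output file, -p to attach to a pid) are standard.

```shell
#!/bin/sh
# Build the strace command line for attaching to a running daemon.
# strace_cmd_for and /tmp/nscd.strace are hypothetical names.
strace_cmd_for() {
    pid=$1
    out=${2:-/tmp/nscd.strace}
    # -f: follow forks/threads, -tt: timestamps, -o: log file, -p: attach to pid
    printf 'strace -f -tt -o %s -p %s\n' "$out" "$pid"
}

# Show the command for an example pid; on the affected box the pid
# would come from something like: pidof nscd
strace_cmd_for 12345
```

Leaving this attached until the next segfault should capture the last system calls before the crash, which is the part worth attaching to a bug report.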
On 10/10/07, Craig White wrote:
As for 'critical apps that require' nscd...I don't personally know of any and if we are talking about CentOS-5 which has 2.3.27 version of openldap...the 2.3.x versions are very fast and I'm not certain that nscd is of all that much benefit (but I don't know because I have never tested it out).
Can CentOS (openldap) be configured to work without nscd for file ownership over nfs mounted volumes?
-- Andy Harrison public key: 0x67518262
On Wed, 2007-10-10 at 14:20 -0400, Andy Harrison wrote:
Can CentOS (openldap) be configured to work without nscd for file ownership over nfs mounted volumes?
---- obviously, I don't understand the question because I have users mounting both their home directories and the common files via NFS and I don't use nscd...
[root@srv1 craig]# ps aux|grep nfs
root 3934 0.0 0.0 0 0 ? S< May19 0:00 [nfsd4]
root 3935 0.0 0.0 0 0 ? S May19 8:22 [nfsd]
root 3936 0.0 0.0 0 0 ? S May19 8:36 [nfsd]
root 3937 0.0 0.0 0 0 ? S May19 8:31 [nfsd]
root 3938 0.0 0.0 0 0 ? S May19 8:20 [nfsd]
root 3939 0.0 0.0 0 0 ? S May19 8:24 [nfsd]
root 3940 0.0 0.0 0 0 ? S May19 8:23 [nfsd]
root 3941 0.0 0.0 0 0 ? S May19 8:17 [nfsd]
root 3942 0.0 0.0 0 0 ? S May19 8:32 [nfsd]
root 28661 0.0 0.0 3888 728 pts/16 S+ 11:23 0:00 grep nfs
[root@srv1 craig]# service nscd status
nscd is stopped
On 10/10/07, Craig White wrote:
obviously, I don't understand the question because I have users mounting both their home directories and the common files via NFS and I don't use nscd...
But do the user accounts exist in the local passwd file or in ldap?
-- Andy Harrison public key: 0x67518262
On Wed, 2007-10-10 at 14:28 -0400, Andy Harrison wrote:
But do the user accounts exist in the local passwd file or in ldap?
---- user accounts in ldap
accounts < 500 in /etc/passwd
On 10/10/07, Craig White wrote:
user accounts in ldap
accounts < 500 in /etc/passwd
Could you provide some more detail? Until I rig up nscd, when I look at an nfs volume, I see nothing but uid's and gid's for the file ownership. Aside from 4 or 5 additional accounts, my passwd file is stock. All my accounts (50,000+) are in ldap.
-- Andy Harrison public key: 0x67518262
On Wed, 2007-10-10 at 14:54 -0400, Andy Harrison wrote:
Could you provide some more detail? Until I rig up nscd, when I look at an nfs volume, I see nothing but uid's and gid's for the file ownership. Aside from 4 or 5 additional accounts, my passwd file is stock. All my accounts (50,000+) are in ldap.
---- what's to detail?
[root@srv1 craig]# grep passwd /etc/nsswitch.conf
#passwd: db files nisplus nis
passwd: files ldap
[root@srv1 craig]# getent passwd|grep craig
craig:x:1000:100:Craig White:/home/storage/users/craig:/bin/sh
uid/gid's as numbers might be a little less convenient but still operable.
[root@srv1 craig]# ls -ld /home/storage/users/craig/
drwx--x--x 116 craig users 12288 Oct 9 14:00 /home/storage/users/craig/
but it still shows user 'craig' and user 'craig' is in ldap and nscd is indeed off
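The check above can be wrapped up as a small helper for comparing boxes. This is only a sketch: the function name nss_sources is made up for illustration, and it assumes the usual "database: source source ..." layout of /etc/nsswitch.conf.

```shell
#!/bin/sh
# Print the lookup sources configured for one nsswitch.conf database.
# nss_sources is a hypothetical helper name; the second argument lets
# you point it at a copy of the file instead of /etc/nsswitch.conf.
nss_sources() {
    db=$1
    conf=${2:-/etc/nsswitch.conf}
    # Commented-out lines start with '#', so they never match "^<db>:".
    sed -n "s/^[[:space:]]*$db:[[:space:]]*//p" "$conf"
}

# Typical use on the box being checked:
#   nss_sources passwd        # expect something like: files ldap
#   getent passwd craig       # should return the LDAP entry with nscd stopped
```

If ldap is listed for passwd and getent returns the LDAP account while nscd is stopped, name lookups for that process do not depend on nscd.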
Andy Harrison wrote:
Can CentOS (openldap) be configured to work without nscd for file ownership over nfs mounted volumes?
Andy Harrison
Problem solved (kind of). Openldap was working for logins, but not for launching certain apps, that's why nscd was installed. Launching acroread with strace showed the following.
<snip>
[2]$ strace /usr/local/Adobe/Acrobat7.0/bin/acroread 2>&1|tee| grep nss
open("/etc/nsswitch.conf", O_RDONLY) = 4
read(4, "#\n# /etc/nsswitch.conf\n#\n# An ex"..., 4096) = 1658
open("/usr/local/Adobe/Acrobat7.0/Reader/intellinux/lib/libnss_files.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/tls/libnss_files.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/libnss_files.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/libnss_files.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/libnss_files.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/i686/libnss_files.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/libnss_files.so.2", O_RDONLY) = 4
open("/usr/local/Adobe/Acrobat7.0/Reader/intellinux/lib/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/tls/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/i686/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/tls/i686/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/tls/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/tls/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/i686/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/tls/i686/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/tls/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/libnss_ldap.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
</snip>
With the i386 libs for ldap installed, acroread and the other 32-bit programs were able to look up their user IDs and run properly, so nscd was no longer needed.
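The same diagnosis can be pulled out of a long strace log without reading it by eye. A rough sketch, assuming strace's usual 'open("path", flags) = result' line format; the function name missing_nss_modules is made up for illustration:

```shell
#!/bin/sh
# Read strace output on stdin and list NSS modules that the traced
# program searched for but never managed to open.
# missing_nss_modules is a hypothetical helper name.
missing_nss_modules() {
    awk '
        match($0, /libnss_[a-z0-9_]+\.so\.[0-9]+/) {
            lib = substr($0, RSTART, RLENGTH)
            seen[lib] = 1
            # A successful open returns a non-negative descriptor, e.g. ") = 4".
            if ($0 ~ /\) = [0-9]+/) found[lib] = 1
        }
        END { for (lib in seen) if (!(lib in found)) print lib }
    ' | sort
}

# Typical use against the program that misbehaves:
#   strace /usr/local/Adobe/Acrobat7.0/bin/acroread 2>&1 | missing_nss_modules
```

Fed the trace above, this would report libnss_ldap.so.2 only, since libnss_files.so.2 was eventually found in /lib. A 32-bit binary on an x86_64 box needs the 32-bit copy of each NSS module it is configured to use.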
This did not solve the mystery of why nscd was dying; it just eliminated the need for it. Here is part of the strace on nscd (4096 is the pid). There is a lot of output above this, but the end, where it segfaults, always looks pretty much the same.
<snip>
geteuid32() = 430
open("/etc/passwd", O_RDONLY) = 4
fcntl64(4, F_GETFD) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
fstat64(0x4, 0xffffcd2c) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0x1000) = 0xfffffffff7429000
read(4, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1946
read(4, "", 4096) = 0
close(4) = 0
munmap(0xf7429000, 4096) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
exit_group(1) = ?
Process 27033 detached
</snip>
Haven't tested to see if the i386 libnss_ldap fixed the nscd issue.
on 10/10/2007 1:01 PM jlee spake the following:
Haven't tested to see if the i386 libnss_ldap fixed the nscd issue.
nscd has been flaky since CentOS 3. I had segfaults way back then. They were random and irritating, and hard to trace to a cause, so finding ways to not use it was the norm rather than the exception. I don't think nscd is as useful as it used to be with slower services like NIS.