Hi all,
I'm new to sssd configuration and debugging. We recently ran into a problem with sssd: on 6 of our 50 servers, 'getent passwd' no longer returns any of the user IDs from the LDAP backend, while the other servers are fine.
My sssd is version 1.8.0-32, and the related error messages are included below. It looks like sssd_nss got killed after a temporary network connection problem to the backend OpenLDAP servers. Why would that happen? And can we change the backend retry check interval? (See the timestamps of the log entries in sssd_nss.log.)
[root@testbox sssd]# cat sssd_nss.log
(Sat Mar 2 02:30:41 2013) [sssd[nss]] [sss_dp_init] (0x0010): Failed to connect to monitor services.
(Sat Mar 2 02:30:41 2013) [sssd[nss]] [sss_process_init] (0x0010): fatal error setting up backend connector
(Sat Mar 2 02:30:41 2013) [sssd[nss]] [sss_dp_init] (0x0010): Failed to connect to monitor services.
(Sat Mar 2 02:30:41 2013) [sssd[nss]] [sss_process_init] (0x0010): fatal error setting up backend connector
(Sat Mar 2 02:30:41 2013) [sssd[nss]] [sss_dp_init] (0x0010): Failed to connect to monitor services.
(Sat Mar 2 02:30:41 2013) [sssd[nss]] [sss_process_init] (0x0010): fatal error setting up backend connector
(Sat Mar 2 02:30:41 2013) [sssd[nss]] [sss_dp_init] (0x0010): Failed to connect to monitor services.
(Sat Mar 2 02:30:41 2013) [sssd[nss]] [sss_process_init] (0x0010): fatal error setting up backend connector

[root@testbox sssd]# cat sssd_pam.log
(Sat Mar 2 02:30:09 2013) [sssd[pam]] [pam_dp_reconnect_init] (0x0010): Could not reconnect to ldap provider.
(Sat Mar 2 02:30:39 2013) [sssd[pam]] [pam_dp_reconnect_init] (0x0010): Could not reconnect to ldap provider.

[root@testbox sssd]# cat sssd_ldap.log
(Sat Mar 2 02:30:53 2013) [sssd[be[ldap]]] [id_callback] (0x0010): The Monitor returned an error [org.freedesktop.DBus.Error.NoReply]

[root@testbox sssd]# cat sssd.log
(Sat Mar 2 02:30:41 2013) [sssd] [mt_svc_exit_handler] (0x0010): Process [nss], definitely stopped!
[root@testbox sssd]#
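
In case it helps anyone line up the timestamps across these four logs, here is a rough Python sketch that merges entries into a single timeline. It is not part of sssd. It assumes every entry follows the "(timestamp) [process] [function] (level): message" layout seen in the logs above and that the timestamps use English day and month names. The names parse_line and merge_timeline are made up for illustration.

```python
import re
import sys
from datetime import datetime

# Assumed entry layout, taken from the log excerpts in this post:
# (Sat Mar 2 02:30:41 2013) [sssd[nss]] [sss_dp_init] (0x0010): message
LINE_RE = re.compile(
    r"^\((?P<ts>[^)]+)\)\s+"          # timestamp in parentheses
    r"\[(?P<proc>.+?)\]\s+"           # process tag, may contain nested brackets
    r"\[(?P<func>[^\]]+)\]\s+"        # function name
    r"\((?P<level>0x[0-9a-fA-F]+)\):\s*"  # debug level bitmask
    r"(?P<msg>.*)$"                   # rest of the line
)


def parse_line(line):
    """Return a dict for one log entry, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    # Collapse repeated spaces (e.g. "Mar  2") before parsing the timestamp.
    ts_text = " ".join(entry["ts"].split())
    try:
        entry["ts"] = datetime.strptime(ts_text, "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return None
    return entry


def merge_timeline(lines):
    """Parse all lines and return the matching entries sorted by timestamp."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    # sorted() is stable, so entries with equal timestamps keep input order.
    return sorted(entries, key=lambda e: e["ts"])


if __name__ == "__main__":
    # Hypothetical usage: pass the log file paths as arguments.
    lines = []
    for path in sys.argv[1:]:
        with open(path, errors="replace") as fh:
            lines.extend(fh)
    for e in merge_timeline(lines):
        print(e["ts"].isoformat(sep=" "), e["proc"], e["func"], e["level"], e["msg"])
```

Saved under any file name and run with the four log files as arguments, it prints one merged, time-ordered listing, which makes it easier to see which process reported a failure first.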
Could anyone shed some light on this? Thanks a lot.
--Gelen