[CentOS] CentOS Stream 8 sssd.service failing part of sssd-common-2.8.1-1.el8.x86_64 baseos package

Thu Jan 19 16:32:54 UTC 2023
Jelle de Jong <jelledejong at powercraft.nl>

On 1/13/23 11:52, Leon Fauster via CentOS wrote:
> Am 13.01.23 um 05:34 schrieb Orion Poplawski:
>> On 12/30/22 04:06, Jelle de Jong wrote:
>>> On 12/27/22 22:55, Gordon Messmer wrote:
>>>> On 2022-12-25 07:44, Jelle de Jong wrote:
>>>>> A recent update of the sssd-common-2.8.1-1.el8.x86_64 package is 
>>>>> causing sssd.service systemctl failures all over my CentOS machines.
>>>> ...
>>>>> [sssd] [confdb_expand_app_domains] (0x0010): No domains configured, 
>>>>> fatal error! 
>>>>
>>>>
>>>> Were you previously using sssd?  Or is the problem merely that it is 
>>>> now reporting an error starting a service that you don't use?
>>>>
>>>> Are there any files in /etc/sssd/conf.d, or does /etc/sssd/sssd.conf 
>>>> exist?  If so, what are the contents of those files?
>>>>
>>>> What are the contents of /usr/lib/systemd/system/sssd.service?
>>>>
>>>> If you run "journalctl -u sssd.service", are there any log entries 
>>>> older than the package update?
>>>
>>> I have a monitoring system for failing services and I suddenly started 
>>> getting dozens of notifications that sssd was failing on all my CentOS 
>>> systems. This started after the sssd package update, which caused this 
>>> regression. The SSSD services were not actually in use, but some of the 
>>> common libraries are used.
>>>
>>> # systemctl status sssd
>>> ● sssd.service - System Security Services Daemon
>>>     Loaded: loaded (/usr/lib/systemd/system/sssd.service; enabled; 
>>> vendor preset: enabled)
>>>     Active: failed (Result: exit-code) since Sat 2022-12-24 06:14:10 
>>> UTC; 6 days ago
>>> Condition: start condition failed at Fri 2022-12-30 11:02:01 UTC; 4s ago
>>>             ├─ ConditionPathExists=|/etc/sssd/sssd.conf was not met
>>>             └─ ConditionDirectoryNotEmpty=|/etc/sssd/conf.d was not met
>>>   Main PID: 3953157 (code=exited, status=4)
>>>
>>> Warning: Journal has been rotated since unit was started. Log output 
>>> is incomplete or unavailable.
>>> # ls -halt /etc/sssd/conf.d/
>>> total 8.0K
>>> drwx--x--x. 2 sssd sssd 4.0K Dec  8 13:08 .
>>> drwx------. 4 sssd sssd 4.0K Dec  8 13:08 ..
>>> # ls -halZ /etc/sssd/conf.d/
>>> total 8.0K
>>> drwx--x--x. 2 sssd sssd system_u:object_r:sssd_conf_t:s0 4.0K Dec  8 
>>> 13:08 .
>>> drwx------. 4 sssd sssd system_u:object_r:sssd_conf_t:s0 4.0K Dec  8 
>>> 13:08 ..
>>> # ls -halZ /etc/sssd/sssd.conf
>>> ls: cannot access '/etc/sssd/sssd.conf': No such file or directory
>>>
>>> # journalctl -u sssd.service --lines 100000
>>> -- Logs begin at Mon 2022-12-26 22:15:31 UTC, end at Fri 2022-12-30 
>>> 11:05:26 UTC. --
>>> -- No entries --
>>>
>>> Kind regards,
>>>
>>> Jelle de Jong
>>
>> I don't quite understand where this:
>>     Main PID: 3953157 (code=exited, status=4)
>>
>> came from.  It seems like sssd was started at some point and failed, 
>> but that shouldn't have happened, because:
>>
>> Condition: start condition failed at Fri 2022-12-30 11:02:01 UTC; 4s ago
>>              ├─ ConditionPathExists=|/etc/sssd/sssd.conf was not met
>>              └─ ConditionDirectoryNotEmpty=|/etc/sssd/conf.d was not met
>>
>> It's telling you that the service was not started because the start 
>> conditions were not met: /etc/sssd/sssd.conf does not exist and 
>> /etc/sssd/conf.d is empty.  This is as expected in your case.
>>
>> If you don't want it to even check, just disable the service:
>>
>> systemctl disable sssd.service
>>
> 
> 
> Before doing this; @OP: what's the output of:
> 
> # authselect current

# authselect current
Profile ID: sssd
Enabled features: None

I wrote the following Ansible code to automate disabling the sssd 
service. I still consider this a regression, as it just started 
appearing on all my systems.

- name: get sssd service status
  ansible.builtin.systemd:
    name: sssd.service
  register: sssd

- name: disable and stop sssd.service when it has failed
  ansible.builtin.systemd:
    name: sssd.service
    enabled: false
    state: stopped
  when:
    - sssd.status.ActiveState is defined
    - sssd.status.ActiveState == "failed"

- name: systemctl reset-failed
  command: systemctl reset-failed
  args:
    warn: false
  when:
    - sssd.status.ActiveState is defined
    - sssd.status.ActiveState == "failed"
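
For completeness, a minimal wrapper playbook to apply this on all hosts 
could look something like the sketch below. The file names 
disable-sssd-tasks.yml and disable-failed-sssd.yml and the "all" host 
pattern are just examples from my setup, adjust to taste:

# disable-failed-sssd.yml
- hosts: all
  become: true
  tasks:
    # the three tasks above, saved as disable-sssd-tasks.yml next to the playbook
    - ansible.builtin.include_tasks: disable-sssd-tasks.yml

Which I then run with something like:

ansible-playbook -i inventory disable-failed-sssd.yml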