Simon Matter via CentOS wrote:
>
>> We are seeing a problem that occurs ~5% of the time when rebooting
>
> I see such issues on a quite large multi user system but when this
> happens, after forced restarts for kernel updates, I usually don't have
> the time to analyze and play doctor on it. My "solution" now is to simply
> reboot the server again in such a case, AKA the systemd way :-)
>
>> CentOS 7.7 where systemd gets a 'Connection timed out' to D-Bus just
>> after the D-Bus service starts - from 'journalctl -x' :
>>
>> ...
>> Jan 21 16:09:59 linux7-7.mpc.local systemd[1]: Started D-Bus System
>> Message Bus.
>> -- Subject: Unit dbus.service has finished start-up
>> -- Defined-By: systemd
>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>> --
>> -- Unit dbus.service has finished starting up.
>> --
>> -- The start-up result is done.
>> Jan 21 16:10:24 linux7-7.mpc.local systemd[1]: Failed to register match
>> for Disconnected message: Connection timed out
>> Jan 21 16:10:24 linux7-7.mpc.local systemd[1]: Failed to initialize
>> D-Bus connection: Connection timed out
>> ...
>>
>> This then has a knock-on effect that causes other services to fail - e.g.
>>
>> -- Unit gdm.service has begun starting up.
>> Jan 21 16:10:39 linux7-7.mpc.local dbus[817]: [system] Activating
>> systemd to hand-off: service name='org.freedesktop.login1'
>> unit='dbus-org.freedesktop.login1.service'
>> Jan 21 16:10:50 linux7-7.mpc.local dbus[817]: [system] Failed to
>> activate service 'org.freedesktop.systemd1': timed out
>> Jan 21 16:10:50 linux7-7.mpc.local systemd-logind[1221]: Failed to
>> enable subscription: Failed to activate service
>> 'org.freedesktop.systemd1': timed out
>> Jan 21 16:10:50 linux7-7.mpc.local systemd-logind[1221]: Failed to fully
>> start up daemon: Connection timed out
>> Jan 21 16:10:50 linux7-7.mpc.local systemd[1]: systemd-logind.service:
>> main process exited, code=exited, status=1/FAILURE
>> Jan 21 16:10:50 linux7-7.mpc.local systemd[1]: Failed to start Login
>> Service.
>> -- Subject: Unit systemd-logind.service has failed
>> -- Defined-By: systemd
>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>> --
>> -- Unit systemd-logind.service has failed.
>> --
>> -- The result is failed.
>>
>> Whatever the issue is, it appears that polkit might be involved - if we
>> restart the polkit service, things appear to return to normal (e.g. gdm
>> starts up etc)
>>
>> We can't find any similar reports of this happening elsewhere with
>> CentOS 7.7 - but we were wondering if anyone else had come across a
>> problem like this?
>
> I think the root of the problem is that there are missing definitions in
> some of the systemd scripts. They allow things to work in 95% or greater
> of the cases but this happens by chance, not because of perfect process
> handling and system control. Small delays somewhere or uncommon system
> environments then lead to intermittent failures which are difficult to
> diagnose - at least for me.
>
> The good news is that you can just fiddle with the systemd scripts the
> same way we fiddled with init scripts in the past. That way you can try
> and error until you find a solution.
> Doesn't sound like being in full
> control of things but better than not finding a solution at all.

Yeah, we found that introducing a small delay before the ExecStart in
the dbus.service unit - even a delay of just 0.01 seconds (via
'ExecStartPre=/usr/bin/sleep 0.01') _seems_ to work around the issue ...

However, we would still like to know what the issue is and get a 'real'
fix - I guess we could try creating a bug report with Red Hat ...

Thanks

James Pearson
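
For reference, the delay mentioned above could be applied as a systemd drop-in rather than by editing the packaged dbus.service file. This is only a sketch of the workaround, not a fix: the drop-in file name (delay.conf) is an arbitrary choice made here for illustration, and the sleep value is simply the one quoted above.

```ini
# /etc/systemd/system/dbus.service.d/delay.conf
# (the file name "delay.conf" is hypothetical; any *.conf name works)
[Service]
# Workaround only: a short pause that runs before the unit's ExecStart
ExecStartPre=/usr/bin/sleep 0.01
```

After creating the file, 'systemctl daemon-reload' makes systemd re-read the unit; the delay then applies the next time dbus.service starts, i.e. on the next boot.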