On an up-to-date CentOS 7 system, I am running named-sdb (pulling domain records from MySQL), which is segfaulting randomly (after 3-8 hours or so it appears) in libmysqlclient (I've opened a bug).
Since this is an internal service, until the segfault can be addressed, I wanted systemd to restart it for me. I created a file called /etc/systemd/system/named-sdb-chroot.service.d/service.conf with:
[Service]
Restart=always
in it. I did "systemctl daemon-reload" and restarted the service, but systemd is not restarting it. It knows it died, because that's in the log (and it ran the chroot cleanup service):
Aug 25 12:03:39 dnssql systemd: named-sdb-chroot.service: main process exited, code=killed, status=11/SEGV
Aug 25 12:03:39 dnssql systemd: named-sdb-chroot.service: control process exited, code=exited status=1
Aug 25 12:03:39 dnssql systemd: Unit named-sdb-chroot.service entered failed state.
Aug 25 12:03:39 dnssql systemd: named-sdb-chroot.service failed.
Aug 25 12:03:39 dnssql systemd: Stopping Set-up/destroy chroot environment for named-sdb...
Aug 25 12:03:39 dnssql systemd: Stopped Set-up/destroy chroot environment for named-sdb.
I see the file listed in the output of systemd-delta, so I know it is being seen. Any idea why systemd isn't restarting it?
Once upon a time, Chris Adams linux@cmadams.net said:
> Since this is an internal service, until the segfault can be addressed, I wanted systemd to restart it for me. I created a file called /etc/systemd/system/named-sdb-chroot.service.d/service.conf with:
> [Service]
> Restart=always
<snip>
> I see the file listed in the output of systemd-delta, so I know it is being seen. Any idea why systemd isn't restarting it?
Just for follow-up: the problem is apparently because of how systemd handles the interaction between named-sdb-chroot.service and named-sdb-chroot-setup.service - something about that (maybe the BindsTo in the -setup.service) keeps systemd from restarting the service.
For this particular setup, the chroot didn't really add significant security, so I dropped it for the "regular" named-sdb.service, and systemd does restart that on failure as expected.
Now back to the real bug - why does named-sdb segfault in libmysqlclient.so after a few hours...
On Aug 25, 2016, at 1:43 PM, Chris Adams linux@cmadams.net wrote:
> named-sdb (pulling domain records from MySQL), which is segfaulting randomly
named-sdb used to be (~7+ years ago) single-threaded only and would crash if threads were enabled. Did you change named to NOT thread? Does named-sdb still do single-threaded only?
Once upon a time, Steven Tardy sjt5atra@gmail.com said:
> On Aug 25, 2016, at 1:43 PM, Chris Adams linux@cmadams.net wrote:
>> named-sdb (pulling domain records from MySQL), which is segfaulting randomly
> named-sdb used to be (~7+ years ago) single-threaded only and would crash if threads were enabled. Did you change named to NOT thread? Does named-sdb still do single-threaded only?
I'm just using it "out of the box" from the CentOS 7 bind-sdb RPM.
It was running okay, but the system hadn't been updated in a while (again, internal system), and after it was updated this week, named started crashing. I don't know if it's a problem on the BIND side or the MariaDB side - both were updated:
bind-sdb:     9.9.4-14.el7_0.1 -> 9.9.4-29.el7_2.3
mariadb-libs: 5.5.40-2.el7_0   -> 5.5.50-1.el7_2
Given the changes in bind-sdb are fairly minor (CVE patches), while mariadb-libs rebased to a new release (granted, minor version change), I'm more inclined to think that's where the problem lies, although it could just be a combination of the two.
Now that I think about it though - the VM running this was also updated from 1 CPU core to 2, so named-sdb probably went from one thread to two... maybe you're right. I'll check that out.
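One low-effort way to test the threading theory, as a sketch rather than a confirmed fix: named's -n option sets the number of worker threads, so forcing it to 1 should approximate the old single-core behaviour. This assumes the stock named-sdb.service reads OPTIONS from /etc/sysconfig/named and passes it to the daemon, which should be checked against the unit file first.

```shell
# /etc/sysconfig/named
# Assumption: the named-sdb unit sources this file and passes $OPTIONS to the daemon.
# -n 1 asks named to create a single worker thread regardless of CPU count.
OPTIONS="-n 1"
```

If the segfaults stop after a restart with this setting, the jump from one CPU core to two looks like the likelier trigger than the mariadb-libs update.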