Hi all!
I'm stuck on something really bizarre that is happening to a product
I "own" at work. It's a C program, built on CentOS, that runs on CentOS
or RHEL. It has been in circulation since the early 2000s and is in use
at hundreds of sites.
Recently, at multiple customer sites, it has started just going away.
No core file (yes, ulimit is configured), and nothing in any of its
(several) log files. It's just gone.
Running it under strace until it dies reveals that every thread has
received a SIGKILL.
How does one figure out who delivered a SIGKILL? For other, non-fatal,
signals it is possible to glean the PID of the sending process in a
signal handler, but obviously you can't do that for SIGKILL because
the app doesn't survive the signal.
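
To make the catchable-signal case concrete, here is a minimal sketch of
the technique I mean: a handler installed with SA_SIGINFO that records
the sender's PID from siginfo_t. The names record_sender,
install_sender_logger, last_signo and last_sender_pid are hypothetical,
made up for this post, not anything from the real product.

```c
/* Request POSIX declarations if no feature-test macro is set yet. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical globals: filled in by the handler, read by normal code.
 * sig_atomic_t is used so the stores are safe inside a signal handler;
 * this assumes a PID fits in sig_atomic_t, which holds on Linux. */
static volatile sig_atomic_t last_signo = 0;
static volatile sig_atomic_t last_sender_pid = 0;

/* Three-argument handler form, selected by SA_SIGINFO.
 * info->si_pid is the sending process's PID for signals sent with kill(). */
static void record_sender(int signo, siginfo_t *info, void *ucontext)
{
    (void)ucontext;
    last_signo = signo;
    last_sender_pid = (sig_atomic_t)info->si_pid;
}

/* Install record_sender for one signal. Returns 0 on success, -1 on error.
 * For SIGKILL (and SIGSTOP) sigaction() fails with EINVAL, because those
 * signals cannot be caught. */
int install_sender_logger(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_sender;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);

    return sigaction(signo, &sa, NULL);
}
```

Calling install_sender_logger(SIGTERM) at startup and logging
last_sender_pid from the main loop works for SIGTERM, SIGUSR1 and
friends. The same call with SIGKILL just returns -1, which is exactly
the problem.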
I'm grasping at straws here, and am open to almost any kind of
suggestion that can be followed up (as compared to "beats me", which
is where I am now).
I'm even wondering if systemd has something to do with it.
Thanks in advance!
--
---- Fred Smith -- fredex(a)fcshome.stoneham.ma.us -----------------------------
But God demonstrates his own love for us in this:
While we were still sinners,
Christ died for us.
------------------------------- Romans 5:8 (niv) ------------------------------