Fred Smith wrote:
>
> Hi all!
>
> I'm stuck on something really bizarre that is happening to a product
> I "own" at work. It's a C program, built on CentOS, runs on CentOS or
> RHEL, has been in circulation since the early 00's, is in use at
> hundreds of sites.
>
> recently, at multiple customer sites it has started just going away.
> no core file (yes, ulimit is configured), nothing in any of its
> (several) log files. it's just gone.
>
> running it under strace until it dies reveals that every thread has
> been given a SIGKILL.
>
> How does one figure out who delivered a SIGKILL? For other, non-fatal,
> signals it is possible to glean the PID of the sending process in a
> signal handler, but obviously you can't do that for SIGKILL because
> the app doesn't survive the signal.
>
> I'm grasping at straws here, and am open to almost any kind of
> suggestion that can be followed-up (as compared to "beats me" which
> is where I am now).
>
> I'm even wondering if systemd has something to do with it.

I had an issue a few years ago where 'something' was killing processes - I found it by writing a simple LD_PRELOAD hack that intercepted kill(2) and logged what it was doing via syslog before doing the actual kill - and used /etc/ld.so.preload to get it loaded by every process ...

James Pearson
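
For anyone wanting to try the same approach, here is a minimal sketch of that kind of LD_PRELOAD shim. It is not the original code. The file name killlog.c, the library name libkilllog.so, the helper format_kill_record() and the log message layout are all made up for illustration. It will only catch dynamically linked processes that go through libc's kill() wrapper, so it will not see direct syscalls, tgkill()/pthread_kill(), statically linked binaries, or a SIGKILL that comes from the kernel itself (the OOM killer, for example).

```c
/*
 * killlog.c - hypothetical LD_PRELOAD shim that logs every kill(2) call.
 *
 * Assumed build command (names are illustrative):
 *   gcc -shared -fPIC -o libkilllog.so killlog.c -ldl
 */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for RTLD_NEXT in <dlfcn.h> */
#endif
#include <dlfcn.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <syslog.h>
#include <unistd.h>

/* Hypothetical helper: format one log record into buf, snprintf-style. */
int format_kill_record(char *buf, size_t len, pid_t sender, pid_t parent,
                       pid_t target, int sig)
{
    return snprintf(buf, len,
                    "kill: sender pid=%ld ppid=%ld target pid=%ld sig=%d",
                    (long)sender, (long)parent, (long)target, sig);
}

/*
 * Same signature as libc's kill(), so a preloaded copy is resolved first.
 * It logs the caller and the target, then hands off to the real kill().
 */
int kill(pid_t pid, int sig)
{
    static int (*real_kill)(pid_t, int);
    char record[128];

    if (real_kill == NULL) {
        /* Look up the next definition of kill() after this one (libc's). */
        real_kill = (int (*)(pid_t, int))dlsym(RTLD_NEXT, "kill");
        if (real_kill == NULL) {
            errno = ENOSYS;
            return -1;
        }
    }

    format_kill_record(record, sizeof record, getpid(), getppid(), pid, sig);
    syslog(LOG_USER | LOG_NOTICE, "%s", record);

    return real_kill(pid, sig);
}
```

To load it into every new process, put the full path of the built library on a line of its own in /etc/ld.so.preload. Be careful with that file: a broken library listed there can stop programs from starting, so it is worth testing with LD_PRELOAD on a single command first.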