Martin Hewitt wrote:
Hi Mark,
I've exhausted the Java avenues for debugging this issue. Since my last email, though, the process I pointed strace at has been killed. I'm afraid the rather raw format of the strace file is lost on me. The last six lines of the output file are:
clone(child_stack=0x4202a250,
At a guess, looks like it's creating a child process. <snip>
futex(0x4202a9d0, FUTEX_WAIT, 23241, NULL) = -1 EINTR (Interrupted system call)
--- SIGHUP (Hangup) @ 0 (0) ---
futex(0x2ab0b620a000, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigreturn(0x2ab0b620a000) = -1 EINTR (Interrupted system call)
futex(0x4202a9d0, FUTEX_WAIT, 23241, NULL <unfinished ... exit status 129>
The SIGHUP is new information, and appears to be what's causing the java app to exit. Surely Java should be aware of the Interrupted system call?
There are no other signals in the output file, and the only EINTRs are in the passage above.
Does the exit status of 129 say anything other than SIGHUP?
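On the exit status: by the usual Linux convention, a status above 128 means 128 plus the number of the signal that ended the process, so 129 corresponds to signal 1, which is SIGHUP, and carries no information beyond that. A minimal sketch of that arithmetic follows; the class and method names are made up for illustration, and the signal table covers only a few common Linux signal numbers.

```java
public class ExitStatus {
    /**
     * Returns the signal number implied by the "128 + n" convention,
     * or -1 if the status looks like an ordinary exit code.
     */
    static int signalFromExitStatus(int status) {
        return (status > 128 && status < 256) ? status - 128 : -1;
    }

    /** Describes an exit status; names only a few common Linux signals. */
    static String describe(int status) {
        int sig = signalFromExitStatus(status);
        if (sig < 0) {
            return "exit code " + status;
        }
        switch (sig) {
            case 1:  return "signal 1 (SIGHUP)";
            case 2:  return "signal 2 (SIGINT)";
            case 9:  return "signal 9 (SIGKILL)";
            case 15: return "signal 15 (SIGTERM)";
            default: return "signal " + sig;
        }
    }

    public static void main(String[] args) {
        // The status reported at the end of the strace output above.
        System.out.println(describe(129));
    }
}
```

So the 129 agrees with the SIGHUP line in the trace rather than pointing at a second cause.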
Looks like I need to delve back into Java...
Yeah. I think you need to try what I was suggesting: start wrapping function calls in try/catch, and work your way down. When you find the one it fails in, go into that function, er, method, and wrap the calls in there (and/or even put a writeln in a few choice spots), until you find the exact function the SIGHUP (or whatever) is happening in.
mark "why, yes, I *was* a developer longer than I've been an admin"
Martin
On 10 February 2011 19:37, m.roth@5-cent.us wrote:
Hey, Martin,
Martin Hewitt wrote:
Thanks, I didn't know about the strace command, so that's useful. Fortunately, this is on a dedicated server, so there's a fair amount of free disk.
<snip> If you can do the code changes (and the try/catch is *supposed* to be in there, according to java style), work your way down, y'know...
main
... try { first actual call to do the job } catch (Exception e) { writeln the error; }
And if it fails there, then you know; otherwise, go to the next main call, sorry, "invocation of a method"....
Then do it again, this time in each of the function calls under that one, and step down until you find the function it's dying in. That'll give you a much better handle on what's happening.
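The stepping-down approach described above could be sketched roughly like this. Everything here is hypothetical: `StepTracer`, `step`, `fetchFeeds`, and `storeArticles` are stand-ins, not names from the real crawler, and the code sticks to Java 6 syntax (anonymous classes, no lambdas). Each step logs a line on entry and on exit, so even if the process stops without any exception being thrown, the last ENTER line still shows which call it was in.

```java
import java.util.ArrayList;
import java.util.List;

public class StepTracer {
    /** One line per event; the last entry shows where the run stopped. */
    static final List<String> LOG = new ArrayList<String>();

    /** Records a line in memory and writes it to stderr. */
    static void log(String line) {
        LOG.add(line);
        System.err.println(line);
    }

    /**
     * Runs one step, logging entry, normal completion, and any Throwable.
     * Returns true if the step finished without throwing.
     */
    static boolean step(String name, Runnable body) {
        log("ENTER " + name);
        try {
            body.run();
            log("OK    " + name);
            return true;
        } catch (Throwable t) {
            // Catching Throwable is deliberate here: this is diagnostic code.
            log("FAIL  " + name + ": " + t);
            return false;
        }
    }

    // Hypothetical stand-ins for the application's real top-level calls.
    static void fetchFeeds() { }

    static void storeArticles() {
        throw new IllegalStateException("database unavailable");
    }

    public static void main(String[] args) {
        step("fetchFeeds", new Runnable() {
            public void run() { fetchFeeds(); }
        });
        step("storeArticles", new Runnable() {
            public void run() { storeArticles(); }
        });
    }
}
```

Once a step shows up as failing (or as the last ENTER with no matching OK), the same wrapper can be moved inside that method around each of its own calls, and so on down.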
Thanks for the help.
Good luck.
mark
Martin
On 10 February 2011 18:58, m.roth@5-cent.us wrote:
Martin Hewitt wrote:
Hi all,
I'm running CentOS 5.5 Final, Java version "1.6.0_17" OpenJDK Runtime Environment (IcedTea6 1.7.5) (rhel-1.16.b17.el5-x86_64) OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode) installed via Yum.
We have a java application, packaged as a jar, running on our servers which, periodically, crawls RSS feeds and writes the articles to a database.
Randomly, and seemingly without cause, these processes will die, not through the application exiting, or due to my killing it, but due to something that seems to kill without leaving a trace.
<snip> The hard (but correct) way would be to put try {} catch in the code, and work your way down. Trying to debug it using a debugger might be real problematical, if you can't repeatably provoke it. I *suppose* you could attach strace to it, and dump the o/p into a file (on a filesystem with a *lot* of disk space)....
mark
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos