[CentOS] CentOS 5.5 Java Process Death

Fri Feb 11 08:53:42 UTC 2011
Martin Hewitt <martin.hewitt at gmail.com>

Hi Keith,

Interesting idea. I've built the Sun SDK on one server and left the yum-installed version on the other, and I've started the same Java application on both servers under strace, so I'll see if there's any difference.

Thanks for all the help,

Martin

On 11 Feb 2011, at 07:05, Keith Roberts wrote:

> On Fri, 11 Feb 2011, Martin Hewitt wrote:
> 
>> To: CentOS mailing list <centos at centos.org>
>> From: Martin Hewitt <martin.hewitt at gmail.com>
>> Subject: Re: [CentOS] CentOS 5.5 Java Process Death
>> Hi Mark,
>> 
>> I've exhausted the Java avenues for debugging this issue, but, since
>> my last email, the process I pointed strace at has been killed, but
>> I'm afraid the rather raw format of the strace file is lost on me.
>> The last six lines of the output file are:
> 
> Do you have different versions of Java from different vendors installed? I don't use IcedTea as it's not always 100% compatible. Try to use just *one* vendor's version of Java as your active Java installation. I only use Sun's SDK, as I have noticed problems using other vendors' versions.
> 
> HTH
> 
> Keith Roberts
> 
> 
>> clone(child_stack=0x4202a250, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x4202a9d0, tls=0x4202a940, child_tidptr=0x4202a9d0) = 23241
>> futex(0x4202a9d0, FUTEX_WAIT, 23241, NULL) = -1 EINTR (Interrupted system call)
>> --- SIGHUP (Hangup) @ 0 (0) ---
>> futex(0x2ab0b620a000, FUTEX_WAKE_PRIVATE, 1) = 1
>> rt_sigreturn(0x2ab0b620a000)            = -1 EINTR (Interrupted system call)
>> futex(0x4202a9d0, FUTEX_WAIT, 23241, NULL <unfinished ... exit status 129>
>> 
>> The SIGHUP is new information, and appears to be what's causing the
>> java app to exit. Surely Java should be aware of the Interrupted
>> system call?
>> 
>> There are no other signals in the output file, and the only EINTRs are
>> in the passage above.
>> 
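The exit status 129 in the trace above fits the usual "128 + signal number" convention, and SIGHUP is signal 1. The following is a minimal sketch, not taken from the application discussed in this thread, of how a Java program could log its own shutdown and decode such a status. The class and method names (HangupDiagnostics, signalFromExitStatus, installShutdownLogger) are hypothetical. It assumes the JVM's default signal handling is in effect, i.e. the JVM was not started with -Xrs.

```java
// Sketch only: class and method names are hypothetical, not from the thread.
class HangupDiagnostics {

    // By convention an exit status of 128 + N means "ended because of signal N".
    // Returns the signal number, or -1 if the status looks like a normal exit code.
    static int signalFromExitStatus(int status) {
        return (status > 128 && status < 256) ? status - 128 : -1;
    }

    // Registers a hook that the JVM runs during an orderly shutdown. With the
    // default signal handling, SIGHUP, SIGINT and SIGTERM trigger that shutdown.
    static Thread installShutdownLogger() {
        Thread hook = new Thread(new Runnable() {
            public void run() {
                System.err.println("JVM shutting down at " + new java.util.Date());
            }
        }, "shutdown-logger");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        installShutdownLogger();
        System.out.println("exit status 129 -> signal "
                + signalFromExitStatus(129));
    }
}
```

With a hook like this in place, a shutdown caused by a signal at least leaves a timestamped line in the application's stderr, which can then be matched against the strace output.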
>> Looks like I need to delve back into Java...
>> 
>> Martin
>> 
>> On 10 February 2011 19:37,  <m.roth at 5-cent.us> wrote:
>>> Hey, Martin,
>>> 
>>> Martin Hewitt wrote:
>>>> 
>>>> Thanks, I didn't know about the strace command, so that's useful.
>>>> Fortunately, this is on a dedicated server, so there's a fair amount
>>>> of free disk.
>>> <snip>
>>> If you can do the code changes (and the try/catch is *supposed* to be in
>>> there, according to java style), work your way down, y'know...
>>> 
>>> main
>>> 
>>> ...
>>> try {
>>>     firstCallToDoTheJob();   // first actual call to do the job
>>> } catch (Exception e) {
>>>     e.printStackTrace();     // write out the error
>>> }
>>> 
>>> And if it fails there, then you know; otherwise, go to the next main call,
>>> sorry, "invocation of a method"....
>>> 
>>> Then again, this time in each of the main function calls under that, and
>>> step down until you find the function it's dying in. That'll give you a
>>> much better handle on what's happening.
>>> 
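The step-down approach described above can be sketched roughly as follows. This is an illustration only: CrawlerMain, crawlFeeds and runStep are hypothetical names standing in for the real application's code, and the simulated failure exists only so the example has something to catch. Catching Throwable rather than Exception is a choice made here so that Errors (for example OutOfMemoryError) are reported too.

```java
// Sketch only: all names here are hypothetical stand-ins for the real application.
class CrawlerMain {

    // Stand-in for the first real call made from main.
    static void crawlFeeds() throws Exception {
        throw new IllegalStateException("simulated failure while crawling");
    }

    // Runs one step and reports any failure instead of letting it vanish.
    // Returns true if the step completed normally, false if it threw.
    static boolean runStep(String name, java.util.concurrent.Callable<Void> step) {
        try {
            step.call();
            return true;
        } catch (Throwable t) { // Throwable also covers Errors, not just Exceptions
            System.err.println("step '" + name + "' failed: " + t);
            t.printStackTrace();
            return false;
        }
    }

    public static void main(String[] args) {
        // Catch-all for exceptions that escape from any thread.
        Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread thread, Throwable t) {
                System.err.println("uncaught in " + thread.getName() + ": " + t);
            }
        });

        runStep("crawlFeeds", new java.util.concurrent.Callable<Void>() {
            public Void call() throws Exception {
                crawlFeeds();
                return null;
            }
        });
    }
}
```

Once the failing step is known, the same wrapper can be moved one level down, around the calls made inside that step. Note that a try/catch only reports failures raised inside the JVM; a process ended by an external signal will not pass through it.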
>>>> Thanks for the help.
>>>> 
>>> Good luck.
>>> 
>>>        mark
>>>> Martin
>>>> 
>>>> On 10 February 2011 18:58,  <m.roth at 5-cent.us> wrote:
>>>>> Martin Hewitt wrote:
>>>>>> Hi all,
>>>>>> 
>>>>>> I'm running CentOS 5.5 Final, Java version "1.6.0_17" OpenJDK Runtime
>>>>>> Environment (IcedTea6 1.7.5) (rhel-1.16.b17.el5-x86_64) OpenJDK 64-Bit
>>>>>> Server VM (build 14.0-b16, mixed mode) installed via Yum.
>>>>>> 
>>>>>> We have a java application, packaged as a jar, running on our servers
>>>>>> which, periodically, crawls RSS feeds and writes the articles to a
>>>>>> database.
>>>>>> 
>>>>>> Randomly, and seemingly without cause, these processes will die, not
>>>>>> through the application exiting, or due to my killing it, but due to
>>>>>> something that seems to kill without leaving a trace.
>>>>> <snip>
>>>>> The hard (but correct) way would be to put try {} catch in the code, and
>>>>> work your way down. Trying to debug it using a debugger might be real
>>>>> problematical, if you can't repeatably provoke it. I *suppose* you could
>>>>> attach strace to it, and dump the o/p into a file (on a filesystem with a
>>>>> *lot* of disk space)....
>>>>> 
>>>>>        mark
>>>>> 
>>>>> _______________________________________________
>>>>> CentOS mailing list
>>>>> CentOS at centos.org
>>>>> http://lists.centos.org/mailman/listinfo/centos
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>> 
> 
> -- 
> -----------------------------------------------------------------
> Websites:
> http://www.karsites.net
> http://www.php-debuggers.net
> http://www.raised-from-the-dead.org.uk
> 
> All email addresses are challenge-response protected with
> TMDA [http://tmda.net]
> -----------------------------------------------------------------