On Fri, 2005-11-25 at 10:58 -0500, Mark Belanger wrote:
Sam Drinkard wrote:
> List,
> 
>    I've got one or more processes that have been exiting on signal 11,
> but not all the time.  I suspect it's memory related, and I was
> wondering, is there any way to tell exactly how much memory a particular
> process has in use at the point it gets the SIGSEGV?  I can't sit here
> and watch top or the system monitor, but thought maybe something might
> be saved somewhere after the fact.  The processes are using, at the
> moment, 1.4 GB of memory and no swap.  I've not seen swap go active
> during any of these runs, and the machine has 2 GB of memory installed.
> It's rather hard to track down the exact activity when it happens.

After the process starts, strace it:
strace -p `ps -ef |grep YourProc |grep -v grep | awk '{print $2}'`

-Mark
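
One caveat with a ps | grep pipeline like the one above: grep matches any line whose text contains the name, so a launcher script that merely has the program name among its arguments is selected too. The following is only a sketch of a stricter filter. It assumes the usual procps `ps -ef` layout (PID in field 2, command in field 8), and the function name pids_by_command is made up for illustration.

```shell
#!/bin/sh
# pids_by_command NAME   (hypothetical helper, not part of any package)
# Reads `ps -ef`-style lines on stdin and prints the PID (field 2) of each
# process whose command (field 8) has basename NAME. Assumes the procps
# column layout: UID PID PPID C STIME TTY TIME CMD.
pids_by_command() {
    awk -v name="$1" '
        $2 ~ /^[0-9]+$/ {                 # skips the header line
            n = split($8, parts, "/")     # basename of the command
            if (parts[n] == name) print $2
        }'
}

# Possible use (one strace per PID, -f also follows children):
#   ps -ef | pids_by_command wrf.exe
#   strace -f -p <one of the PIDs printed above>
```

Because only the command field is compared, a line such as "/bin/sh mpirun ... wrf.exe" is not selected, while "./wrf.exe" is.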

Not sure if this will be of any use. It certainly isn't to me, as I don't even begin to understand it, but here's the output from the strace. It did not attach to the wrf.exe process, which is what I wanted. Instead it attached to the MPI process that was running wrf.exe, or at least to one of them. There are two at runtime.

Sam

[rob@thunder static]$ strace -p `ps -ef|grep wrf.exe|grep -v grep | awk '{print $2}'`
Process 1503 attached - interrupt to quit
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 1583
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(-1, 0x7fbfffe5c4, WNOHANG, NULL)  = -1 ECHILD (No child processes)
rt_sigreturn(0xffffffffffffffff)        = 0
rt_sigaction(SIGINT, {SIG_DFL}, {0x432b60, [], SA_RESTORER, 0x2a955a6280}, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
stat("/bin/rm", {st_mode=S_IFREG|0755, st_size=41168, ...}) = 0
access("/bin/rm", X_OK)                 = 0
rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0
fork()                                  = 1672
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {0x432b60, [], SA_RESTORER, 0x2a955a6280}, {SIG_DFL}, 8) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 1672
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(-1, 0x7fbfffe7a4, WNOHANG, NULL)  = -1 ECHILD (No child processes)
rt_sigreturn(0xffffffffffffffff)        = 0
rt_sigaction(SIGINT, {SIG_DFL}, {0x432b60, [], SA_RESTORER, 0x2a955a6280}, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
munmap(0x2a95557000, 4096)              = 0
exit_group(0)                           = ?
Process 1503 detached
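
The trace above ends in exit_group(0) after a fork and wait for /bin/rm, so the process that was traced exited normally and is not the one receiving signal 11. For the original question of how much memory a process had in use shortly before it died, one option is to sample /proc/PID/status in the background and read the last log entry afterwards. This is a rough sketch, not a tested tool: the function names vm_fields and log_mem are made up, and which Vm* fields appear depends on the kernel version.

```shell
#!/bin/sh
# vm_fields FILE   (hypothetical helper)
# Print the VmPeak, VmSize and VmRSS entries of a /proc/PID/status-style
# file on one line. Fields the kernel does not provide are simply absent.
vm_fields() {
    awk '/^(VmPeak|VmSize|VmRSS):/ {
             out = out sep $1 " " $2 " " $3   # e.g. "VmRSS: 1400000 kB"
             sep = "  "
         }
         END { print out }' "$1"
}

# log_mem PID LOGFILE [INTERVAL]   (hypothetical helper)
# Append a timestamped sample every INTERVAL seconds (default 5) until
# /proc/PID/status is no longer readable, i.e. the process is gone.
log_mem() {
    pid=$1
    log=$2
    interval=${3:-5}
    while [ -r "/proc/$pid/status" ]; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
            "$(vm_fields "/proc/$pid/status")" >> "$log"
        sleep "$interval"
    done
}

# Possible use, with the PID of one wrf.exe process filled in:
#   log_mem <PID> /tmp/wrf-mem.log 5 &
```

The last line in the log is then the most recent sample taken before the process went away, at most one interval old.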