[CentOS] Cron script crashing server...

Tue Oct 4 12:35:42 UTC 2005
Mark Belanger <mark_belanger at ltx.com>

Ian mu wrote:
 > Hiya, redid the crontab file and still crashed, had the strace running
 > on it and compared the run without cron to the run from cron. It does
 > actually run the stats script so was misleading before, so I'm guessing
 > previously just hadn't got as far as outputting it to file or something.

Hmm... if you're echoing in the stats script, the output
should show up somewhere.

 > Crashing server from cron strace, 10297 is the script, 10296 is a mrtg
 > running from a different account which runs fine whatever...

First off, I'd remove mrtg from the equation (cron) and strip it down
to just your program.

I guess then I'd focus on the portion of your script that
follows "### MODS ##################".

You might try different strace flags - make sure you are following
all forks.
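
As a rough sketch of what I mean by following forks, something
like the following could wrap the stats script. The script path
and the trace file name are placeholders I made up, so substitute
your own:

```shell
#!/bin/sh
# Hypothetical paths: replace with the real stats script and a
# writable location for the trace file.
STATS_SCRIPT=${STATS_SCRIPT:-/usr/local/bin/stats.sh}
TRACE_OUT=${TRACE_OUT:-/tmp/stats.trace}

# -f   follow children created by fork/vfork/clone
# -tt  prefix each line with the time of day, including microseconds
# -s   print up to 256 bytes of each string argument (default is 32)
# -o   write the trace to a file instead of stderr
if command -v strace >/dev/null 2>&1 && [ -x "$STATS_SCRIPT" ]; then
    strace -f -tt -s 256 -o "$TRACE_OUT" "$STATS_SCRIPT"
else
    echo "would run: strace -f -tt -s 256 -o $TRACE_OUT $STATS_SCRIPT"
fi
```

The larger -s value should also keep strace from cutting off
path names and other strings in the output.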

One trick I've used before is to change the stats script so
that, instead of calling the perl program directly, it opens
the program inside the perl debugger.  For instance, if the
stats script is running
/usr/local/bin/myscript.pl
You could change it to:
typeset -x DISPLAY=mydisplay:0
xterm -e 'perl -d /usr/local/bin/myscript.pl'

Once it comes up, you can just continue to see if the
crash happens and if so you could step through the code
until you find the offending lines.  If you're a "debug with print"
kind of guy you could still do the xterm trick but just insert
sleeps and prints into the perl program to narrow down
the exact line that is causing the problem.  After you realize how
much time was wasted with print statements, you can spend a couple
of hours learning the perl debugger :)
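
If no X display is reachable from cron, a variation on the same
idea (not the xterm trick itself) is to let the debugger run
non-interactively and log every statement it executes.  This is
only a sketch: the log file name is a placeholder I made up, and
the program path is the example one from above.

```shell
#!/bin/sh
# Hypothetical paths: substitute the real program and a writable log file.
PROG=${PROG:-/usr/local/bin/myscript.pl}
TRACE_LOG=${TRACE_LOG:-/tmp/myscript.linetrace}

# NonStop=1    run without stopping at a debugger prompt
# AutoTrace=1  print each statement as it is executed
# LineInfo=    send that listing to a file instead of the terminal
PERLDB_OPTS="NonStop=1 AutoTrace=1 LineInfo=$TRACE_LOG"
export PERLDB_OPTS

if [ -r "$PROG" ]; then
    perl -d "$PROG"
    # The last lines of the log show the statements run just before exit.
    tail -n 5 "$TRACE_LOG"
else
    echo "no such program: $PROG" >&2
fi
```

If the machine goes down hard, the tail end of the log may not
make it to disk, so treat the last line recorded as an
approximate location rather than an exact one.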

I'm afraid I'm running out of helpful (and intelligent)
feedback on this problem.  One thing I did notice: the mrtg script
seemed to go through a fair amount of gyrations to find
integer.pm, though that's probably not related to the problem.

 >
 > 10297 read(4, ",\n            \'*\' => \'ae181a\',\n "..., 4096) = 4096
 > 10297 read(4, ">[0] eq \'COMT\' and $r >= $floor;"..., 4096) = 4096
 > 10297 brk(0)                            = 0x870a000
 > 10297 brk(0x872b000)                    = 0x872b000
 > ENOENT (No such file or directory)
 > -1 ENOENT (No such file or directory)
 > file or directory)
 > such file or directory)
 > 0xbfff9ba0) = -1 ENOENT (No such file or directory)
 > O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
 > (No such file or directory)
 > (Inappropriate ioctl for device)
 > 10297 read(4,  <unfinished ...>
 > 10297 <... read resumed> "\n}\n\n### MODS ###################"..., 4096)
 > = 3988
 > = 3266
 > 10297 read(4, "", 4096)                 = 0
 > 10297 close(4)                          = 0
 > 10297 open("", O_RDONLY|O_LARGEFILE <unfinished ...>
 > <strace stopped here>

-Mark


-- 
Mark Belanger
LTX Corporation