On 11.05.2017 at 20:30, Larry Martell wrote:
> On Wed, May 10, 2017 at 3:19 PM, Larry Martell <larry.martell at gmail.com> wrote:
>> On Wed, May 10, 2017 at 3:07 PM, Jonathan Billings <billings at negate.org> wrote:
>>> On Wed, May 10, 2017 at 02:40:04PM -0400, Larry Martell wrote:
>>>> I have a CentOS 7 system that I run a home grown python daemon on. I
>>>> run this same daemon on many other systems without any incident. On
>>>> this one system the daemon seems to die or be killed every day around
>>>> 3:30am. There is nothing in its log or any system logs that tell me
>>>> why it dies. However in /var/log/messages every day I see something
>>>> like this:
>>>
>>> How are you starting this daemon?
>>
>> I am using code something like this: https://gist.github.com/slor/5946334.
>>
>>> Can you check the journal? Perhaps
>>> you'll see more useful information than what you see in the syslogs?
>>
>> Thanks, I will do that.
>
> Thank you for that suggestion. I was able to get someone to run
> journalctl and send me the output and it was very interesting.
>
> First, there is logging going on continuously during the time when
> logging stops in /var/log/messages.
>
> Second, I see messages like this periodically:
>
> May 10 03:57:46 localhost.localdomain python[40222]: detected
> unhandled Python exception in
> '/usr/local/motor/motor/core/data/importer.py'
> May 10 03:57:46 localhost.localdomain abrt-server[40277]: Only 0MiB is
> available on /var/spool/abrt
> May 10 03:57:46 localhost.localdomain python[40222]: error sending
> data to ABRT daemon:
>
> This happens at various times of the day, and I do not think it is
> related to the daemon crashing.
>
> But I did see one occurrence of this:
>
> May 09 03:49:35 localhost.localdomain python[14042]: detected
> unhandled Python exception in
> '/usr/local/motor/motor/core/data/importerd.py'
> May 09 03:49:35 localhost.localdomain abrt-server[22714]: Only 0MiB is
> available on /var/spool/abrt
> May 09 03:49:35 localhost.localdomain python[14042]: error sending
> data to ABRT daemon:
>
> And that is the daemon. But I only see that on this one day, and it
> crashes every day.
>
> And I see this type of message frequently throughout the day, every day:
>
> May 09 03:40:01 localhost.localdomain CROND[21447]: (motor) CMD
> (python /usr/local/motor/motor/scripts/image_mover.py -v1 -d
> /usr/local/motor/data > ~/last_image_move_log.txt)
> May 09 03:40:01 localhost.localdomain abrt-server[21453]: Only 0MiB is
> available on /var/spool/abrt
> May 09 03:40:01 localhost.localdomain python[21402]: error sending
> data to ABRT daemon:
> May 09 03:40:01 localhost.localdomain postfix/postdrop[21456]:
> warning: uid=0: No space left on device
> May 09 03:40:01 localhost.localdomain postfix/sendmail[21455]: fatal:
> root(0): queue file write error
> May 09 03:40:01 localhost.localdomain crond[2630]: postdrop: warning:
> uid=0: No space left on device
> May 09 03:40:01 localhost.localdomain crond[2630]: sendmail: fatal:
> root(0): queue file write error
> May 09 03:40:01 localhost.localdomain CROND[21443]: (root) MAIL
> (mailed 67 bytes of output but got status 0x004b)
>
> So it seems there is a space issue.
>
> And finally, coinciding with the time that the logging resumes in
> /var/log/messages I see this every day at that time:
>
> May 10 03:57:57 localhost.localdomain
> run-parts(/etc/cron.daily)[40293]: finished mlocate
> May 10 03:57:57 localhost.localdomain anacron[33406]: Job `cron.daily'
> terminated (mailing output)
> May 10 03:57:57 localhost.localdomain anacron[33406]: Normal exit (1 job run)
>
> I need to get my remote hands to get me more info.

df -hT; df -i

There is no space left on a vital partition / logical volume.

"Only 0MiB is available on /var/spool/abrt"
"postdrop: warning: uid=0: No space left on device"

Alexander
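
If the remote hands cannot easily run shell commands, roughly the same check could be done from Python. This is only a minimal sketch of what `df` and `df -i` report, not a replacement for them. It covers free bytes and free inodes but not the filesystem type that `-T` adds. The function names, the 1 MiB threshold and the list of paths are my own assumptions. Only /var/spool/abrt comes from the log above.

```python
import os
import shutil


def free_space_report(path):
    """Return free bytes and free inodes for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # total/used/free in bytes, like `df`
    st = os.statvfs(path)            # inode counters, like `df -i`
    return {
        "path": path,
        "total_bytes": usage.total,
        "free_bytes": usage.free,
        "total_inodes": st.f_files,
        "free_inodes": st.f_favail,
    }


def is_full(report, min_free_bytes=1024 * 1024):
    """True if under min_free_bytes are free or all inodes are used.

    The 1 MiB default is an arbitrary assumption. Filesystems that
    report zero total inodes are not treated as out of inodes.
    """
    no_inodes = report["total_inodes"] > 0 and report["free_inodes"] == 0
    return report["free_bytes"] < min_free_bytes or no_inodes


if __name__ == "__main__":
    # "/" and "/var/log" are guesses; "/var/spool/abrt" is from the journal.
    for path in ("/", "/var/log", "/var/spool/abrt"):
        if not os.path.exists(path):
            print(f"{path}: does not exist, skipped")
            continue
        report = free_space_report(path)
        state = "FULL" if is_full(report) else "ok"
        print(f"{path}: {report['free_bytes']} bytes free, "
              f"{report['free_inodes']} inodes free [{state}]")
```

Checking inodes as well as bytes matters because "No space left on device" is also reported when a filesystem has run out of inodes while blocks are still free.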