On Fri, Mar 11, 2011 at 10:06 AM, <m.roth at 5-cent.us> wrote:
> PJ wrote:
>> This may or may not be CentOS related, but am out of ideas at this point
>> and wanted to bounce this off the list.
>>
>> I'm running a CentOS 5.5 server, running the latest kernel
>> 2.6.18-194.32.1.el5.
>>
>> Almost every day around 3:30 AM the server completely locks up and has to
>> be power cycled before it will come back online.
>> (this means someone has to wake up and reboot the server, oh how I love
>> being an internet janitor! :)
>
> Please log off the Internet. We are cleaning it. You may log back on later.
>
> <snip>
>> I was able to pull this from /var/log/messages, this happens just
>> seconds before locking up completely...
>>
>> Mar 8 03:33:18 web1 kernel: INFO: task wget:13608 blocked for more than 120 seconds.
>> Mar 8 03:33:19 web1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Mar 8 03:33:19 web1 kernel: wget D ffff810001004420 0 13608 13607 (NOTLB)
>> Mar 8 03:33:19 web1 kernel: ffff81007bc7bc78 0000000000000086 ffff81007bc7bd88 ffff81000100d3f8
>> Mar 8 03:33:19 web1 kernel: ffff81007bc7bbf0 0000000000000007 ffff8100849db0c0 ffffffff80308b60
>> Mar 8 03:33:19 web1 kernel: 00013a2964cdf439 0000000000003237 ffff8100849db2a8 0000000064c82eae
>> Mar 8 03:33:19 web1 kernel: Call Trace:
>> Mar 8 03:33:20 web1 kernel: [<ffffffff80063c6f>] __mutex_lock_slowpath+0x60/0x9b
> <snip>
> Anyone else smell an OOM killer? But it's clearly whatever the wget's
> after that's killing the system.
>
> mark
>
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> http://lists.centos.org/mailman/listinfo/centos
>

What makes no sense to me is that this wget runs every 5 minutes all day,
but only around 3:30 AM does it lock up. There is nothing in the log that
suggests the kernel is having to kill processes because it is out of
resources: no "httpd invoked oom-killer" etc., which I have seen before in
other situations.

http://bugs.centos.org/view.php?id=4515 sounds like what I have going on,
but not with kjournald of course...

Thanks,

--
PJ
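
For anyone wanting to repeat the log check described above, here is a rough sketch in Python. It is not what was actually run on web1; the function name `scan_log` and the two regular expressions are my own, built only from the message formats quoted in this thread ("INFO: task ... blocked for more than N seconds" and "... invoked oom-killer"). Other kernels may word these messages differently.

```python
import re
from collections import Counter

# Patterns are assumptions derived from the log lines quoted in this thread.
PATTERNS = {
    "hung_task": re.compile(
        r"INFO: task (\S+?):(\d+) blocked for more than (\d+) seconds"
    ),
    "oom_killer": re.compile(r"(\S+) invoked oom-killer"),
}


def scan_log(lines):
    """Count hung-task and oom-killer messages in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Small in-memory sample so the sketch runs without touching a real log file.
sample = [
    "Mar 8 03:33:18 web1 kernel: INFO: task wget:13608 blocked for more than 120 seconds.",
    "Mar 8 03:33:19 web1 kernel: Call Trace:",
]
print(dict(scan_log(sample)))
```

To run it against the real log, pass an open file instead of the sample list, e.g. `scan_log(open("/var/log/messages", errors="replace"))`. A nonzero `hung_task` count with a zero `oom_killer` count would match what is described above: blocked tasks, but no sign of the OOM killer.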