[CentOS] ps locking up

Thu Oct 27 14:01:40 UTC 2011
Johnny Hughes <johnny at centos.org>

On 10/27/2011 07:58 AM, James Shupe wrote:
> I have a client running a CentOS 6.0 machine with cPanel. The machine is
> fully updated with both cPanel (RELEASE) and the OS.
> 
> At first, I noticed that after cPanel's dcpumon ran (even once),
> applications that depend on ps lock up and iowait jumps to around 50%.
> Load averages start out around 20 when this happens and slowly crawl up
> into the hundreds. Aside from not being able to run commands like ps and
> some nrpe scripts, everything still seems to respond fine even with the
> insanely high load. We've kept it online at a load of 400, with
> customers hitting it and no complaints, while waiting for a convenient
> time to reboot.
> 
> Clarification: If you run ps, it kills your terminal session. dcpumon,
> ps, etc. will hang around and you can see them under top (top doesn't
> seem to be affected). If you try to kill any of these (-9, anything),
> they do not respond. They're indefinitely blocked. They begin producing
> "processes being blocked for more than 120 seconds" errors in the logs.
> The server runs for days without issues between occurrences, and it
> always seems to be related to dcpumon.
> 
> I wrote a script that checks whether dcpumon exists in crontab and
> removes it. The script runs every minute. cPanel's automatic updates tend
> to put it back every once in a while, and it's possible that updates ran
> and that dcpumon ran before my script could remove it. I see that the
> script removed it last night (it logs removals) but don't know for sure
> whether dcpumon ran. It probably did.
> 
> It's currently running 2.6.32-71.29.1.el6.x86_64 and I am considering
> trying a vanilla kernel build to see if that corrects the issue. The
> hardware is an HP DL145G3, and we have several other (non-cPanel) servers
> that are identical, running CentOS 6.0 without issue.
> 
> Any ideas?
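
For anyone following the thread, the crontab check described above might
look roughly like the sketch below. It is not the poster's actual script.
It assumes Python 3.7 or newer, that the dcpumon entry lives in the crontab
of the user running it (probably root), and that matching on the substring
"dcpumon" is precise enough. The names strip_entries and main are made up
for illustration.

```python
import subprocess

TARGET = "dcpumon"  # substring to match; the only detail taken from the post


def strip_entries(crontab_text, target=TARGET):
    """Return (new_text, removed_lines), dropping every line that mentions target."""
    kept, removed = [], []
    for line in crontab_text.splitlines():
        (removed if target in line else kept).append(line)
    new_text = "\n".join(kept) + ("\n" if kept else "")
    return new_text, removed


def main():
    # `crontab -l` prints the invoking user's crontab (non-zero exit if none).
    listing = subprocess.run(["crontab", "-l"], capture_output=True, text=True)
    if listing.returncode != 0:
        return
    new_text, removed = strip_entries(listing.stdout)
    if removed:
        # `crontab -` installs a replacement crontab read from standard input.
        subprocess.run(["crontab", "-"], input=new_text, text=True, check=True)
        for line in removed:
            print("removed:", line)  # log each removal, as the post describes
```

Calling main() from a one-minute cron job would approximate the behaviour
described. Note that it still leaves a window of up to a minute in which a
re-added entry could fire.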

Since you are using cPanel, open up a trouble ticket with them and have
them take a look at it.

They are usually very responsive to problems like this and may have seen
this before.
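
If it helps when filing the ticket: processes that ignore kill -9 are
normally in uninterruptible sleep (state D), and a list of them can be
collected without running ps. The sketch below reads /proc/<pid>/stat
directly. It assumes those files stay readable while ps hangs, which top
continuing to work suggests but does not prove. The function names are
made up for illustration.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left])
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """Yield (pid, comm) for every process currently in state D."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the stat line was unreadable
        if state == "D":
            yield pid, comm
```

Running "for pid, comm in blocked_tasks(): print(pid, comm)" a few times
while the load is climbing would show whether dcpumon is the first task to
block or just one of many.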
