I'm running CentOS 4 on a 2-CPU machine. I have a Java process that creates a large number of threads, plus other native processes, all running under heavy load mixing 200+ audio streams. However, top shows none of these using any %CPU time. All of them read 0.0, except for top itself, which occasionally shows up with 0.7, and migration/N with 0.3. The load average of the machine varies between 1.5 and 4.0.
Any suggestions on where to look for the problem would be appreciated,
Warren
On Thu, 2005-06-02 at 00:42, Warren Harris wrote:
I'm running with CentOS4 on a 2 cpu machine. I'm running a java process which creates a large number of threads as well as other native processes running under heavy load mixing 200+ audio streams. However, top shows none of these having any %CPU time. All of them read 0.0, except for top itself which occasionally shows up with 0.7, and migration/N with 0.3. All other processes read 0.0. The load average of the machine varies between 1.5 and 4.0.
Any suggestions on where to look for the problem would be appreciated,
What problem? Your system should be spending most of its time waiting for I/O to complete, which doesn't show as CPU activity.
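A rough way to check this, sketched in Python: take two snapshots of /proc/stat and see what share of the elapsed CPU time was accounted as iowait. This assumes a 2.6 kernel, where the aggregate "cpu" line lists user, nice, system, idle, iowait, irq, softirq (and later steal) in that order. The function names are made up for illustration.

```python
import time


def cpu_fields(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return [int(x) for x in parts[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two /proc/stat snapshots."""
    b = cpu_fields(before)
    a = cpu_fields(after)
    # Only the first eight counters (user..steal) are summed; later guest
    # counters are already included in user time on newer kernels.
    deltas = [y - x for x, y in zip(b, a)][:8]
    total = sum(deltas)
    if total <= 0 or len(deltas) < 5:
        return 0.0  # no elapsed time, or a kernel without an iowait column
    return 100.0 * deltas[4] / total  # index 4 is iowait (assumed layout)


def sample(path="/proc/stat", interval=1.0):
    """Read /proc/stat twice, `interval` seconds apart, and report iowait %."""
    with open(path) as f:
        first = f.read()
    time.sleep(interval)
    with open(path) as f:
        second = f.read()
    return iowait_percent(first, second)
```

Calling sample() on the box under load would give a single iowait figure. A high value next to near-zero per-process %CPU would be consistent with the explanation above, since Linux also counts tasks blocked in uninterruptible I/O sleep toward the load average.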
On Thu, 2005-06-02 at 01:08 -0500, Les Mikesell wrote:
What problem? Your system should be spending most of its time waiting for I/O to complete, which doesn't show as CPU activity.
Exactly.
With 200+ audio streams, it's probably a safe bet that I/O is being slaughtered. Even "vmstat" won't show those, unless a lot of paging is going on due to insufficient memory. "iostat" and similar tools are what you are interested in.
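For a sense of what iostat-style figures look like, here is a minimal Python sketch that derives per-device throughput and busy time from two snapshots of /proc/diskstats. It assumes the 14-column whole-device layout of 2.6 kernels (sector counts in 512-byte units, time doing I/O in milliseconds) and skips shorter lines. The function names are illustrative, and this is not how iostat itself is implemented.

```python
SECTOR_BYTES = 512  # /proc/diskstats sector counts are in 512-byte units


def parse_diskstats(text):
    """Map device name -> selected counters from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 14:
            continue  # e.g. short partition lines on older 2.6 kernels
        stats[parts[2]] = {
            "sectors_read": int(parts[5]),
            "sectors_written": int(parts[9]),
            "io_ms": int(parts[12]),  # time the device had I/O in flight
        }
    return stats


def device_rates(before, after, seconds):
    """Per-device read/write KB/s and % busy between two snapshots."""
    b = parse_diskstats(before)
    a = parse_diskstats(after)
    rates = {}
    for name, now in a.items():
        if name not in b:
            continue  # device appeared between snapshots
        prev = b[name]
        read_bytes = (now["sectors_read"] - prev["sectors_read"]) * SECTOR_BYTES
        write_bytes = (now["sectors_written"] - prev["sectors_written"]) * SECTOR_BYTES
        busy_ms = now["io_ms"] - prev["io_ms"]
        rates[name] = {
            "read_kb_s": read_bytes / 1024.0 / seconds,
            "write_kb_s": write_bytes / 1024.0 / seconds,
            "util_pct": 100.0 * busy_ms / (seconds * 1000.0),
        }
    return rates
```

Feeding it the contents of /proc/diskstats read a few seconds apart gives numbers comparable in spirit to iostat's throughput and %util columns. A device sitting near 100% busy would point at the disks rather than the CPUs.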
In reality, I've been meaning to look into what I/O-interconnect monitoring/stat tools the Linux kernel is capable of. It's very, very difficult to gauge the amount of I/O Linux can push through systems that have partial-mesh interconnects (e.g., Opteron).
Some of the newer benchmarks of Linux and Solaris on Opteron aren't putting a good face on Linux and its handling of I/O. It's not really Linux's fault though. The PC platform has generally been an extremely poor platform for I/O for so long, with a single point of CPU-memory-I/O contention traditionally at the Intel Memory Controller Hub (MCH), aka the "Front Side Bottleneck," so it's really a matter of exposure.
Most RISC/UNIX platforms have it (and lots of tools for monitoring it). Lintel has traditionally not.