I'm running a server recently installed with CentOS 4.2.
It's running the kernel in the subject line, on a PIII 866 MHz with
512 MB RAM.
The system is basically running two processes:
1) ssh to a remote system, receiving a stream of bytes,
   piped into
2) gzip, which compresses the stream and writes it to a disk file
The system is relatively slow (866 MHz CPU) and the network is fast
(gigabit), so the limiting factor should be the CPU.
And it is.
However, at times the system gets into a 'weird' state where, instead of
using about 85% user / 15% system, it goes to 50% user and 50% system.
Now 50% system time for this load is ridiculous, and as I said before,
most of the time it is 85/15 and it only occasionally gets 'stuck' in
50/50.
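One rough way to catch the moment the split flips from 85/15 to 50/50
is to sample the aggregate "cpu" line of /proc/stat twice and compare
the counters. The Python below is only a sketch: the function names are
made up for illustration, and it assumes the usual Linux field order
(user, nice, system, idle, then any further fields such as iowait, irq
and softirq, which are lumped together as "other").

```python
import time


def read_cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line as a tuple of ints.

    Assumed field order: user, nice, system, idle, then any extra fields.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return tuple(int(x) for x in fields[1:])
    raise ValueError("no aggregate 'cpu' line found")


def cpu_split(before, after):
    """Percent of time in user/system/idle/other between two samples."""
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("counters did not advance between samples")
    user = deltas[0] + deltas[1]      # user + nice
    system = deltas[2]
    idle = deltas[3]
    other = total - user - system - idle
    return {
        "user": 100.0 * user / total,
        "system": 100.0 * system / total,
        "idle": 100.0 * idle / total,
        "other": 100.0 * other / total,
    }


def sample_split(interval=5.0, path="/proc/stat"):
    """Take two live samples `interval` seconds apart (Linux only)."""
    with open(path) as f:
        first = read_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = read_cpu_times(f.read())
    return cpu_split(first, second)
```

Calling sample_split() in a loop and logging a timestamp whenever
"system" climbs well above its usual level would at least show when the
odd state starts and how long it lasts.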
So I got oprofile running to find out what part of the kernel it is
stuck in, and here is the output of opreport -l vmlinux (top scores
only):
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        symbol name
21117    36.5340  default_idle
3850      6.6608  __copy_from_user_ll
3108      5.3771  find_get_page
1707      2.9532  __copy_user_intel
1580      2.7335  __might_sleep
1356      2.3460  handle_IRQ_event
1323      2.2889  __copy_to_user_ll
1168      2.0207  __find_get_block_slow
1134      1.9619  __find_get_block
1093      1.8910  finish_task_switch
963       1.6661  bh_lru_install
638       1.1038  __wake_up
605       1.0467  __do_softirq
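To see how the non-idle kernel samples are distributed, the listing can
be re-expressed with default_idle taken out. The Python below is a rough
sketch, not part of oprofile: the function names are invented, the total
is only estimated from a row's sample count and percentage, and it
assumes each data row has exactly three whitespace-separated columns
(samples, percent, symbol) as in the output above.

```python
def parse_opreport(text):
    """Parse 'samples percent symbol' rows; skip headers and other lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        try:
            rows.append((int(parts[0]), float(parts[1]), parts[2]))
        except ValueError:
            continue  # not a data row
    return rows


def estimate_total(rows):
    """Estimate the total sample count from the largest row's percentage."""
    samples, percent, _ = max(rows, key=lambda r: r[0])
    return samples / (percent / 100.0)


def non_idle_share(rows, idle_symbol="default_idle"):
    """Share of each symbol among samples not attributed to idle_symbol."""
    total = estimate_total(rows)
    idle = sum(s for s, _, name in rows if name == idle_symbol)
    busy = total - idle
    return {
        name: 100.0 * s / busy
        for s, _, name in rows
        if name != idle_symbol
    }
```

Feeding it the saved opreport text and printing the result sorted by
share gives a picture of where the kernel spends its non-idle samples,
which may be easier to compare between the 85/15 and 50/50 states.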
This is CRAZY! How can default_idle be sucking away cycles when the
system is (or should be) CPU-bound?
Can anyone explain this?
David