[CentOS] Weird performance problem

Fri Apr 17 15:10:00 UTC 2009
JohnS <jses27 at gmail.com>

On Thu, 2009-04-16 at 09:12 -0400, Ugo Bellavance wrote:
> Hi,
> 
> I'm running a CentOS 4.  server and I sometimes face a weird problem. 
> It is a weird performance problem, and here is how I discovered it.
> 
> This server runs OpenVZ virtual machines, and one of them is an asterisk 
> server for my personal use.  The first symptom of the problem is that 
> the voice quality became flaky.  So I logged on the server to see what 
> could be eating cpu cycles, when I ran top, it took almost one minute 
> before top actually showed.  Another hint is that when I run dstat (a 
> monitoring utility that is a mix of iostat and vmstat and other stats), 
> I often get a "missed xx ticks", where xx is a number.
> 
> Example (current) (sorry for the wrap):
> 
> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
> usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
>    3   2  93   2   0   0| 106k  273k|   0     0 | 0.2   0.4 |1039   389
>    3   6  91   0   0   0|   0  6416k| 276k  275k|   0     0 |2160  6822 
>   missed 55 ticks
>    4  10  84   2   0   0|1200k 1992k|  82k   93k|   0     0 |1188  6275 
>   missed 29 ticks
>    1   0  99   0   0   0|   0  1312k|  65k   66k|   0     0 |1050  1114 
>   missed 38 ticks
>    2   1  96   0   0   0|   0  1168k|  57k   59k|   0     0 | 491   877 
>   missed 13 ticks
>    3   1  94   1   0   0|   0  6016k| 181k  176k|   0     0 |2169  5996 
>   missed 50 ticks
>    4   2  91   1   0   0|  28k 8744k| 216k  214k|   0     0 |2159  5438 
>   missed 37 ticks
>    1   1  98   0   0   0|   0  2632k|  93k   91k|   0     0 | 983  1381 
>   missed 34 ticks
>    1   1  98   1   0   0|   0  5624k| 113k  110k|   0     0 |1569  2643 
>   missed 52 ticks
>    1   1  98   1   0   0|   0  2432k|  29k   28k|   0     0 | 679   647 
>   missed 12 ticks
>    0   0 100   0   0   0|   0     0 |  60B  374B|   0     0 |  13    15
>    2   3  94   0   0   0|   0  1872k| 209k  210k|   0     0 |1375  3590 
>   missed 30 ticks
> 
> 
> 
> The problem is currently occuring, but it doesn't seem to be affecting 
> voice quality for now, so I have some time to try to find the cause. 
> The only solution I've found up to now is to reboot... But hey, this 
> isn't a Windows 98 machine :)!
> 
> I tried restarting the VZ system, which restarts all the VMs, but it 
> didn't solve the problem.  I can't tell if the problem occurs on a stock 
> centos kernel, because the server is running production (but 
> non-critical) virtual machines, so it is always running the openVZ kernel.
> 
> So here is what I've done for now:
> 
> - Top shows a load of about 0.4
> 
> - vmstat 1 10 shows this:
> 
> procs -----------memory---------- ---swap-- -----io---- --system-- 
> ----cpu----
>   r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us 
> sy id wa
>   0  0    592 191092 381720 537956    0    0    53    68    4     3  3 
> 2 93  2
>   0  0    592 190720 381720 537956    0    0     0     0   32    60  1 
> 1 98  0
>   0  0    592 191092 381720 537956    0    0     0     0   41    59  0 
> 0 100  0
>   1  0    592 191092 381728 537948    0    0     0  2584  311    96 10 
> 4 66 19
>   0  0    592 189968 381732 537944    0    0     0  2080  222   174  2 
> 3 79 16
>   0  1    592 189968 381732 537944    0    0     0  3244  170    73 10 
> 4 73 12
>   0  0    592 190216 381732 537944    0    0     0   136   76   113  1 
> 2 93  4
>   0  0    592 189844 381732 537944    0    0     0     0   33    69  1 
> 1 98  0
>   0  0    592 189844 381732 537944    0    0     0     0   24    32  0 
> 0 100  0
>   0  0    592 190340 381732 537944    0    0     0     0   28    42  0 
> 0 100  0
> 
> iostat -x 1 (excerpt)
> 
> Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s 
> avgrq-sz avgqu-sz   await  svctm  %util
> sda          0.00 171.00  0.00 124.00    0.00 2368.00     0.00  1184.00 
>     19.10     0.14    1.13   0.02   0.20
> sdb          0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00 
>     0.00     0.00    0.00   0.00   0.00
> sdc          0.00 171.00  0.00 124.00    0.00 2368.00     0.00  1184.00 
>     19.10     0.17    1.35   0.02   0.30
> sdd          0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00 
>     0.00     0.00    0.00   0.00   0.00
> md0          0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00 
>     0.00     0.00    0.00   0.00   0.00
> md2          0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00 
>     0.00     0.00    0.00   0.00   0.00
> md1          0.00   0.00  0.00 294.00    0.00 2352.00     0.00  1176.00 
>      8.00     0.00    0.00   0.00   0.00
> dm-0         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00 
>     0.00     0.00    0.00   0.00   0.00
> dm-1         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00 
>     0.00     0.00    0.00   0.00   0.00
> dm-2         0.00   0.00  0.00 294.00    0.00 2352.00     0.00  1176.00 
>      8.00     0.30    1.01   0.02   0.50
> dm-3         0.00   0.00  0.00 294.00    0.00 2352.00     0.00  1176.00 
>      8.00     0.30    1.01   0.02   0.50
> dm-4         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00 
>     0.00     0.00    0.00   0.00   0.00
> dm-5         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00 
>     0.00     0.00    0.00   0.00   0.00
> dm-6         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00 
>     0.00     0.00    0.00   0.00   0.00
> 
> Any insight would be greatly appreciated.  It is not critical, but I'd 
> be glad to be able to finally pinpoint and solve the problem.
> 
> Hardware: HP Netserver, software raid, SCSI disks, 1.7 GB RAM.
> 
> I can provide more information if needed.
> 
> Thanks,
> 
> Ugo
-----
That's a known problem with the Kernel and VM Kernel. You need the fixed
kernel.