[CentOS] Strange performance issues under CentOS 5.1

Wed Feb 13 02:57:21 UTC 2008
William L. Maltby <CentOS4Bill at triad.rr.com>

On Tue, 2008-02-12 at 21:15 -0500, Alfred von Campe wrote:
> I am still running CentOS 4.6 on our production systems, but I am  
> starting to plan the upgrade to CentOS 5.1.  I have one test system  
> running 5.1 that is the exact same hardware configuration as my 4.6  
> test system.  One of our builds runs about 6 times slower on the 5.1  
> system, even though it uses less overall CPU time.  I first suspected  
> something wrong with the disk, but the results from bonnie++ show  
> that the 5.1 system is slightly faster:
> 
>    Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
>                        -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>    Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
>    centos4.6       16G           35933  10 21301   5           46507   6  41.8   0
> 
> 
>    Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
>                        -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>    Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
>    centos5.1       16G           42015  14 21179   5           49863   4  91.6   0
> 
> Then I ran the build with "/usr/bin/time --verbose", and here are the  
> results (first 4.6 then 5.1):
> 
>          Command being timed: "make"
>          User time (seconds): 32.15
>          System time (seconds): 3.52
>          Percent of CPU this job got: 99%
>          Elapsed (wall clock) time (h:mm:ss or m:ss): 0:35.88
> 
>          Command being timed: "make"
>          User time (seconds): 22.05
>          System time (seconds): 3.11
>          Percent of CPU this job got: 11%
>          Elapsed (wall clock) time (h:mm:ss or m:ss): 3:31.35
> 
> As you can see from the above, there is a lot of idle time on the 5.1  
> system.  Finally, I ran the build with "strace -c", and here are the  
> top ten lines of that output (again, 4.6 first and then 5.1):
> 
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>   53.81   16.804147       54916       306        58 waitpid
>   34.75   10.853461       82851       131           wait4
>    5.29    1.650844           9    177706    154581 open
>    1.61    0.503701          15     34408           read
>    0.91    0.283706          15     18607           write
>    0.60    0.185894          12     14919     10364 stat64
>    0.52    0.163340          10     16495      9079 access
>    0.47    0.146933           7     20581           mmap2
> 
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>   60.07   15.173924       52687       288        58 waitpid
>   38.50    9.724412       83831       116           wait4
>    0.54    0.135194           7     19199     10705 access
>    0.36    0.090850          54      1681      1334 execve
>    0.27    0.067686           5     14423     10570 stat64
>    0.11    0.027676           1     24832           read
>    0.09    0.022339           0    155810    135765 open
>    0.03    0.007617         159        48           unlink
> 
> Any suggestions as to what could possibly be causing this?  I am  
> fresh out of other ideas to try.

Check the BIOS settings? Are the memory timings (CAS etc.) the same? Is
the disk hardware the same and configured identically?

Presuming that nothing is found there, install the system accounting
packages and run some SAR reports. You may see a clue in them.
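
For a quick look before the accounting packages are installed, the
sketch below takes two snapshots of /proc/stat and reports how CPU time
was split over the interval, which is roughly the breakdown "sar -u"
prints. It is an illustration, not part of sysstat: the function names
are made up here, it needs Python 3 (newer than the stock Python on
these releases), and it assumes the aggregate "cpu" line lists user,
nice, system, idle, iowait, irq, softirq and (on newer kernels) steal in
that order.

```python
import time

# Assumed field order of the aggregate "cpu" line in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Map the counters on a 'cpu ...' line to their field names."""
    values = [int(v) for v in line.split()[1:1 + len(FIELDS)]]
    # zip() stops early on older kernels that report fewer counters.
    return dict(zip(FIELDS, values))


def cpu_percentages(before, after):
    """Percentage of the interval spent in each state between two snapshots."""
    deltas = {name: after[name] - before[name] for name in after if name in before}
    total = sum(deltas.values()) or 1  # avoid dividing by zero
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat, the all-CPU summary."""
    with open(path) as stat:
        return stat.readline()


def sample(interval=5):
    """Take two snapshots `interval` seconds apart and return the percentages."""
    first = parse_cpu_line(read_cpu_line())
    time.sleep(interval)
    second = parse_cpu_line(read_cpu_line())
    return cpu_percentages(first, second)
```

Calling sample(5) in a second terminal while the build runs gives one
reading. A large "iowait" share with a small "user" share would point at
storage or a network filesystem rather than at the compiler.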

Any "tweaks" on the old system you forgot to apply on the new? Elevator,
buffer flush interval changes, etc?
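
One way to compare those settings is to dump them on both machines and
diff the output. The sketch below is a hedged example in Python 3 with
made-up helper names. It assumes the kernel exposes the active elevator
in /sys/block/<dev>/queue/scheduler with the current choice in square
brackets, falls back to an elevator= boot parameter in /proc/cmdline,
and treats /proc/sys/vm/dirty_* as the flush-related tunables. Some of
these files may be missing on an older kernel, in which case the
corresponding section simply prints nothing.

```python
import glob
import re


def active_scheduler(text):
    """Return the bracketed name from a line like 'noop deadline [cfq]'."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def elevator_from_cmdline(cmdline):
    """Return the value of an elevator= boot parameter, if one is present."""
    for token in cmdline.split():
        if token.startswith("elevator="):
            return token.split("=", 1)[1]
    return None


def read_text(path):
    """Read a small /proc or /sys file, returning '' if it is unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return ""


def collect_settings():
    """Gather elevator and dirty-page settings into one flat dictionary."""
    settings = {}
    for path in sorted(glob.glob("/sys/block/*/queue/scheduler")):
        device = path.split("/")[3]
        settings["elevator." + device] = active_scheduler(read_text(path))
    settings["elevator.cmdline"] = elevator_from_cmdline(read_text("/proc/cmdline"))
    for path in sorted(glob.glob("/proc/sys/vm/dirty_*")):
        settings["vm." + path.rsplit("/", 1)[1]] = read_text(path)
    return settings


if __name__ == "__main__":
    for name, value in collect_settings().items():
        print("%s = %s" % (name, value))
```

Run it on the 4.6 and the 5.1 box, save each output to a file, and diff
the two files.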

Any other noticeable things on there that may cause it? Assume for a
moment that the slowdown is caused by a process that you are not looking
at: the build "hangs" while some other process is waiting or tying up
the CPU. Try running top while the build runs.
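
The /usr/bin/time figures quoted above already put a number on the
waiting. As a rough check, assume GNU time derives "Percent of CPU" as
(user + system) / elapsed and truncates to a whole number, which
reproduces both quoted percentages. The arithmetic in Python 3 then
looks like this; the helper names are illustrative only:

```python
def parse_elapsed(text):
    """Convert an 'h:mm:ss' or 'm:ss.ff' elapsed string to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def cpu_summary(user, system, elapsed):
    """Return busy seconds, idle seconds and truncated CPU percentage."""
    wall = parse_elapsed(elapsed)
    busy = user + system
    return {"busy": busy, "idle": wall - busy, "cpu_pct": int(100 * busy / wall)}


# Figures copied from the two "make" runs quoted in this thread.
runs = {
    "centos4.6": cpu_summary(32.15, 3.52, "0:35.88"),
    "centos5.1": cpu_summary(22.05, 3.11, "3:31.35"),
}

for name, run in runs.items():
    print("%s: busy %.2fs, idle %.2fs, cpu %d%%"
          % (name, run["busy"], run["idle"], run["cpu_pct"]))
```

On the 5.1 run that leaves about 186 of the 211 wall-clock seconds in
which the build used no CPU at all, against roughly 0.2 seconds on 4.6.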

I notice an execve shows on the new one that is not in the old. One says
"hmmm....".
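
Comparing the two "strace -c" listings by eye is error-prone, so here is
a small Python 3 sketch that parses both and reports which syscalls
appear in only one of them. The parser is a made-up helper, not an
strace feature, and it assumes the usual summary layout with the syscall
name in the last column. Note the caveat: both listings in this thread
are only the top of the output, so a syscall missing from one list may
simply have fallen below the cut.

```python
def parse_strace_summary(text):
    """Return {syscall: (seconds, calls)} from 'strace -c' summary rows."""
    rows = {}
    for line in text.splitlines():
        fields = line.lstrip("> ").split()
        # Data rows have 5 fields, or 6 when the errors column is filled.
        if len(fields) not in (5, 6) or fields[-1] == "total":
            continue
        try:
            seconds, calls = float(fields[1]), int(fields[3])
        except ValueError:
            continue  # separator line, not data
        rows[fields[-1]] = (seconds, calls)
    return rows


# Rows copied from the post: CentOS 4.6 first, then CentOS 5.1.
OLD = """
  53.81   16.804147       54916       306        58 waitpid
  34.75   10.853461       82851       131           wait4
   5.29    1.650844           9    177706    154581 open
   1.61    0.503701          15     34408           read
   0.91    0.283706          15     18607           write
   0.60    0.185894          12     14919     10364 stat64
   0.52    0.163340          10     16495      9079 access
   0.47    0.146933           7     20581           mmap2
"""
NEW = """
  60.07   15.173924       52687       288        58 waitpid
  38.50    9.724412       83831       116           wait4
   0.54    0.135194           7     19199     10705 access
   0.36    0.090850          54      1681      1334 execve
   0.27    0.067686           5     14423     10570 stat64
   0.11    0.027676           1     24832           read
   0.09    0.022339           0    155810    135765 open
   0.03    0.007617         159        48           unlink
"""

old, new = parse_strace_summary(OLD), parse_strace_summary(NEW)
only_new = sorted(set(new) - set(old))
only_old = sorted(set(old) - set(new))
print("only in the 5.1 listing:", only_new)
print("only in the 4.6 listing:", only_old)
```

Feeding it the complete summaries from both machines, rather than the
truncated ones, would show whether execve is really absent on 4.6.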

What does swapon -s show?

Is the system "seeing" the same amount of memory "available" on both, or
have BIOS settings on one of them reduced what is available?
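
To put the memory and swap questions side by side, the Python 3 sketch
below reads /proc/meminfo and /proc/swaps (the same table that
"swapon -s" prints) and reduces them to a few numbers that can be
compared between the two machines. The function names are invented for
this example, and it assumes the usual "Key: value kB" layout of
/proc/meminfo and the five-column layout of /proc/swaps.

```python
import os


def parse_meminfo(text):
    """Return {key: value} for the numeric entries of /proc/meminfo (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def parse_swaps(text):
    """Return one dict per swap area listed in /proc/swaps."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 5:
            entries.append({
                "device": fields[0],
                "size_kb": int(fields[2]),
                "used_kb": int(fields[3]),
                "priority": int(fields[4]),
            })
    return entries


def read_if_present(path):
    """Return the file's contents, or '' when it does not exist."""
    if not os.path.exists(path):
        return ""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    meminfo = parse_meminfo(read_if_present("/proc/meminfo"))
    print("MemTotal kB:", meminfo.get("MemTotal"))
    for area in parse_swaps(read_if_present("/proc/swaps")):
        print("swap %(device)s: %(used_kb)d of %(size_kb)d kB used" % area)
```

A MemTotal that differs between two supposedly identical boxes, or swap
in use on only one of them, would be worth chasing before anything else.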

If the new one is all new equipment, open it up and reseat all
connections, PCI cards and memory sticks. Make sure all power connectors
are well seated on the motherboard and drives.

Front side bus and memory speeds set the same in BIOS?

That's all I can think of that may be even remotely related ATM.

> 
> Alfred
> <snip sig stuff>

HTH
-- 
Bill