[CentOS] how can I stress a server?

Sun Nov 30 21:23:42 UTC 2008
Rudi Ahlers <rudiahlers at gmail.com>

On Fri, Nov 21, 2008 at 7:28 PM, William L. Maltby
<CentOS4Bill at triad.rr.com> wrote:
>
> On Fri, 2008-11-21 at 18:38 +0200, Rudi Ahlers wrote:
>> <snip>
>
>> I'm sitting with a very expensive paper weight right now, and I don't
>> know what to do. The same websites are running very well on a machine
>> with a Gigabyte G31MX-S motherboard + 4GB DDRII 800 RAM + C2D 6750
>> CPU. This is what baffles me, how can the same load on a slower
>> machine work fine, but on the faster one not?
>
> Having watched all this thread, I note that certain things are not
> mentioned. Assuming that you followed all the previous suggestions, I'll
> add my own that is based on practical experience some years back, and
> one recent experience.
>
> Like you, I always built my own. Since you have no way to check the PS,
> try removing all components you can and see if that helps. _Usually_ a
> weak PS will show symptoms on boot, since all things are spinning up
> and doing max current draw, but sometimes not. Some BIOS have settings
> that allow or automatically "spin up" in a stepped sequence. This would
> not stress the PS as much. Keep in mind that PS's have different
> amperage draw capabilities for different rails. A seemingly "sufficient"
> PS in terms of wattage may be weak on one or more of the rails. Specs
> for the mobo and PS might indicate a problem.

I also thought the problem was related to the power supply, but I
don't have a spare one of these at the moment. I did, however,
swap out the PSU for a standard 350W PSU, and the symptoms were the
same, so it's not PSU related in this case. This also reminds me that
I should get a spare PSU ASAP :)
>
> Have you checked the voltage settings in the BIOS for the CPU and
> memory? Many/most these days automatically detect, but...

I normally leave those on automatic, since I don't like running
components outside the suppliers' specs.

>
> Check the spec sheets for the CPU and memory sticks.
>
> I recently upgraded a mobo memory and it would not boot or run reliably.
> The spec for the memory was not available and I left the settings as
> with the previous memory. Not wanting to fry the sticks and possibly
> void the warranty, I picked up the whole thing and carried it back to my
> local supplier. I explained the symptoms and told him I suspected memory
> voltage but didn't want to try/fry the sticks and risk the warranty.
>
> Hmmm... he said. Well, long story short, he eventually kicked up the
> voltage (I guess the "auto" in the BIOS was flaky or something) and all
> worked. Required +.2 volts. Most memory sticks can be run at slightly
> higher (+.1, +.2) volts without harm. Larger memory may require a slight
> increase in voltage. I guess the "automatic" settings can't always be
> trusted.
>
> Running about 6 months now, NPs.
>
> Another thing about pulling all components you can: if there is some
> kind of IRQ conflict, this can (used to?) cause slowdowns. Maybe that
> will be shown there. But that should also leave some traces in
> the /var/log/messages or dmesg log.

There are no add-in cards, nor a CD-ROM / DVD-ROM, only the on-board
devices & the HDDs. Taking the HDDs out doesn't help much, since the
problems only occur when there's a bit of load on the system.
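
For anyone wanting to reproduce "a bit of load" on demand, here is a
minimal sketch in Python, assuming plain CPU load is enough to trigger
the symptoms (disk and network load are not covered). The names burn()
and stress() are made up for this example and do not come from any
existing tool.

```python
import multiprocessing
import time


def burn(seconds):
    """Spin in a busy loop for roughly `seconds`; return the loop count."""
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        count += 1
    return count


def stress(workers, seconds):
    """Run one busy loop per worker process; return each worker's loop count."""
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(burn, [seconds] * workers)
```

Calling stress(multiprocessing.cpu_count(), 60) from a script (under an
`if __name__ == "__main__":` guard) should keep every core busy for
about a minute. A loop count far below what a known-good machine
reports for the same duration would point at the same kind of slowdown
described in this thread.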

>
> Let's presume that the "obvious" problem is not the problem. What if it
> is not hardware directly?
>
> Examine your /var/log/dmesg carefully for any "suspect" messages. I've
> also found that occasionally drivers selected by the system may not be
> exactly correct. Check the specs for mobo and add-in cards and see if it
> looks like the best drivers for the chip sets are loaded (lsmod and
> modinfo help here).

/var/log/messages didn't show anything related to the problem, at all.
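
A quick way to skim those logs is sketched below in Python. The keyword
list is purely an assumption for illustration; what counts as "suspect"
depends on the hardware and drivers involved.

```python
import re

# Hypothetical keyword list; adjust it to whatever is "suspect" on your box.
SUSPECT = re.compile(r"error|fail|timeout|irq|machine check", re.IGNORECASE)


def suspect_lines(text):
    """Return the lines of `text` that match any of the SUSPECT keywords."""
    return [line for line in text.splitlines() if SUSPECT.search(line)]
```

Something like suspect_lines(open("/var/log/messages").read()) would
then list only the matching lines. An empty result only means none of
the chosen keywords appeared, not that the hardware is healthy.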

>
> Grab any old performance/diagnostics software (maybe some on this list
> have current knowledge - I don't) and run it. Compare to published data
> for same or similar systems.
>
> Enable sar on the system, run the reports and see where the slowdowns
> are.
>
> I haven't used multi-core yet, but I would first check to see if all the
> cores are being effectively used. Maybe top will help here? Not sure.
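
On the per-core question, a rough sketch in Python, assuming a Linux
/proc/stat with the usual "cpuN user nice system idle iowait ..." lines.
It compares two snapshots of that file taken a few seconds apart. The
function names are made up for this example.

```python
def parse_stat(text):
    """Map 'cpuN' -> (idle_ticks, total_ticks) from /proc/stat contents."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 ..."; skip the aggregate "cpu" line.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            ticks = [int(v) for v in fields[1:9]]  # user .. steal, if present
            idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)  # idle + iowait
            cores[fields[0]] = (idle, sum(ticks))
    return cores


def busy_fraction(before, after):
    """Fraction of time each core was busy between two /proc/stat snapshots."""
    a, b = parse_stat(before), parse_stat(after)
    result = {}
    for name in a:
        if name in b:
            d_idle = b[name][0] - a[name][0]
            d_total = b[name][1] - a[name][1]
            result[name] = 1.0 - d_idle / d_total if d_total > 0 else 0.0
    return result
```

Read /proc/stat once, wait a few seconds under load, read it again, and
pass both strings to busy_fraction(). If one core sits near 1.0 while
the others stay near 0.0, the load is not being spread across cores.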
>
> BIOS: some have oddball (not really, but legacy issues abound) settings
> that may limit amount of memory seen/used? Keep an eye out for those.
> Memory timings may not be properly detected and set. Check the specs for
> the memory and see if the BIOS has them properly set. BTW, _some_ memory
> and mobo combos will allow faster settings, but be careful. I haven't
> dinked with them for a long time, so I can't make any Q & A suggestions.
>
> Have you upgraded to the latest BIOS on the system? Most retail mobos
> come with an early BIOS version that has... "issues". Check the
> manufacturer's web site and see if there is a later BIOS.

No, I don't like BIOS upgrades unless absolutely necessary.

>
> OTHER: Of course, you have manually "re-seated" all connections, yes? A
> slightly loose cable, add-in card or memory not fully seated can do
> things such as you describe.

Other than RAM, the only other cables to reseat are the power cables &
SATA cables :)


>
> Visually inspect cables for "micro-fractures". Better, if you have
> access to meters, check for excessive resistance or opens. If not, try
> changing out cables. You might want to look in this area only if SAR
> reports show slow disk activity. Also hdparm might give some
> information. Maybe some settings there would help too.
>
> That's all I can think of ATM. I hope something of use here.
>
> --
> Bill
>

But, I did make an interesting discovery, when I tried to install a
fresh copy of CentOS on a new HDD. The installation itself didn't
succeed. Every time I had to choose an option, on any screen, during
installation, all the fans would spin up to their max speed &
everything would be really slow. It's almost like trying to install
CentOS on a 486 computer. Yet, none of the heatsinks felt warm, not
even as warm as the hard drives. So, I came to the conclusion that the
motherboard is faulty. Right now, I only have a spare Gigabyte
motherboard handy, which when I used it didn't give me any problems
whatsoever. I'm using the same 1U chassis with limited air flow and
small fans, and it runs as smooth as it should.

I have since swapped out the motherboard with the supplier, and the
new motherboard seems to run very well. Installation took about
20 minutes to complete.


Thanx for all your help :)

-- 

Kind Regards
Rudi Ahlers