On Fri, Nov 21, 2008 at 7:28 PM, William L. Maltby CentOS4Bill@triad.rr.com wrote:
On Fri, 2008-11-21 at 18:38 +0200, Rudi Ahlers wrote:
<snip>
I'm sitting with a very expensive paper weight right now, and I don't know what to do. The same websites are running very well on a machine with a Gigabyte G31MX-S motherboard + 4GB DDRII 800 RAM + C2D 6750 CPU. This is what baffles me: how can the same load on a slower machine work fine, but on the faster one not?
Having watched all this thread, I note that certain things are not mentioned. Assuming that you followed all the previous suggestions, I'll add my own that is based on practical experience some years back, and one recent experience.
Like you, I always built my own. Since you have no way to check the PS, try removing all components you can and see if that helps. _Usually_ a weak PS will show symptoms on boot, since all things are spinning up and doing max current draw, but sometimes not. Some BIOS have settings that allow or automatically "spin up" in a stepped sequence. This would not stress the PS as much. Keep in mind that PS's have different amperage draw capabilities for different rails. A seemingly "sufficient" PS in terms of wattage may be weak on one or more of the rails. Specs for the mobo and PS might indicate a problem.
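To illustrate the point about rails: a minimal sketch, assuming you have the per-rail amperage limits from the PS label and an estimate of the draw on each rail. All figures below are made up for illustration, and the helper name rail_headroom is hypothetical. It shows how a PS can have enough total wattage and still be short on one rail.

```python
# Nominal voltage of each rail, used only to convert amps to watts.
RAIL_VOLTS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}

def rail_headroom(limit_amps, load_amps):
    """Return spare amperage per rail; a negative value means the rail is overloaded."""
    return {rail: limit_amps[rail] - load_amps.get(rail, 0.0) for rail in limit_amps}

def total_watts(amps):
    """Sum volts * amps over all rails."""
    return sum(RAIL_VOLTS[rail] * a for rail, a in amps.items())

# Hypothetical figures, not taken from any real PS or mobo spec sheet.
psu_limits = {"+3.3V": 20.0, "+5V": 20.0, "+12V": 18.0}
system_load = {"+3.3V": 6.0, "+5V": 8.0, "+12V": 21.0}

headroom = rail_headroom(psu_limits, system_load)
overloaded = sorted(rail for rail, spare in headroom.items() if spare < 0)

print("capacity (W):", round(total_watts(psu_limits), 1))
print("load (W):    ", round(total_watts(system_load), 1))
print("overloaded rails:", overloaded)
```

With these invented numbers the total load is below the total capacity, yet the +12V rail is 3 A short. Real supplies often also have combined limits across rails, which this sketch ignores.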
I also thought the problem was related to the power supply, but I don't have a spare one of these at the moment. I did, however, swap out the PSU with a standard 350W PSU, and the symptoms were the same, so it's not PSU related in this case. This also reminds me that I should get a spare PSU ASAP :)
Have you checked the voltage settings in the BIOS for the CPU and memory? Many/most these days automatically detect, but...
I normally leave those on automatic, since I don't like running components outside the suppliers' specs.
Check the spec sheets for the CPU and memory sticks.
I recently upgraded a mobo's memory and it would not boot or run reliably. The spec for the memory was not available and I left the settings as with the previous memory. Not wanting to fry the sticks and possibly void the warranty, I picked up the whole thing and carried it back to my local supplier. I explained the symptoms and told him I suspected memory voltage but didn't want to try/fry the sticks and risk the warranty.
Hmmm... he said. Well, long story short, he eventually kicked up the voltage (I guess the "auto" in the BIOS was flaky or something) and all worked. Required +.2 volts. Most memory sticks can be run at slightly higher (+.1, +.2) volts without harm. Larger memory may require a slight increase in voltage. I guess the "automatic" settings can't always be trusted.
Running about 6 months now, NPs.
Another thing about pulling all components you can: if there is some kind of IRQ conflict, this can (used to?) cause slowdowns. Maybe that will be shown there. But that should also leave some traces in the /var/log/messages or dmesg log.
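One way to look for shared interrupt lines is to read /proc/interrupts, where devices sharing an IRQ appear comma-separated on the same line. A rough sketch, assuming that layout (the exact columns vary between kernel versions); the function name shared_irqs and the sample text are invented for illustration. Note that a shared IRQ is not by itself a conflict, only a place to look.

```python
def shared_irqs(interrupts_text):
    """Map IRQ number -> device names for IRQs that list more than one device."""
    lines = interrupts_text.strip("\n").splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():           # skip NMI, LOC and similar summary rows
            continue
        # Drop the per-CPU counters; what remains is controller type + devices.
        tail = " ".join(rest.split()[ncpu:])
        if "," not in tail:
            continue
        names = [part.strip() for part in tail.split(",")]
        names[0] = names[0].split()[-1]   # strip the controller type from the first name
        shared[int(label)] = names
    return shared

# Invented sample in the general shape of /proc/interrupts.
SAMPLE = """\
           CPU0       CPU1
  0:        123          0   IO-APIC-edge      timer
 16:       1000        200   IO-APIC-fasteoi   uhci_hcd:usb1, eth0
 19:         50          7   IO-APIC-fasteoi   ata_piix
NMI:          0          0   Non-maskable interrupts
"""

result = shared_irqs(SAMPLE)
print(result)
```

On a live system the input would come from open("/proc/interrupts").read() instead of the sample string.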
There are no add-in cards, nor a CD-ROM / DVD-ROM, only the on-board devices & the HDDs. Taking the HDDs out doesn't help much, since the problems only occur when there's a bit of load on the system.
Let's presume that the "obvious" problem is not the problem. What if it is not hardware directly?
Examine your /var/log/dmesg carefully for any "suspect" messages. I've also found that occasionally drivers selected by the system may not be exactly correct. Check the specs for mobo and add-in cards and see if it looks like the best drivers for the chip sets are loaded (lsmod and modinfo help here).
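Reading a long dmesg by eye is tedious, so a keyword filter can narrow it down first. This is only a sketch: the keyword list is my own guess at what counts as "suspect", the function name suspect_lines is hypothetical, and the sample lines are invented rather than taken from any real log.

```python
import re

# Keywords chosen for illustration; adjust to taste.
SUSPECT = re.compile(r"error|fail|timeout|timed out|reset|lost interrupt|exception",
                     re.IGNORECASE)

def suspect_lines(log_text):
    """Return the log lines that contain any of the suspect keywords."""
    return [line for line in log_text.splitlines() if SUSPECT.search(line)]

# Invented sample lines, loosely in the style of kernel messages.
SAMPLE_LOG = """\
ata1: SATA link up 3.0 Gbps
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata1: hard resetting link
eth0: link up, 1000Mbps
hda: lost interrupt
"""

hits = suspect_lines(SAMPLE_LOG)
for line in hits:
    print(line)
```

On a real machine the text could be read from /var/log/dmesg or /var/log/messages. A match is only a hint; plenty of harmless messages contain words like "reset".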
/var/log/messages didn't show anything related to the problem, at all.
Grab any old performance/diagnostics software (maybe some on this list have current knowledge - I don't) and run it. Compare to published data for same or similar systems.
Enable sar on the system, run the reports and see where the slowdowns are.
I haven't used multi-core yet, but I would first check to see if all the cores are being effectively used. Maybe top will help here? Not sure.
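Besides top, per-core load can be worked out from two snapshots of /proc/stat, which has one "cpuN" line per core with cumulative time counters. A minimal sketch, assuming the usual field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names and snapshot numbers are invented for illustration.

```python
def cpu_times(stat_text):
    """Map 'cpuN' -> (busy, total) jiffies from /proc/stat-style text."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep per-core rows only; skip the aggregate 'cpu' row and everything else.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:9]]   # user .. steal
        idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
        total = sum(values)
        times[fields[0]] = (total - idle, total)
    return times

def busy_percent(before, after):
    """Percentage of time each core was busy between two snapshots."""
    result = {}
    for core, (busy1, total1) in after.items():
        busy0, total0 = before[core]
        elapsed = total1 - total0
        result[core] = 100.0 * (busy1 - busy0) / elapsed if elapsed else 0.0
    return result

# Invented snapshots: cpu0 fully busy, cpu1 fully idle in the interval.
BEFORE = "cpu  200 0 100 1700 0 0 0 0\ncpu0 100 0 50 850 0 0 0 0\ncpu1 100 0 50 850 0 0 0 0\n"
AFTER = "cpu  290 0 110 1800 0 0 0 0\ncpu0 190 0 60 850 0 0 0 0\ncpu1 100 0 50 950 0 0 0 0\n"

usage = busy_percent(cpu_times(BEFORE), cpu_times(AFTER))
print(usage)
```

A result like this one, with one core pegged and the other idle under load, would suggest the work is not being spread across cores.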
BIOS: some have oddball (not really, but legacy issues abound) settings that may limit amount of memory seen/used? Keep an eye out for those. Memory timings may not be properly detected and set. Check the specs for the memory and see if the BIOS has them properly set. BTW, _some_ memory and mobo combos will allow faster settings, but be careful. I haven't dinked with them for a long time, so I can't make any Q & A suggestions.
Have you upgraded to the latest BIOS on the system? Most retail mobos come with an early BIOS version that has... "issues". Check the manufacturer's web site and see if there is a later BIOS.
No, I don't like BIOS upgrades unless absolutely necessary.
OTHER: Of course, you have manually "re-seated" all connections, yes? A slightly loose cable, add-in card or memory not fully seated can do things such as you describe.
Other than RAM, the only other cables to reseat are the power cables & SATA cables :)
Visually inspect cables for "micro-fractures". Better, if you have access to meters, check for excessive resistance or opens. If not, try changing out cables. You might want to look in this area only if sar reports show slow disk activity. Also hdparm might give some information. Maybe some settings there would help too.
That's all I can think of ATM. I hope something here is of use.
-- Bill
But I did make an interesting discovery when I tried to install a fresh copy of CentOS on a new HDD. The installation itself didn't succeed. Every time I had to choose an option, on any screen, during installation, all the fans would spin up to their max speed & everything would be really slow. It's almost like trying to install CentOS on a 486 computer. Yet none of the heatsinks felt warm, not even as warm as the hard drives. So, I came to the conclusion that the motherboard is faulty. Right now, I only have a spare Gigabyte motherboard handy, which, when I used it, didn't give me any problems whatsoever. I'm using the same 1U chassis with limited air flow and small fans, and it runs as smooth as it should.
I have since swapped out the motherboard with the supplier, and the new motherboard seems to run very well. Installation took about 20 minutes to complete.
Thanx for all your help :)