[CentOS] Motherboard and chipset compatibility

On 08/12/13 18:16, Warren Young wrote:
> On 8/12/2013 12:54, m.roth at 5-cent.us wrote:
>>
>> Well, *all* of these are rackmount servers, with no moving-the-server
>> wear.
>
> Our servers are all rack-mounted, too, and pretty much never get moved
> after being installed.
>
> In any case, I was referring to wear in the electromechanical components
> of a server.  HDDs and fans, primarily.  In olden days, optical disks,
> too.  These are expected to fail over time.
>
>> We start seeing userspace compute-intensive processes crashing the
>> system a number of times a day.
>
> Define "crash the system".
>

The whole system reboots.
<snip>
> I don't suppose you've gathered continuous temp data, say with Cacti?

No, I haven't. It's a thought, though the HVAC's good (too good, he says,
when he needs a long-sleeved shirt, and sometimes a sweater). ipmitool sel
list isn't showing a problem.
>
>> They replace the m/b, and it doesn't happen again.

Oh, except for the one or two that we sent back a *second* time, and they 
replaced the m/b again....
>
> Okay, so either this one motherboard product from Supermicro has a QC
> problem, or Penguin has an application or design problem with it.  Or,
> your environment is somehow pushing them past their design limits.
> (e.g. insufficient cooling)

That's certainly not the problem.
>
> You're painting with far too broad a brush here to say Supermicro is
> bad, period.

You like them, fine. We really don't, and the only things that we were buying
that had their m/b, etc., were honkin' hot servers.

	mark