Mon Sep 8 20:46:18 UTC 2014
Valeri Galtsev <galtsev at kicp.uchicago.edu>

It begins with occasional random errors, and the board ends up totally dead
over the course of a couple of weeks to a couple of months. You pull the CPUs
and RAM from the dead board, stick them into another one (I'm tempted to say
"Tyan this time" ;-), and they work. At that point you can't flash the BIOS -
not in house. My hunch is that this is an engineering flaw: it looks like the
board topology isn't too good around one of the CPU sockets, so it works only
marginally (without much reserve) while the system board is new, then with
slight gradual degradation of the components... Maybe the ripple on the leads
is below the tolerable limit but close to it. Or the capacitances and
inductances [of the board leads] involved are such. I can't offer [much] more
detail on what I observed; it's been some time since I banged my head against
that. And you can imagine how happy I was to forget about it after I gave up
on them.
Anyway, there are still SuperMicro boards in our stalls which are still
kicking. So I'm not saying all of them are bad; I just don't care to learn on
my own hide which are and which are not.

Oh, BTW, all electrolytic capacitors on these strangely dead SuperMicro
boards are OK. All of us have seen the ones around CPUs on dead system boards
(mostly manufactured during a certain period of time) bulging and leaking -
not so in the case of these strangely dead boards. Some capacitors can lose
capacitance to some extent without showing any visible signs, but good
engineering usually takes that into account and uses suitably larger ones, so
that it doesn't even come close to the margin during the equipment's life.


