On Tue, Jul 8, 2014 at 11:25 AM, Lamar Owen <lowen at pari.edu> wrote:
> Memory tests are redundant with ECC. (I
> know; I have an older SuperMicro server here that passes memory testing
> in POST but throws nearly continuous ECC errors in operation; it does
> operate, though). If it fails during spinup, flag the failure while
> spinning up another server.

I don't think that is generally true. I've seen several IBM systems
disable memory during POST and come up running with a smaller amount.

> Virtual servers have no need of POST (they also don't save as much
> power; although dynamic load balancing can do some predictive heuristics
> and spin up host hypervisors as needed and do live migration of server
> processes dynamically).

Our services that need scaling need all of the hardware capability and
aren't virtualized. That might change someday...

> To detect failures early, spin up every server in a rotating sequence
> with a testing instance, and skip POST entirely.
>
> If you have to, spin up the server in a stateless mode and put it to
> sleep. Then wake it up with dynamic state.

Our servers tend to just run till they die. If we didn't need them we
wouldn't have bought them in the first place. I suppose there are
businesses with different processes that come and go, but I'm not sure
that is desirable.

> Long POSTs need to go away, with better fault tolerance after spinup
> being far more desirable, much like the promise of the old as dirt
> Tandem NonStop system. (I say the 'promise' rather than the
> 'implementation' for a reason.....).

If you need load balancing anyway you just run enough spares to cover
the failures.

--
Les Mikesell
lesmikesell at gmail.com