[CentOS] server specifications

Mon Feb 14 14:25:29 UTC 2011
John Hodrien <J.H.Hodrien at leeds.ac.uk>

On Mon, 14 Feb 2011, Nico Kadel-Garcia wrote:

> Trust me, it's a pain in the keister in production. If the standard is
> now enabled, good: I haven't had my hands inside a server in a year, I
> admit it. (My current role doesn't call for it.) It *didn't* used to
> be standard. Are you sure it is?

I buy whole machines not bits, and it's all preconfigured.  I can't speak for
the defaults in random motherboards.

> I'm still seeing notes that the motherboards that support it are still
> significantly more expensive, "server grade". Unfortunately, I've worked for
> a manufacturer that repackaged consumer grade components for cheap pizza box
> servers, and we had some disagreements about where they cut corners.

There's a difference between high quality motherboards and motherboards
advertised as high quality.  But yes, you'll pay a bit more for ECC than not,
but then I'll be paying more for dual PSU, and IPMI as well.  But since I
then don't need an IP-KVM or a controllable PDU it's worth the relatively small
amount it costs.

> It's very awkward to preserve BIOS settings across BIOS updates (read:
> impossible without a manual checklist) unless your environment is so
> sophisticated you're using LinuxBIOS.

Dell BIOS updates do not affect the settings, so it's quite easy.

> Unless you've *really* invested and gotten remote KVM boxes or invested in
> Dell's DRAC or HP's remote console tools, *and set them up correctly at
> install time, and kept their network setups up to date*, they're a nightmare
> to do remotely with someone putting hands and eyes on the server. And the
> remote tools are *awful* at giving you BIOS access, often because the
> changes in screen resolution for different parts of the boot process confuse
> the remote console tools, at least if you use the standard VGA-like access
> because you haven't set the console access because that *often requires
> someone to enable it from the BIOS*, which leads to a serious circular
> dependency.

Speaking for Dell here:

Generally speaking, get a machine that supports IPMI.  A remote
Serial-Over-LAN session can be initiated just nicely for editing BIOS
settings if you need human driven remote BIOS tweaking.  Same as you would if
you were stood at it.  If you have a Dell, syscfg lets you edit a large number
of the BIOS settings from within Linux, with an interface that doesn't vary
between models.  Also useful when you get a replacement motherboard / new
machine as you can script it.  That's all done through smbios as far as I
know.
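
As a rough illustration of the Serial-Over-LAN route, here is a minimal Python
sketch that only builds the ipmitool command line for a SOL session.  It
assumes ipmitool is installed and that the BMC speaks IPMI 2.0 (the lanplus
interface).  The host name, the user name and the helper name sol_command are
made up for the example, and the password is expected in the IPMI_PASSWORD
environment variable, which ipmitool reads when given -E.

```python
def sol_command(host, user, action="activate"):
    """Build an ipmitool argv for a Serial-Over-LAN action.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if action not in ("activate", "deactivate", "info"):
        raise ValueError("unsupported SOL action: %s" % action)
    return [
        "ipmitool",
        "-I", "lanplus",  # IPMI 2.0 LAN interface, needed for SOL
        "-H", host,       # address of the BMC, not of the OS
        "-U", user,
        "-E",             # take the password from $IPMI_PASSWORD
        "sol", action,
    ]


if __name__ == "__main__":
    # Placeholder BMC address and user; substitute your own.
    print(" ".join(sol_command("bmc01.example.com", "root")))
```

Passing the resulting list to subprocess.call() from a terminal should give an
interactive console, provided console redirection to the serial port is
already enabled in the BIOS.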

All the IPMI stuff is configurable either through IPMITool or OMSA.  Through
OMSA it's identical across at least the last 3 generations of servers, and
nigh on identical through IPMITool.

> Now scale by a stack of slightly different models of servers with
> different interfaces for their BIOS management, and you have a mess to
> manage. I *LOVE* environments where the admins have been able to
> insist on, or install, LinuxBIOS because this is *solved* there. You
> can get at it from Linux userland as necessary, they reboot *much*
> faster, and you can download and backup the configurations for system
> reporting. It's my friend.

Standardisation is great, so yes, I'd love something like LinuxBIOS across the
board.  But without something like this, it's still something you can cope
with.

> Dells are solid, server class machines. I've seen HP oversold with a
> lot of promises about management tools that don't work that well, for
> tasks better integrated and managed by userland tools that *have to be
> done anyway*, and sold with a lot of genuinely unnecessary features.
> (Whose bright idea was it to switch servers to laptop hard drives?
> E-e-e-e-e-w-w-w-w-w!!!)

I hope this isn't a general dig at 2.5" disks?

> ECC has a point, which I've acknowledged. But the overall "server
> class" hardware costs add up fast. SAS hard drives, 10Gig ethernet
> ports, dual power supplies, built-in remote KVM, expensive racking
> hardware, 15,000 RPM drives instead of 10,000 RPM, SAS instead of
> SATA, etc. all start adding up really fast when all you need is a
> so-called "pizza box".

But you *are* adding on lots of extras there that don't come pre-bundled with
ECC.  Hey, my *desktop* has ECC memory...

> This is one reason I've gotten fond of virtualization. (VMWare or
> VirtualBox for CentOS 5, we'll see about KVM for RHEL and CentOS 6).
> Amortizing the costs of a stack of modest servers with such server
> class features across one central, overpowered server and doling out
> environments as necessary is very efficient and avoids a lot of the
> hardware management problems.

Sure.

> It's the overall "enterprise class hardware" meme that I'm concerned
> about for a one-off CentOS grade server.

CentOS grade?

> Are you sure it was fixed by memory replacement? Because I've seen
> most of my ECC reports as one-offs, never to recur again.

Yes.  Reset the counters, retripped the warning.  Moved the DIMM, problem
followed the DIMM.  Replaced the DIMM, all well again.
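
For anyone wanting to watch the same kind of counters from Linux, a minimal
sketch follows.  It assumes the kernel's EDAC driver is loaded and exposes
corrected error counts under /sys/devices/system/edac/mc, which is not
necessarily the tool used above.  The function names are made up for the
example, and the counts are per memory controller; narrowing it down to a
single DIMM depends on the csrow or dimm entries your kernel provides.

```python
import glob
import os


def read_ce_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller name: corrected error count} from EDAC sysfs."""
    counts = {}
    for path in sorted(glob.glob(os.path.join(edac_root, "mc*", "ce_count"))):
        controller = os.path.basename(os.path.dirname(path))
        with open(path) as handle:
            counts[controller] = int(handle.read().strip())
    return counts


def newly_failing(before, after):
    """Controllers whose corrected error count rose between two readings."""
    return sorted(
        name for name, count in after.items() if count > before.get(name, 0)
    )


if __name__ == "__main__":
    # Prints an empty dict on machines without EDAC support.
    print(read_ce_counts())
```

Taking one reading before swapping DIMMs between slots and another some time
after gives a crude version of the "problem followed the DIMM" test.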

>> Equally I've had file servers do the same.  Running a file server without ECC
>> is a recipe for disaster, as you're risking silent data corruption.
>
> Core file servers, I'd agree, although a lot of the more common
> problems (such as single very expensive fileserver failure and lack of
> user available snapshots) are ameliorated by other approaches.
> (Multiple cheap SATA external hard drives for snapshot backups, NFS
> access so the users can recover personally deleted files, single
> points of failure in upstream connectivity, etc.)

Yes, there are requirements other than just sound hardware, but that
doesn't mean sound hardware isn't a good starting point.

jh