Les Mikesell wrote:
Most of our machines have 5 or so NICs, each connected to special purpose subnets. And even the ones that only need 1 or 2 connections will have the same physical setup so the servers are reusable.
Trunk all the ports and use VLANs?
Of course, but that's the point. If you've had old Cisco switches that didn't auto-negotiate well, you'll have all of the connected equipment set to force full duplex. Then when you replace the switch you have to undo that, probably one subnet at a time. How do you manage real-world things like that with a configuration tool?
Set the new switches to forced full duplex too, then go in and fix the systems with a script or by hand, or just rebuild them (very few of my systems have valuable data on their local drives; everything is stored on or transferred to centralized storage).
Easier than an ssh loop that does a 'yum update xxx' or similar command across a set of machines.
Depends on your needs. For me an ssh loop wouldn't cut it; it works OK on a tiny scale. Having a management system like puppet or cfengine also makes sure the state is kept the same. If someone goes in and changes the passwd file or overwrites ntp.conf, cfengine reverts the change within the hour.
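To make the "state is kept the same" point concrete, here is a minimal sketch in Python of the check-and-repair idea. This is not how cfengine or puppet is implemented; the function names sha256_of and converge are made up for illustration, and a real tool would also handle ownership, permissions, and reporting.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, or None if it does not exist."""
    try:
        with open(path, "rb") as fh:
            return hashlib.sha256(fh.read()).hexdigest()
    except FileNotFoundError:
        return None


def converge(path, desired):
    """Make the file at `path` contain exactly `desired` (bytes).

    Hypothetical helper: returns True if the file had drifted and was
    repaired, False if it already matched and nothing was touched.
    """
    if sha256_of(path) == hashlib.sha256(desired).hexdigest():
        return False  # already in the desired state, do nothing

    # Write to a temporary file in the same directory, then rename it into
    # place so readers never see a half-written file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as fh:
        fh.write(desired)
    os.replace(tmp_path, path)
    return True
```

Run on a schedule against every managed file, a function like this does nothing when the file is correct and puts the known-good copy back when someone has edited it by hand. An ssh loop that runs a command once has no equivalent of that second part.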
I can't quite deal with the idea of needing to abstract OS commands and doing it in a way that still only works with one OS. Why not either just automate the actual commands you need to run, or fix the commands in the first place if they are so bad that you have to abstract them into some new language? And RHEL/CentOS boxes are a small part of the operation at the moment.
It's not just commands, it's configuration as well, configuration that is different depending on the system's purpose, what data center it's located in, what time of day it is, what applications it's responsible for.
And that's supposed to be the easy way?
It is when you have as many moving parts as we do, yes. I've tried other methods. For years I was using the ssh loop route because we were so slammed we had no time to learn a proper way to manage systems. Once we learned the proper way, things were soooo much easier.
You wouldn't believe the lengthy list of commands needed to build a system from the ground up before I rewrote everything so it is automated. And that lengthy list of commands only covered a couple of particular types of systems. The rest were built by hand from memory. I had to go in, learn how everything was set up, and automate it.
Right now I have roughly 150 classes of systems; each defines a subset of the infrastructure that gets a particular type of configuration enforced on it. Most of those systems are added into the classes dynamically by their host name or other system properties (such as a script to detect whether or not a system is a VM).
The head QA guy here was able to build his new VM-based environment (roughly 70 VMs) in about 2 weeks because of this; previously he said it would have taken many, many months.
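As a rough illustration of dynamic class membership, here is a small Python sketch. The patterns, class names, and host names are all hypothetical, and this is not cfengine syntax; it only shows the idea of deriving classes from the host name plus a detected property such as "is a VM".

```python
import fnmatch

# Hypothetical rules: (host name glob pattern, class the host joins).
HOSTNAME_RULES = [
    ("web*", "webserver"),
    ("db*", "database"),
    ("*.qa.example.com", "qa"),
]


def classify(hostname, is_vm=False):
    """Return the set of classes a host belongs to.

    `is_vm` stands in for the result of a detection script; how that
    detection is done is outside the scope of this sketch.
    """
    classes = {"any"}  # every host is a member of a catch-all class
    for pattern, class_name in HOSTNAME_RULES:
        # fnmatchcase gives the same case-sensitive result on every OS
        if fnmatch.fnmatchcase(hostname, pattern):
            classes.add(class_name)
    if is_vm:
        classes.add("virtual")
    return classes
```

With rules like these, a new machine named web01.qa.example.com picks up the webserver and qa configuration as soon as it first checks in, without anyone adding it to a list by hand.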
Could you switch arbitrary boxes to Windows or some other OS without changing what the operators see? If you are still tied to the arcana of the underlying system, and vulnerable to its changes, what does this get you?
cfengine runs on many systems, Windows included. I haven't run it myself on anything other than Linux. I don't get involved with Windows stuff; it keeps my stress levels lower.
http://cfengine.com/pages/nova_supported_os
I use an older version of cfengine, not the nova stuff.
So it all depends on what your needs are. Certainly puppet and cfengine type tools are not for everyone; I wouldn't even recommend them for small deployments (say, fewer than 50 servers). If you're one person responsible for servers numbering in the hundreds, or your team is responsible for servers numbering in the thousands, then tools like them are priceless.
nate