Karanbir Singh wrote:
I personally think that cloning is for people who don't know what they are doing.
Of course - that's a feature, though, not a bug. You want the people who bolt the machines in the racks to be able to do it - and you don't want it to dictate to you what OS they can install.
Bare metal install and provision into a role with even a moderately usable app deployment strategy, capped with a monitoring system that actually knows what it needs to do - will easily be more friendly than any form of cloning.
OK, just as soon as one exists that works across platforms for people who don't know what they are doing.
I'm not sure I agree with that. I really do want to know about platform/state history even if I can't roll it back.
Well, most people replace or upgrade the platform (hardware) when things break, so rolling back is sort of academic in this sense. You are much better off storing platform inventory in something like ocs / glpi / spacewalk (or even doing things like getting sosreport to give you the right foo and checking that into a local vcs, so you can 'track' a system's evolution).
A pre-configured ocs agent is part of our base images, so I have that as soon as the network is set up, but I'd like more detail.
For example, if someone changes the duplex setting on a NIC to match a switch I'd like to have the change recorded - and a way to look at how that machine is different from both the way it was at some other time and from other similar machines.
How do you mean 'changes'?
I'd like to have the whole /etc tree handled more or less like rancid does for Cisco configs - that is, toss the whole thing into cvs/subversion, etc., but with similar machines handled as branches to compress the space and give you an easy way to see diffs. And, of course, something similar for the Windows registry.
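To make the "how is this machine different" question concrete, here is a minimal sketch in Python. It is not how rancid or any tool named in this thread works; the function names `snapshot` and `compare` are made up for illustration. It hashes every regular file under a directory and reports which paths were added, removed, or changed between two snapshots, whether those are the same machine at two times or two similar machines.

```python
import hashlib
import os


def snapshot(root):
    """Map each file's path (relative to root) to a SHA-256 of its contents."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks, sockets, and device nodes
            with open(full, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            manifest[os.path.relpath(full, root)] = digest
    return manifest


def compare(old, new):
    """Return (added, removed, changed) sorted path lists between two manifests."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, changed
```

This only says which files differ. Checking the tree itself into version control, as suggested above, is still what gives you the line-level diff and the history.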
I highly recommend there be no way for people to get onto the machine unless in an emergency.
It is always an emergency except for things like content on the web servers (which mostly gets there via rsync over ssh with underlying versioning).
Also, we almost never roll out a change across all machines in a group at the same time but instead closely schedule individual machines or small sets.
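As a rough illustration of rolling a change out to individual machines or small sets rather than a whole group, the sketch below is tool-agnostic Python. `rollout` and the `apply_change` callback are hypothetical names; the callback stands in for whatever actually pushes the change and returns True on success.

```python
def rollout(hosts, apply_change, batch_size=1):
    """Apply a change batch by batch and stop after the first failing batch.

    Returns the list of hosts that were attempted, in order.
    """
    attempted = []
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]
        # Run every host in the batch before deciding whether to continue.
        results = [apply_change(host) for host in batch]
        attempted.extend(batch)
        if not all(results):
            break  # leave the remaining hosts untouched
    return attempted
```

Stopping at the first bad batch is the point of the exercise: the machines not yet touched stay in a known state while someone looks at what went wrong.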
puppet, at least, makes this sort of thing trivial - since you can set up environments, and then nominate machines to join different environments on demand. It also makes it possible to have code-like release and deployment cycles.
You could potentially even have various tests and/or wrap some of the policy in tests with rollback capabilities (I tend to do that for my ssh and puppet configs - configs for puppet itself, that is).
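A minimal sketch of the "test with rollback" idea, in Python and not tied to puppet: write the new config, run a check, and put the old file back if the check fails. The names `apply_with_rollback` and `is_valid` are invented for this example, and it assumes the config file already exists.

```python
import os
import shutil


def apply_with_rollback(path, new_text, is_valid):
    """Write new_text to path; restore the previous file if is_valid(path) fails."""
    backup = path + ".bak"
    shutil.copy2(path, backup)  # keep the known-good version
    with open(path, "w") as fh:
        fh.write(new_text)
    if is_valid(path):
        os.remove(backup)
        return True
    shutil.move(backup, path)  # roll back to the known-good version
    return False
```

For an ssh config, `is_valid` could be a small wrapper that runs `sshd -t -f <path>` and checks the exit status; that wrapper is left out here so the sketch stays self-contained.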
Just dreaming here, but I think this stuff really belongs in something like Hudson with a drools plugin where you could automate all the way up from the source code through testing and deployment. That would start out with something that already has a cross platform scheduling/execution capability and knows more nuts and bolts about programming and platform differences than most admin-only tools.