On Wednesday 08 April 2009 17:36:18 R P Herrold wrote:
I would be thrilled to have a simultaneous coordinated release, but the 'leak' of 'patched' torrent instances, and at least two mirrors opening the full ISO set before the coordinated bit flip date and time, leave rather a bad outlook to me as to the ability to make things better through such an 'inverted as to demand' approach
My $0.02 .... I'd love to be shown a path to avoid the problems on the 5.3 roll-out
Russ, Karanbir, et al:
First, a marvelous job on getting the bits out at all; it ain't easy doing this for free (did it on a smaller scale with the PostgreSQL RPMs years ago). I greatly appreciate what the CentOS team does, and how rapidly it gets done. It certainly gets done faster than if I were doing it from the upstream EL's SRPMS.
Now, as to the technical issues, it seems to me that a fully ACID-compliant transactional repository mirror system is possibly one way to eliminate most of these issues. Such a system to my knowledge does not yet exist; but, to use an SQL example, something like:

    BEGIN;
    MIRROR repo WHERE release = '5.3' AND arch = 'x86_64';
    COMMIT;

and the COMMIT would atomically bring the repo and its mirrors into a consistent state, with already connected clients isolated from the changes as they are being made and with a durability of the result. (yes, wording is intentional). Errors would block the COMMIT until the errors are resolved on a mirror-by-mirror basis; that is, either a mirror shows the full consistent set or it doesn't show anything until it gets the full consistent set and a replica commit occurs. A critical mass of replicas committed would be required before the repo commit itself happens, so that individual mirrors are not overloaded.
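To make the commit rule a bit more concrete, here is a rough in-memory sketch in Python of the "critical mass" idea: each mirror stages the release out of sight, and nothing becomes visible anywhere until enough mirrors report ready. The names Mirror, healthy, prepare, commit, mirror_release and quorum are all made up for illustration; this is not an existing tool, and a real system would also need verification, retries and durable state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mirror:
    name: str
    healthy: bool = True             # hypothetical: can this mirror stage the full set?
    visible: Optional[str] = None    # release currently served to clients
    staged: Optional[str] = None     # release received in full but not yet served

    def prepare(self, release: str) -> bool:
        """Phase 1: stage the full release; clients keep seeing the old state."""
        if not self.healthy:
            return False
        self.staged = release
        return True

    def commit(self) -> None:
        """Phase 2: flip the staged release into view in a single step."""
        if self.staged is not None:
            self.visible, self.staged = self.staged, None


def mirror_release(mirrors: List[Mirror], release: str, quorum: int) -> bool:
    """Commit only if at least `quorum` mirrors have staged the release."""
    ready = [m for m in mirrors if m.prepare(release)]
    if len(ready) < quorum:
        # Not enough replicas: nobody flips, staged copies wait for a retry.
        return False
    for m in ready:
        m.commit()
    # Mirrors that failed to stage keep showing their old state (or nothing).
    return True
```

With three mirrors of which one cannot stage, mirror_release(mirrors, "5.3", quorum=3) leaves every mirror unchanged, while quorum=2 flips the two ready mirrors and leaves the third showing nothing new until it catches up.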
This is doable now with databases of many gigabytes (I'm in the process of beginning reception of a long-term sneakernet push replica/mirror of over 10TB of image data, and the mirror has to be atomic, and it is of course on a database system). But the current pull-based updating structure doesn't lend itself readily to this.
Incidentally, the MIRROR statement above is intended to project a PUSH arrangement instead of a PULL arrangement; the SQL above would be run on the master to push out the mirrors which could then propagate down hierarchically; reminds me of master-slave database replication with submasters. The fact that some mirrors are partials complicates things even further, though.
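As a rough illustration of the push arrangement, the Python sketch below walks a mirror tree from the master downward and hands each node one release/arch slice. Node, arches, received and push are hypothetical names, and the handling of partial mirrors (a partial that does not carry the arch is skipped along with everything below it, since it cannot feed its children what it does not have) is only one assumption about how partials might be treated.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    name: str
    arches: Optional[Set[str]] = None        # None = full mirror; a set = partial mirror
    children: List["Node"] = field(default_factory=list)
    received: List[str] = field(default_factory=list)


def push(node: Node, release: str, arch: str) -> List[str]:
    """Push one (release, arch) slice down the tree, depth first.

    Returns the names of the nodes that accepted the slice.
    """
    if node.arches is not None and arch not in node.arches:
        # Assumption: a partial mirror without this arch is skipped,
        # and so is its whole subtree.
        return []
    node.received.append(f"{release}/{arch}")
    accepted = [node.name]
    for child in node.children:
        accepted += push(child, release, arch)
    return accepted
```

The skipped subtree is exactly where partials complicate things: a full mirror sitting below a partial submaster would need a second upstream to stay complete.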
Yes, such a system would be a large technical hurdle, and perhaps it would be too complex to work in a loose volunteer arrangement. But surely other upstream projects and distributions have similar issues and needs; perhaps a transactional mirrornet 'system' would be a fine project for someone to start. If one doesn't already exist, that is.
A revision control system can be pressed/abused into this sort of service; monotone, for instance, is/was used by the OpenZaurus people to consistently push out packages, and git is being used by vyatta for similar things, mostly on the developer's side of things (see http://www.vyatta.org/downloads/glendalebuild ). Git has the distributed aspects going for it, but it's not optimal.
Just some ideas.