On Thursday 08 September 2005 15:12, Les Mikesell wrote:
On Thu, 2005-09-08 at 13:07, Johnny Hughes wrote:
So, seriously, the best thing would be for you to create a directory that contains all your RPMS ... you put only the ones that you have approved in there. (You do not need to build anything from SRPMS). You make that accessible from the web and run createrepo on it.
OK, but I basically want to include all official updates here; I just want to delay/control rolling them out to make sure there are no surprises. That means I either need to copy that whole repository (of a size you said was such a problem mirroring that you had to break it at the point releases) and repeat the copy for every state where I might want repeatable updates, or I have to track every change. I do realize that both of these options are possible; I just don't see why anyone considers them desirable. Compare it to how you get a set of consistent updates from a CVS repository where someone has tagged the 'known good' states as the changes were added.
One of the key reasons that CVS works so well for source is that, once the initial import is done, everything is done via diffs and patches. This makes the repository smaller, and the things CVS does well (multiple versions, consistent repository states) follow automatically. While a CVS commit is in progress, for instance, other users still see the previous state; this is not true for a YUM repository. When mirroring the repo takes so long, the repositories on the mirrors could, during heavy updates, be in an inconsistent state more often than not. As the proportion of time spent in an inconsistent state rises, the usefulness of the repository falls off sharply.
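To make the "other users still see the previous state" property concrete, here is a minimal Python sketch of one way a repository directory could be published atomically on a POSIX filesystem. It is only an illustration: the names publish, read_current, and the 'current' symlink are hypothetical, and this is not how yum or any existing mirror script works.

```python
import os


def publish(repo_root, snapshot_name, files):
    """Write a complete snapshot, then repoint 'current' at it in one step.

    `files` maps file names to bytes. All names here are hypothetical.
    """
    snap_dir = os.path.join(repo_root, snapshot_name)
    os.makedirs(snap_dir)
    for name, data in files.items():
        with open(os.path.join(snap_dir, name), "wb") as fh:
            fh.write(data)

    # Build the new symlink under a temporary name first.
    tmp_link = os.path.join(repo_root, "current.tmp")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(snapshot_name, tmp_link)

    # rename(2) replaces the old link atomically on POSIX, so a reader
    # resolves either the old snapshot or the new one, never a mix.
    os.replace(tmp_link, os.path.join(repo_root, "current"))


def read_current(repo_root):
    """Return {name: bytes} for whatever snapshot 'current' points at."""
    cur = os.path.join(repo_root, "current")
    result = {}
    for name in sorted(os.listdir(cur)):
        with open(os.path.join(cur, name), "rb") as fh:
            result[name] = fh.read()
    return result
```

Older snapshot directories stay on disk, so a client could in principle be pointed at a known earlier state, much like checking out a tag.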
This seems a good application for differential or 'patch' RPMs and a CVS-like RPM repository. The mirroring requirements alone make, in my mind at least, a good case for patch RPMs: they take up far less space, take far less time to mirror, and can be mirrored in a transaction so that the repository, to the yum end user, is ALWAYS consistent. A CVS-like RPM repository would then store the initial import and all patches from that import, and build the desired RPM on the fly. Really, this problem has been dealt with before in the various revision control systems. To mirror the whole repository, something like CVSup that gets a consistent copy could be built. A portion of this infrastructure is available now as yam, but the underlying repository is far from ACID compliant (and now we're talking databases).
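The "initial import plus patches, built on the fly" idea can be sketched in a few lines of Python. This is a toy under stated assumptions: DeltaStore, make_delta, and apply_delta are hypothetical names, the delta format is invented for illustration (it is not the format of any real patch-RPM tool), and difflib is far too slow for real package payloads.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus literal new data."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse bytes from old
        elif tag in ("replace", "insert"):
            ops.append(("data", new[j1:j2]))    # bytes only in new
        # "delete" emits nothing: those old bytes are simply skipped
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    """Rebuild the newer version from the older one and a delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


class DeltaStore:
    """Keeps the initial import and a chain of deltas (hypothetical)."""

    def __init__(self, initial: bytes):
        self.base = initial
        self.deltas = []
        self._head = initial  # latest version, kept only to diff against

    def commit(self, new: bytes) -> int:
        """Store `new` as a delta against the latest version."""
        self.deltas.append(make_delta(self._head, new))
        self._head = new
        return len(self.deltas)  # version number of the new state

    def checkout(self, version: int) -> bytes:
        """Rebuild any version by replaying deltas from the base."""
        data = self.base
        for ops in self.deltas[:version]:
            data = apply_delta(data, ops)
        return data
```

Version 0 is the initial import, and every checkout replays the delta chain from the base, so each rebuild spends CPU in exchange for the smaller store.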
The mirroring size difference alone might make it worthwhile. It is likely to require more CPU, though; it is one of those trade-offs: CPU versus disk and bandwidth.