Lamar Owen <lowen@pari.edu> wrote:
This seems a good application for differential or 'patch' RPM's and a CVS-like RPM repository. The mirroring requirements alone make, in my mind at least, a good case for patch RPM's that take up far less space (and take far less time to mirror, and can be mirrored in a transaction so that the repository, to the yum enduser, is ALWAYS consistent), and then a CVS-like RPM repository that stores the initial import and all patches from that import, and builds the desired RPM on the fly.
But now you're talking GBs of binary data, and "real-time" resolution by 1,000s of clients!
Yes, it's technically possible to leverage XDelta and other support to do this. But you're going to burden your server with a massive amount of overhead that you don't have when you simply share stuff out via HTTP.
Think about it. ;->
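To make the trade-off concrete, here is a minimal sketch of the delta idea. It is not XDelta: as a stand-in it uses zlib with the old payload as a preset dictionary, which only works for payloads under deflate's 32 KB window. The names make_delta and apply_delta are made up for illustration. Note that the server (or client) has to run apply_delta for every rebuild, which is exactly the per-request CPU cost that plain HTTP file serving avoids.

```python
import random
import zlib


def make_delta(old: bytes, new: bytes) -> bytes:
    """Encode `new` relative to `old` (zlib preset dictionary, NOT XDelta)."""
    # Assumption: `old` fits in deflate's 32 KB window; larger inputs
    # would need a real binary-diff tool.
    comp = zlib.compressobj(level=9, zdict=old)
    return comp.compress(new) + comp.flush()


def apply_delta(old: bytes, delta: bytes) -> bytes:
    """Rebuild the new payload from `old` plus the delta."""
    decomp = zlib.decompressobj(zdict=old)
    return decomp.decompress(delta) + decomp.flush()


if __name__ == "__main__":
    # Hypothetical 16 KB "package": random bytes, then a 4-byte change.
    rng = random.Random(0)
    old = bytes(rng.getrandbits(8) for _ in range(16 * 1024))
    patched = bytearray(old)
    patched[5000:5004] = b"EDIT"
    new = bytes(patched)

    delta = make_delta(old, new)
    assert apply_delta(old, delta) == new
    print("full size:", len(new))
    print("compressed without reference:", len(zlib.compress(new, 9)))
    print("delta against old version:", len(delta))
```

The delta is far smaller than the full payload, which is the mirroring win, but every reconstruction costs a decompress pass that a static file never needs.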
Really, this problem has been dealt with before in the various revision control systems. To mirror the whole repository, something like CVSup that gets a consistent copy could be built. A portion of this infrastructure is available now as yam, but the underlying repository is far from ACID compliant (and now we're talking databases).
The mirroring size difference alone might make it worthwhile. But, it is likely to require more CPU; one of those trade-offs: CPU versus disk and bandwidth.
CPU, memory, disk, etc. will be _exponentially_ increased.
As I said, check in GBs of different binary revisions to CVS and share the same out in multiple trees via HTTP. Now monitor the difference in load when you have 1, 2, 4, 8 ... 1024 clients connect!
Your CVS server is _crawling_ with just a half-dozen or so clients. Your Apache server is handling over 100 without much issue, depending on your I/O.
It is _not_ feasible, period.
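If anyone wants to try the comparison rather than take my word for it, a rough harness along these lines would do. It is only a sketch: run_clients, sweep and http_fetch are names I made up, the URLs in the comment are placeholders, and it measures wall-clock time on the client side, not server load (watch top or vmstat on the server for that).

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def run_clients(fetch, n_clients):
    """Call fetch() from n_clients threads at once.

    Returns (elapsed_seconds, results in submission order).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        futures = [pool.submit(fetch) for _ in range(n_clients)]
        results = [f.result() for f in futures]
    return time.perf_counter() - start, results


def sweep(fetch, counts=(1, 2, 4, 8)):
    """Map each client count to the elapsed time for that many clients."""
    return {n: run_clients(fetch, n)[0] for n in counts}


def http_fetch(url):
    """Return a zero-argument fetcher that downloads `url` and returns its size."""
    def fetch():
        with urllib.request.urlopen(url) as resp:
            return len(resp.read())
    return fetch


# Example (placeholder URLs, substitute your own servers):
#   static = sweep(http_fetch("http://mirror.example/pkg.rpm"))
#   built = sweep(http_fetch("http://builder.example/pkg.rpm"))
#   for n in static:
#       print(n, static[n], built[n])
```

Point one fetcher at a plain HTTP tree and the other at whatever builds packages on the fly, double the client count each round, and compare how the two curves grow.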