PREFACE: I hope people have noted I've been staying out of this thread for a while. I'll really try to limit my comments to one here and leave it at that.
Les Mikesell lesmikesell@gmail.com wrote:
I want that functionality, but I was arguing that all it would take to get it is sequentially increasing timestamps on files being added to the repository and knowledge of the final timestamp of each consistent update set - and letting the yum client have that information to figure out the rest.
But the meta-data and its dependency tree _change_ at each point in time. The dependency tree produced by one createrepo run changes on the next run. That's the problem.
The only way to fix it currently is to have the YUM client access RPMs directly, instead of relying on the YUM repository's meta-data. Otherwise, there have to be some major changes at the repository level. I offered my suggestion, a simple "hack" in the meantime.
One thing that no one mentioned about CVS is that it always stores the full ready-to-go copy of the latest version and builds the diffs backwards to earlier versions on the assumption that you are most likely to want the most recent version.
Reverse deltas. Instead of taking the original revision and rippling deltas forward, you take the latest, and do ripple deltas backward.
Xdelta does this for binaries as well.
In a yum-ish adaptation of this you would want the diffs between each version to be available for the likely possibility that the client has the previous version and wants to go to the latest.
Again, that's what deltas are! The difference between each revision. Forward deltas start with the original. Reverse deltas start with the latest.
Forward deltas are like doing a full backup, and then doing incrementals upon incrementals. Each successive incremental requires each other to work. That's a PITA.
Reverse deltas don't solve the "ripple differences" problem, but they do minimize it. They typically cut the number of deltas required if people are pulling the last few revisions. That is typically the case in software.
If you're at revision 1.4 and you want version 1.7, a forward-delta version control service must build all the way from 1.1 to 1.7 -- and ripple through 6 differences. With reverse deltas, it would only need to ripple 3 times -- from 1.7 back to 1.4.
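To make the ripple counting concrete, here is a minimal sketch of a reverse-delta store in Python. It is an illustration only, not how CVS, RCS or Xdelta actually encode their deltas: the names make_delta, apply_delta and ReverseDeltaStore are hypothetical, the "files" are just lists of lines, and the delta format (copy ranges plus literal data, built with difflib) is an assumption chosen for brevity.

```python
from difflib import SequenceMatcher


def make_delta(src, dst):
    """Encode dst as copy-ranges taken from src plus literal new lines."""
    ops = []
    matcher = SequenceMatcher(a=src, b=dst, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))       # reuse src[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("data", dst[j1:j2]))   # lines only dst has
        # "delete": src lines that dst lacks, nothing to emit
    return ops


def apply_delta(src, ops):
    """Rebuild the target revision from src and a delta from make_delta."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(src[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class ReverseDeltaStore:
    """Keeps the latest revision in full and one backward delta per older one."""

    def __init__(self, first):
        self.latest = list(first)
        self.back = []  # back[-1] turns latest into the revision before it

    def commit(self, new):
        new = list(new)
        self.back.append(make_delta(new, self.latest))
        self.latest = new

    def checkout(self, steps_back):
        """Ripple backward from the latest; steps_back deltas are applied."""
        rev = self.latest
        for k in range(1, steps_back + 1):
            rev = apply_delta(rev, self.back[-k])
        return rev


# Seven made-up revisions standing in for 1.1 .. 1.7
revisions = [[f"line {i}" for i in range(n)] for n in range(1, 8)]
store = ReverseDeltaStore(revisions[0])
for rev in revisions[1:]:
    store.commit(rev)

rev_1_4 = store.checkout(3)  # 1.7 -> 1.6 -> 1.5 -> 1.4: three ripples
```

Checking out the latest costs zero ripples, and each step further back costs one more delta application. A forward-delta store would be the mirror image: the first revision in full, and six applications to reach 1.7.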
*UNLESS* you aren't talking about deltas ... but *PATCHES*
Patches are _not_ Deltas. Patches are like doing a full backup and an incremental since the last full backup. So if you need to restore, you only need the latest incremental and last full. There is no "ripple." So you only need *1* file for an update.
So what's the catch? Space!
Instead of a set of deltas (be they forward or reverse) in a single file, well minimized, you now maintain _separate_ patch files. In the case above, 1.1 to 1.7, you'll need to maintain _all_ permutations (every older-to-newer pair). That's 6 + 5 + 4 + 3 + 2 + 1 = 21 patches!
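The arithmetic is easy to check. A quick sketch in Python, assuming one standalone patch per older-to-newer pair of versions (patch_pairs is a made-up helper name, not part of any tool):

```python
from itertools import combinations


def patch_pairs(versions):
    """Every (older, newer) pair that would need its own standalone patch file."""
    return list(combinations(versions, 2))


versions = [f"1.{n}" for n in range(1, 8)]  # 1.1 .. 1.7
pairs = patch_pairs(versions)

print(len(pairs))         # 21 standalone patches
print(len(versions) - 1)  # 6 chained deltas for the same history
```

For n versions that is n*(n-1)/2 patches against n-1 deltas, so the patch count grows quadratically while the delta chain grows linearly.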
So while you drastically reduce the ripple load on the server, you increase the storage. Catch-22.
Updating via binary diffs might be a good idea too, but it would need to be very different from CVS because the goal would be to minimize the traffic and make the client side do all the work.
You can_not_ do deltas without the _original_ delta files. So you would have to transfer the _entire_ delta file to the client, which is _larger_ than just the RPM. ;->
That's the impossibility I'm talking about! ;->
The only way is by maintaining patches on the server. That removes the overhead of run-time generation of differences via a "ripple delta" because the patches are only generated once. But that then _bloats_ the server storage.
Again, I don't think you understand how deltas work. ;->