On Tue, 2005-09-13 at 18:03, Bryan J. Smith wrote:
I want that functionality, but I was arguing that all it would take to get it is sequentially increasing timestamps on files being added to the repository and knowledge of the final timestamp of each consistent update set - and letting the yum client have that information to figure out the rest.
But the meta-data and its dependency tree _change_ for each point in time. What was the dependency tree after one createrepo run changes on the next run. That's the problem.
Yes, there is no argument that yum would have to change. However it could be a small change.
The only way to fix it currently is to have the YUM client access RPMs directly, instead of relying on the YUM repository's meta-data.
Yum does its dependency computations on the client side based on the contents of the .hdr files (otherwise it wouldn't work when combining the contents of different repositories). It needs the .hdr files, not the RPMS. There is some magic in the repo metadata that makes the client only download the latest .hdr files but if you update often you end up with them all anyway and use only the latest. The needed change is that if you specify a point-in-time the client should toss/ignore .hdr files past that and get downrevs if available. Note that you could do this yourself with nothing but an ftp view of the repository and you'll see the client could do it directly, although I agree that repository support could make it easier.
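To make the point-in-time idea concrete, here is a rough sketch in Python of the selection the client would have to do. This is not yum code: the Header record, its "added" timestamp field, and headers_as_of() are all made up for illustration, and it assumes the repository exposes a monotonically increasing timestamp for each .hdr file.

```python
from typing import Dict, List, NamedTuple


class Header(NamedTuple):
    # Hypothetical stand-in for one .hdr file; not a real yum structure.
    name: str     # package name
    version: str  # version-release string, treated as opaque here
    added: int    # when the header entered the repository (epoch seconds)


def headers_as_of(headers: List[Header], cutoff: int) -> Dict[str, Header]:
    """Return, per package, the newest header added at or before cutoff."""
    chosen: Dict[str, Header] = {}
    for hdr in headers:
        if hdr.added > cutoff:
            continue  # toss/ignore anything past the requested point in time
        best = chosen.get(hdr.name)
        if best is None or hdr.added > best.added:
            chosen[hdr.name] = hdr
    return chosen


repo = [
    Header("foo", "1.0-1", 100),
    Header("foo", "1.0-2", 200),
    Header("foo", "1.1-1", 300),
    Header("bar", "2.0-1", 150),
    Header("bar", "2.1-1", 350),
]
view = headers_as_of(repo, cutoff=250)
print({name: hdr.version for name, hdr in view.items()})
```

With a cutoff of 250, foo resolves to the downrev 1.0-2 and bar to 2.0-1. The dependency computation would then run on that reduced set exactly as it does today.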
Otherwise, there have to be some major changes at the repository-level.
I'd call it a minor change to expose an option to get backrev .hdr files when wanted.
One thing that no one mentioned about CVS is that it always stores the full ready-to-go copy of the latest version and builds the diffs backwards to earlier versions on the assumption that you are most likely to want the most recent version.
Reverse deltas. Instead of taking the original revision and rippling deltas forward, you take the latest, and do ripple deltas backward.
Yes, but what you really want to do is give the client the least it needs to make what it has into what it wants. You are always going to be going forward, and clients that update regularly will always need only the diff between the current and last prior RPM.
Reverse deltas don't solve the "ripple differences" problem, but they do minimize it. They typically cut the number of deltas required if people are pulling the last few revisions. That is typically the case in software.
If you're at revision 1.4 and you want version 1.7, the version control service of a forward delta must build all the way from 1.1 to 1.7 -- and ripple through 6 differences. In the reverse delta, it would only need to ripple 3 times -- from 1.7 back to 1.4.
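The arithmetic in that example can be written down as a small Python sketch. The function names are invented here, and revisions are reduced to their minor number (1.4 becomes 4), assuming one delta is stored per adjacent pair of revisions.

```python
def forward_delta_steps(target: int, base: int = 1) -> int:
    """Deltas rippled forward from the stored base revision to reach target."""
    return target - base


def reverse_delta_steps(target: int, latest: int) -> int:
    """Deltas rippled backward from the stored latest revision to reach target."""
    return latest - target


# Forward storage keeps 1.1 in full, so building 1.7 ripples through 6 deltas.
print(forward_delta_steps(target=7))            # 6
# Reverse storage keeps 1.7 in full, so reaching 1.4 ripples through 3 deltas.
print(reverse_delta_steps(target=4, latest=7))  # 3
```

The reverse count shrinks as the requested revision gets closer to the latest one, which is why it favors people pulling recent revisions.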
*UNLESS* you aren't talking about deltas ... but *PATCHES*
If you work only 2 revs at a time there is no difference.
Patches are _not_ Deltas. Patches are like doing a full backup and an incremental since the last full backup. So if you need to restore, you only need the latest incremental and last full. There is no "ripple." So you only need *1* file for an update.
Yes, one file for the difference between any two revs which is almost always what you want - or you should be updating more often. If you need to repeat the process with multiple steps, the client can easily calculate whether it is better to collect multiple deltas and apply them or just grab the complete version it wants.
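As a sketch of that client-side calculation in Python: plan_update() and the shape of delta_sizes are assumptions made up for this example, not anything yum provides. It assumes the client can learn the size of each available delta and of the full package before downloading.

```python
from typing import Dict, List, Tuple

Step = Tuple[int, int]  # (from_rev, to_rev) for one delta file


def plan_update(have: int, want: int,
                delta_sizes: Dict[Step, int],
                full_size: int) -> Tuple[str, List[Step]]:
    """Pick the cheaper of a chain of deltas or the complete package."""
    total = 0
    steps: List[Step] = []
    for rev in range(have, want):
        size = delta_sizes.get((rev, rev + 1))
        if size is None:
            return ("full", [])  # a delta is missing: use the existing procedure
        total += size
        steps.append((rev, rev + 1))
    if total < full_size:
        return ("deltas", steps)
    return ("full", [])


# The server keeps only the two most recent deltas (sizes in KB).
available = {(5, 6): 40, (6, 7): 60}
print(plan_update(6, 7, available, full_size=900))  # ('deltas', [(6, 7)])
print(plan_update(5, 7, available, full_size=900))  # ('deltas', [(5, 6), (6, 7)])
print(plan_update(4, 7, available, full_size=900))  # ('full', [])
```

A client that updates regularly hits the first case and pulls one small file. Anything the server no longer keeps falls back to the full download.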
So what's the catch? Space!
So be sensible about what you keep around and make the client fall back to existing procedure if the delta it might use isn't there.
The only way is by maintaining patches on the server. That removes the overhead of run-time generation of differences via a "ripple delta" because the patches are only generated once. But that then _bloats_ the server storage.
Keep only 1 or 2 delta/patch files for the latest revs where the traffic will actually be happening and thus reduced. In the unlikely event you want something else, use the existing procedure.
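The retention policy is equally simple to sketch in Python. deltas_to_keep() is a hypothetical helper, and it assumes the server has an ordered list of the revisions of one package.

```python
from typing import List, Tuple


def deltas_to_keep(revisions: List[str], keep: int = 2) -> List[Tuple[str, str]]:
    """Return the (old, new) pairs whose delta files stay on the server."""
    pairs = list(zip(revisions, revisions[1:]))  # one delta per adjacent pair
    return pairs[-keep:] if keep > 0 else []


print(deltas_to_keep(["1.4", "1.5", "1.6", "1.7"], keep=2))
# [('1.5', '1.6'), ('1.6', '1.7')]
```

Everything older is dropped, so the extra storage is bounded by the "keep" setting no matter how many revisions pile up.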
Again, I don't think you understand how deltas work. ;->
I didn't realize that you wouldn't call them deltas unless you cram more than one in the same file. Do you call the first one a patch, then change the name when you append the next run? The piece everyone will want is current-1->current, so the most benefit would come from keeping that in its own file.