[CentOS] Why is yum not liked by some?

Tue Sep 13 23:03:55 UTC 2005
Bryan J. Smith <b.j.smith at ieee.org>

PREFACE:  I hope people have noted I've been staying out of
this thread for a while.  I'll really try to limit myself to
one comment here and leave it at that.

Les Mikesell <lesmikesell at gmail.com> wrote:
> I want that functionality, but I was arguing that all it
> would take to get it is sequentially increasing timestamps
> on files being added to the repository and knowledge of the
> final timestamp of each consistent update set - and letting
> the yum client have that information to figure out the
> rest.

But the meta-data and its dependency tree _change_ at each
point in time.  What was the dependency tree after one
createrepo run changes on the next run.  That's the problem.

The only way to fix it currently is to have the YUM client
access RPMs directly, instead of relying on the YUM
repository's meta-data.  Otherwise, there have to be some
major changes at the repository level.  I offered my
suggestion, a simple "hack" in the meantime.

> One thing that no one mentioned about CVS is that it always
> stores the full ready-to-go copy of the latest version and
> builds the diffs backwards to earlier versions on the
> assumption that you are most likely to want the most recent
> version.

Reverse deltas.  Instead of taking the original revision and
rippling deltas forward, you take the latest and ripple
deltas backward.

Xdelta does this for binaries as well.

> In a yum-ish adaptation of this you would want the diffs
> between each version to be available for the likely
> possibility that the client has the previous version
> and wants to go to the latest.  

Again, that's what deltas are!
The difference between each revision.
Forward deltas start with the original.
Reverse deltas start with the latest.

Forward deltas are like doing a full backup, and then doing
incrementals upon incrementals.  Each successive incremental
requires every one before it to work.  That's a PITA.

Reverse deltas don't solve the "ripple differences" problem,
but they do minimize it.  They typically cut the number of
deltas required if people are pulling the last few
revisions.  That is typically the case in software.

If you're at revision 1.4 and you want revision 1.7, a
version control service using forward deltas must build all
the way from 1.1 to 1.7 -- and ripple through 6 differences.
With reverse deltas, it would only need to ripple 3 times --
from 1.7 back to 1.4.
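
To make the ripple counts concrete, here is a minimal Python
sketch.  It is not how CVS, RCS or Xdelta store anything; the
copy/insert delta format, the helper names (make_delta,
apply_delta, rebuild) and the toy revision contents are all
assumptions made up for illustration.  It only shows that a
forward store rebuilds 1.7 through 6 deltas, while a reverse
store gets back to 1.4 through 3.

```python
from difflib import SequenceMatcher

def make_delta(src, dst):
    """Return copy/insert operations that turn line list src into dst."""
    ops = []
    matcher = SequenceMatcher(a=src, b=dst, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse lines already in src
        elif j2 > j1:
            ops.append(("insert", dst[j1:j2]))  # lines that only dst has
    return ops

def apply_delta(src, delta):
    """Rebuild the destination lines from src plus one delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(src[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out

def rebuild(full, deltas):
    """Apply deltas in order; return the result and the ripple count."""
    text = full
    for delta in deltas:
        text = apply_delta(text, delta)
    return text, len(deltas)

# Toy history (hypothetical): each revision appends one line.
names = ["1.1", "1.2", "1.3", "1.4", "1.5", "1.6", "1.7"]
revs = [[f"line added in {n}" for n in names[:i + 1]] for i in range(7)]

forward = [make_delta(revs[i], revs[i + 1]) for i in range(6)]  # i -> i+1
reverse = [make_delta(revs[i + 1], revs[i]) for i in range(6)]  # i+1 -> i

# Forward store keeps only 1.1 in full: 1.7 needs every delta.
text_17, forward_ripples = rebuild(revs[0], forward)

# Reverse store keeps only 1.7 in full: 1.4 needs 1.7->1.6->1.5->1.4.
want = names.index("1.4")
text_14, reverse_ripples = rebuild(revs[-1], reverse[want:][::-1])

print(forward_ripples, reverse_ripples)
```

In the reverse store, asking for the latest revision costs no
ripples at all, which is the common case.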

*UNLESS* you aren't talking about deltas ... but *PATCHES*

Patches are _not_ Deltas.  Patches are like doing a full
backup and an incremental since the last full backup.  So if
you need to restore, you only need the latest incremental and
last full.  There is no "ripple."  So you only need *1* file
for an update.

So what's the catch?  Space!

Instead of a set of deltas (be they forward or reverse) in a
single file, well minimized, you now maintain _separate_
patch files.  In the case above, 1.1 to 1.7, you'll need to
maintain a patch for _every_ pair of revisions.  That's
6 + 5 + 4 + 3 + 2 + 1 = 21 patches!
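
The count is easy to check.  A small Python sketch, assuming
one patch file per (older, newer) pair of revisions; the
patch_pairs helper is hypothetical and only enumerates the
pairs, it does not build any patches:

```python
from itertools import combinations

def patch_pairs(revisions):
    """List every (older, newer) pair that would need its own patch."""
    return list(combinations(revisions, 2))

revisions = [f"1.{n}" for n in range(1, 8)]  # 1.1 through 1.7
pairs = patch_pairs(revisions)

print(len(pairs))
```

In general that is n * (n - 1) / 2 patches for n revisions, so
the storage grows quadratically while each update still needs
only one file.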

So while you drastically reduce the ripple load on the
server, you increase the storage.  Catch-22.

> Updating via binary diffs might be a good idea too, but it
> would need to be very different from CVS because the goal
> would be to minimize the traffic and make the client side
> do all the work.

You can_not_ do deltas without the _original_ delta files.
So you would have to transfer the _entire_ delta file to the
client, which is _larger_ than just the RPM.  ;->

That's the impossibility I'm talking about!  ;->

The only way is by maintaining patches on the server.
That removes the overhead of run-time generation of
differences via a "ripple delta" because the patches are only
generated once.  But that then _bloats_ the server storage.

Again, I don't think you understand how deltas work.  ;->



-- 
Bryan J. Smith                | Sent from Yahoo Mail
mailto:b.j.smith at ieee.org     |  (please excuse any
http://thebs413.blogspot.com/ |   missing headers)