[CentOS] Why is yum not liked by some?

Sat Sep 10 00:48:59 UTC 2005
Lamar Owen <lowen at pari.edu>

On Friday 09 September 2005 20:19, Mike McCarty wrote:
> Lamar Owen wrote:
> > solve this problem.  Are you going to tell the community that this is an
> > unsolvable problem?

> No, I am not. Already, several people, myself included, have given
> a way for you to accomplish what you seem to want.

Well, first, Les and I are not the same person, and what Les wants and what 
I'd like to see are two different but related things.  I believe that 
incremental updates (rpm-deltas) are desirable from a bandwidth and storage 
point of view, and highly desirable from a user point of view.  They do 
present issues for repository operators and packagers; this is true.

But then Johnny mentions that the mirroring load is 50GB for the tree.  This 
is a lot of data to move around, really. 

Now, bandwidth doesn't scare me; we have one research project here that will 
be collecting 12TB of data per day (if it captures a full day at a time; 
currently not possible, but desirable). (The project involves a phased array, 
with the raw data being stored and rephased after collection; this is like 
being able to repoint a dish to an observation in the past for conventional 
radio telescopes).  This would require 2/5ths of an OC-48 to mirror; doable, 
yes, but not desirable or affordable.  Drive space doesn't scare me (except 
cost; I got a quote on a petabyte-class storage array: it was 1.4PB and cost 
upwards of $3 million).  CPU horsepower doesn't scare me, either, as I'm 
getting a MAPstation as part of a different research project (now this box 
has an interesting interconnect called SNAP that scales to 64 MAP DEL 
processors and 32 host P4s on a crossbar type switch; you can do the 
research on Google too).  The MAPstation runs on Linux, FWIW.  For the 
application (cross-correlation of interferometry data, 2 frequencies, 2 
polarizations, and 2 antennas) a MAP processor will have the equivalent power 
of an 800GHz P4 while being clocked at only 200MHz, thanks to the massively 
parallel pipelining available with this kind of direct-execution-logic 
(non-Von Neumann) processor.  But all of that is irrelevant.
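
For anyone who wants to check the OC-48 figure, here is a back-of-the-envelope 
sketch.  It assumes decimal terabytes, a steady transfer spread over the full 
24 hours, and the nominal OC-48 line rate of 2488.32 Mbit/s with no protocol 
overhead; the function names are only for illustration.

```python
# Back-of-the-envelope check of the mirroring bandwidth.
# Assumptions: decimal terabytes, transfer spread evenly over 24 hours,
# nominal OC-48 line rate, no framing or protocol overhead.
TB = 10**12                      # bytes per terabyte (decimal)
SECONDS_PER_DAY = 24 * 60 * 60
OC48_BPS = 2_488_320_000         # nominal OC-48 line rate, bits per second


def sustained_bps(bytes_per_day: int) -> float:
    """Average bit rate needed to move bytes_per_day within one day."""
    return bytes_per_day * 8 / SECONDS_PER_DAY


def fraction_of_oc48(bytes_per_day: int) -> float:
    """Share of a nominal OC-48 consumed by that sustained rate."""
    return sustained_bps(bytes_per_day) / OC48_BPS


if __name__ == "__main__":
    daily = 12 * TB
    print(f"{sustained_bps(daily) / 1e9:.2f} Gbit/s, "
          f"{fraction_of_oc48(daily):.2f} of an OC-48")
```

Under those assumptions the rate comes out a little over 1.1 Gbit/s, which 
is somewhere between 2/5 and 1/2 of an OC-48 before any overhead.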

What is relevant is that I have seen the end user's response to having to 
download multiple megabytes for a change of a hundred bytes or less.  While 
it doesn't bother me, it did bother my users (speaking of the PostgreSQL 
users I built and released packages for).

So the end user potentially could reap the greatest benefit from an rpmdelta 
system.  SuSE is or has been doing rpmdeltas for a year now, and I seem to 
recall that the results were pretty good.
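
To make the size argument concrete, here is a minimal sketch of the idea.  It 
is a naive fixed-block delta, not the format SuSE or any real rpmdelta tool 
uses; the 4096-byte block size, the function names, and the 2MB test payload 
are all assumptions made up for illustration.

```python
# Naive fixed-block delta: ship only the blocks that differ.
# Illustrative only; real rpm delta tools use smarter binary diffing.
BLOCK = 4096  # hypothetical block size in bytes


def make_delta(old: bytes, new: bytes, block: int = BLOCK):
    """Return (new_length, [(offset, data), ...]) for blocks that changed."""
    changed = []
    for off in range(0, len(new), block):
        chunk = new[off:off + block]
        if old[off:off + block] != chunk:
            changed.append((off, chunk))
    return len(new), changed


def apply_delta(old: bytes, delta) -> bytes:
    """Rebuild the new payload from the old payload plus the delta."""
    new_len, changed = delta
    # Start from the old bytes, truncated or zero-padded to the new length.
    buf = bytearray(old[:new_len].ljust(new_len, b"\0"))
    for off, chunk in changed:
        buf[off:off + len(chunk)] = chunk
    return bytes(buf)


def delta_payload_size(delta) -> int:
    """Number of data bytes the client would actually have to download."""
    return sum(len(chunk) for _, chunk in delta[1])


if __name__ == "__main__":
    old = bytes(range(256)) * 8192            # 2MB stand-in for a package
    patched = bytearray(old)
    patched[1_000_000:1_000_100] = b"x" * 100  # a hundred-byte change
    new = bytes(patched)

    delta = make_delta(old, new)
    assert apply_delta(old, delta) == new
    print(len(new), "bytes full download vs",
          delta_payload_size(delta), "bytes of delta")
```

A fixed-block scheme like this only copes with in-place changes; an insertion 
shifts every later block, which is one reason a real tool needs something 
better than this.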

Les wanted CVS-like functionality, where you can tag a repository as 
consistent at a certain branch (not necessarily by date, as you mentioned), 
and be able to consistently grab a set of packages. 

I mentioned CVS worked on a diff principle, and that that might be an 
interesting way of doing it (all the while thinking about my PostgreSQL 
users). Maybe I confused the two issues; possible.

The dumb-client, glorified-webserver type of system will be very difficult to 
make work this way; this is true.  But who says we have to stick to a 
glorified wget?  The key question is, cost-benefit analysis-wise, is it worth 
the effort (both development and execution)?  Maybe it is, maybe it isn't.  
But I do believe it is worth a try, if only to help the end user (which could 
be a small server at a parish library, for instance.... :-)). 
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu