[CentOS] Why is yum not liked by some?
lowen at pari.edu
Sat Sep 10 00:48:59 UTC 2005
On Friday 09 September 2005 20:19, Mike McCarty wrote:
> Lamar Owen wrote:
> > solve this problem. Are you going to tell the community that this is an
> > unsolvable problem?
> No, I am not. Already, several people, myself included, have given
> a way for you to accomplish what you seem to want.
Well, first, Les and I are not the same person, and what Les wants and what
I'd like to see are two different but related things. I believe that
incremental updates (rpm-deltas) are desirable from a bandwidth and storage
point of view, and highly desirable from a user point of view. They do
present issues for repository operators and packagers, this is true.
But then Johnny mentions that the mirroring load is 50GB for the tree. This
is a lot of data to move around, really.
Now, bandwidth doesn't scare me; we have one research project here that will
be collecting 12TB of data per day (if it captures a full day at a time;
currently not possible, but desirable). (The project involves a phased array,
with the raw data being stored and rephased after collection; this is like
being able to repoint a dish to an observation in the past for conventional
radio telescopes). This would require 2/5ths of an OC-48 to mirror; doable,
yes, but not desirable or affordable. Drive space doesn't scare me (except
cost; I got a quote on a petabyte-class storage array: it was 1.4PB and cost
upwards of $3 million). CPU horsepower doesn't scare me, either, as I'm
getting a MAPstation as part of a different research project (now this box
has an interesting interconnect called SNAP that scales to 64 MAP DEL
processors and 32 host P4's on a crossbar type switch; you can do the
research on google too). The MAPstation runs on Linux, FWIW. For the
application (cross-correlation of interferometry data, 2 frequencies, 2
polarizations, and 2 antennas) a MAP processor will have the equivalent power
of an 800GHz P4, but be clocked at only 200MHz, thanks to the massively
parallel pipelining available with this kind of direct-execution-logic
(non-Von Neumann) processor. But all of that is irrelevant.
What is relevant is that I have seen the end user's response to having to
download multiple megabytes for a change of a hundred bytes or less. While it
doesn't bother me, it did bother my users (speaking of the PostgreSQL users I
built and released packages for).
So the end user could potentially reap the greatest benefit of an rpmdelta system.
SuSE is or has been doing rpmdeltas for a year now, and I seem to recall that
the results were pretty good.
Les wanted CVS-like functionality, where you can tag a repository as
consistent at a certain branch (not necessarily by date, as you mentioned),
and be able to consistently grab a set of packages.
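As a toy model of what that tagging could look like on the repository side (hypothetical names; this is neither yum nor CVS code): a tag is just a frozen map from package name to version, so asking for a tag always yields the same consistent set, no matter what has been uploaded since.

```python
# Toy model of CVS-style repository tagging (hypothetical design):
# a tag freezes a consistent package->version mapping that clients
# can fetch reproducibly, independent of later uploads.

class Repo:
    def __init__(self):
        self.latest = {}   # package name -> newest version string
        self.tags = {}     # tag name -> frozen snapshot of latest

    def upload(self, package, version):
        self.latest[package] = version

    def tag(self, name):
        # Snapshot the current state; later uploads do not affect it.
        self.tags[name] = dict(self.latest)

    def resolve(self, name, packages):
        snap = self.tags[name]
        return {p: snap[p] for p in packages}

repo = Repo()
repo.upload('postgresql', '8.0.3')
repo.upload('openssl', '0.9.7a')
repo.tag('stable')                      # mark this set as consistent
repo.upload('postgresql', '8.0.4')      # a newer build lands afterwards
print(repo.resolve('stable', ['postgresql', 'openssl']))
# the 'stable' tag still resolves postgresql to 8.0.3
```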
I mentioned CVS worked on a diff principle, and that that might be an
interesting way of doing it (all the while thinking about my PostgreSQL
users). Maybe I confused the two issues; that's possible.
The dumb-client, glorified-webserver type of system will be very difficult to make
work this way, this is true. But who says we have to stick to a glorified
wget? But the key question is, cost-benefit analysis-wise, is it worth the
effort (both development and execution)? Maybe it is, maybe it isn't. But I
do believe it is worth a try, if only to help the end user (which could be a
small server at a parish library, for instance.... :-)).
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC 28772