[CentOS] Yum / Up2date issues and mirror.centos.org

Tue Nov 29 21:39:47 UTC 2005
Les Mikesell <lesmikesell at gmail.com>

On Tue, 2005-11-29 at 15:11, Bryan J. Smith wrote:

> > There are places where you might want to hand-configure
> > IP addresses too, but DHCP is a lot handier.
> 
> So what's the difference between configuring your system to
> use DHCP and configuring your system to use a proxy?  I
> honestly don't get it.  @-o

DHCP is the default.  There is a difference between configuring
and not having to configure.  Talk a large number of people
through setting up a system over the phone and you'll get it.

> > How is that a solution?  Proxies are used where you don't
> > allow direct outbound access.  How do you do ftp without
> > configuring a proxy on every client?
> 
> The question is, why aren't you configuring software for a
> proxy in the first place?  You do it once ... done.

Once per install. And then you can't move the box.

> Why don't you just configure it at install-time, like
> everything else?  Again, I don't understand how this is
> different than anything else you configure at install-time.

The difference is that dhcp configures everything else 
you need.

> Furthermore, we're back to the "how do you change anything on
> all systems when you need to?"  Don't you have some sort of
> configuration management of all your Linux systems? 
> Something that can redistribute system changes to all
> systems?
> 
> This has nothing to do with YUM.

I want my system changes to be what the experts updating
whatever distribution happens to be on a box have put
in their yum repository, so yes it does have to do with
yum.

> > OK - ftp breaks when you NAT it too - sometimes.
> 
> I'm not talking about just FTP, I'm talking about HTTP too. 
> HTTP can and _does_ break because it's a stream protocol that
> carries a lot of rich service data over it.  Some of those
> rich service data streams don't take kindly to transparent
> proxies.

Yum doesn't need 'rich' services or it couldn't work over
ftp.

> > Yes, just mirror the whole internet locally - or at least
> > all yummable repositories...
> 
> Of the packages you use, yes.  Take some load off the CentOS
> mirrors if you have enough systems.

The point of using yum is that I don't need to know about the
packages ahead of time.  Running through a proxy cache
automatically takes the load off the repository instead of
adding to it by sucking copies of apps I don't ever install.

> > And all of the fedora repositories, and all the 3rd party
> > add on repositories, and the k12ltsp variations, and the
> > ubuntu/debian apt repositories.
> 
> Yes!  Once you have the first sync, it is not much to
> download a day.  In fact, if you're about conserving the
> bandwidth you use for updates, hell yes! 

How can it conserve bandwidth making copies of updates to
programs that aren't installed anywhere?

>  If your point is
> that you have all those repositories to sync from and that is
> a "burden," then my counter-point is "Exactly!  You're
> yanking from all those different repositories from _multiple_
> systems already -- so why not just do it from _one_?"  ;->

The machines I update are at several different locations so
it doesn't make a lot of sense to do them all from the same
local mirror.

> When you have a number of systems, there is _no_negative_ to
> this, other than having the disk space required!  APT and YUM
> repositories are "dumb" FTP/HTTP stores.  rsync down and
> serve.  Save your bandwidth and save your headaches.
> 
> > It doesn't make sense to cache things unless at least one
> > person uses it.
> 
> Now I'm really confused.  If you're not using a repository,
> then do _not_ mirror it. 

There's no difference between repositories and any other ftp/web
site in this respect.  I don't know ahead of time who is going
to want to update what distribution any more than I'd know what
other downloads could be mirrored ahead of time.  

>  I don't understand that point you
> just made.  Or are you adding yet more unrelated items just
> to make a point?

I want to only download things that are actually needed, and
only once per location.  Caching proxies have gotten that
right for ages.
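
The bandwidth argument can be made concrete with a rough sketch.
The package names and sizes below are made up for illustration and
are not measurements of any real repository. The sketch compares
three approaches for one location: every client fetching directly,
a full local mirror, and a caching proxy that fetches each requested
file once.

```python
def direct_bytes(repo, requests):
    """Every client request goes upstream, so repeats are paid for again."""
    return sum(repo[pkg] for pkg in requests)


def mirror_bytes(repo):
    """A full mirror syncs everything, whether or not anyone installs it."""
    return sum(repo.values())


def proxy_cache_bytes(repo, requests):
    """A caching proxy goes upstream only on the first request for a file."""
    cached = set()
    total = 0
    for pkg in requests:
        if pkg not in cached:
            cached.add(pkg)
            total += repo[pkg]
    return total


# Hypothetical repository: package name -> size in MB.
repo = {"kernel": 30, "httpd": 5, "openoffice": 200, "emacs": 20}

# Hypothetical update requests from several boxes at one location.
requests = ["kernel", "httpd", "kernel", "kernel", "httpd"]

print("direct:", direct_bytes(repo, requests))
print("mirror:", mirror_bytes(repo))
print("cache: ", proxy_cache_bytes(repo, requests))
```

With these invented numbers the cache moves the fewest bytes because
nobody asked for the large packages. A mirror only comes out ahead
when most of the repository is actually installed somewhere locally.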

> > The point of the internet is that you can get the latest
> > when you need it, and the point of a cache is that only one
> > person has to wait.
> 
> We're talking about software repositories.  If you are
> pulling multiple files from multiple systems, mirror it. 
> These aren't some arbitrary web sites, they are known
> repositories.

But they change all the time - and some are huge with only
one app being pulled from it.

> If you have enough systems, you should be doing this anyway
> -- out of sheer configuration management principles.  You
> don't want people grabbing arbitrary software on a mass
> number of systems, but only what you allow from your own
> repositories.

That would imply that I'd have to test everything first or
no one could take advantage of something I didn't permit. I
don't scale that well...

> > Yes, CentOS is as much a victim as the other distros on
> > this point.
> 
> I just don't know what you expect CentOS to solve.

Their unique problem is that they are cache-friendly by using
RRDNS to distribute load instead of a mirrorlist, and yum
is not too bright about retrying alternate addresses on errors.
I'm hoping they come up with a way around this that won't
screw up caching like fedora does.
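
As an illustration of the retry behaviour wished for here, the
following is a minimal sketch and not yum's actual code. The helper
names (resolve_all, fetch_with_failover) are hypothetical, and the
caller supplies the fetch function. The idea is to collect every
address that round-robin DNS returns for one hostname and try them
in order, instead of giving up after the first one fails.

```python
import socket


def resolve_all(host, port=80):
    """Return every distinct address the resolver reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        addr = sockaddr[0]
        if addr not in addresses:
            addresses.append(addr)
    return addresses


def fetch_with_failover(addresses, fetch_one):
    """Try each address in turn and return (address, result) for the
    first one that works. fetch_one is a caller-supplied function that
    raises OSError on failure."""
    errors = []
    for addr in addresses:
        try:
            return addr, fetch_one(addr)
        except OSError as exc:
            errors.append((addr, str(exc)))
    raise OSError("all addresses failed: %r" % (errors,))
```

A caller would pass something like
resolve_all("mirror.centos.org") as the address list, together with
a function that downloads the wanted file from one specific address.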

-- 
  Les Mikesell
   lesmikesell at gmail.com