[CentOS-mirror] mirrormanager: Database conversion

Mon Nov 7 22:05:59 UTC 2011
Ralph Angenendt <ralph.angenendt at gmail.com>

On 07.11.2011 06:40, Matt Domsch wrote:
> On Sat, Nov 05, 2011 at 04:51:43PM -0500, Ralph Angenendt wrote:

>> Name: Is our primary key - so every mirror has to have a unique name in
>> our DB
> 
> MM has a "Site" that matches your URL to the Sponsor's website.  Each
> Site has a list of mirror Hosts.  Each Site's name must be unique, and
> within each Site, the Host names must be unique.

Okay, the site URL isn't unique per se, but unique per sponsor (if he
has many hosts).

>> location-minor: Country (or in the case of US and Canada: State)
> 
> MM lets GeoIP handle this for us, to the country level. I haven't yet
> added State-level - oftentimes it really doesn't match up with network
> topology enough to care.

That is only for representation on the website anyway. What's used for
generating the lists is in cc: (Host.country).

>> http,ftp,rsync: The URLs the mirror is at
> 
> HostCategoryURL, two types: public (default), and private for other
> downstream mirrors to use.  Not sure these are actually used.

Hmm? Does not compute: Those are the actual URLs of the mirror content.

>> bandwidth: Actual bandwidth. Not needed.
> 
> MM does need this, an integer value in Mbps (100 = 100Mbps uplink).  Host.bandwidth_int.

Okay. As this is free form for us, this needs normalizing, then.
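
A minimal sketch of what that normalizing could look like, assuming the free-form values are mostly strings such as "100Mbit" or "1 Gbps". The function name, the unit table, and the rule that a bare number means Mbps are all assumptions made for illustration; the only thing taken from the thread is the target, an integer in Mbps for Host.bandwidth_int.

```python
import re

# Hypothetical unit table: multiplier to convert a value to Mbps.
_UNIT_TO_MBPS = {"k": 0.001, "m": 1, "g": 1000}


def normalize_bandwidth(raw):
    """Return the bandwidth as an integer in Mbps, or None if no number is found."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*([kmg])?", raw.strip().lower())
    if not match:
        return None
    value = float(match.group(1))
    unit = match.group(2) or "m"  # assumption: a bare number means Mbps
    # Clamp to at least 1 so that sub-Mbps links do not end up as 0.
    return max(1, round(value * _UNIT_TO_MBPS[unit]))
```

Entries like "2x1Gbit" would be misread by this (only the leading "2" is seen), so anything unusual probably still needs a manual pass.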


>> state: more detailed state
> 
> ?

The reason why it was disabled (lagging, non-responsive and so on).
Nobody really uses that.

>> contact-name: Name of the person running the mirror. Internal use for us.
>> contact-tel: I cannot remember calling a mirroradmin :)
>> contact-email: Our second unique field. I guess that will be used for login
> 
> MM only knows about a user account name we list as the mirror admin.

Hmmm. contact-email in that case.

> In the Fedora world, this is the FAS account name.  In RPMFusion, I
> expect it's a local database built into TurboGears.  Pretty thin on
> info though, could add these other fields if we need to.

I don't know if they are needed. But we need a user db, we have no
general account system in our infrastructure.

> MM has content Categories (Fedora Linux, Fedora EPEL, and historical
> categories).  Each Host has one or more Categories = HostCategory.
> Each HostCategory has one or more HostCategoryURLs.  The Categories
> can be rooted at any arbitrary URL, but from the top of the Category
> on down, a mirror has to maintain the upstream master directory
> structure.

Ummm. I need to digest that :) (but we require the mirrors to have the
same structure, otherwise they won't show up in the mirrorlist yum uses).
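
For reference, the hierarchy described above (Site, Host, HostCategory, HostCategoryURL) can be pictured roughly as follows. This is only a sketch with plain dataclasses, not MM's actual schema: the class layout and the add_host helper are assumptions, while the names Host.country, Host.bandwidth_int and the public/private URL distinction come from this thread.

```python
from dataclasses import dataclass, field


@dataclass
class HostCategoryURL:
    url: str
    private: bool = False  # public by default; private is for downstream mirrors


@dataclass
class HostCategory:
    category: str  # e.g. "CentOS"
    urls: list = field(default_factory=list)  # one or more HostCategoryURLs


@dataclass
class Host:
    name: str
    country: str
    bandwidth_int: int  # Mbps
    categories: list = field(default_factory=list)  # one or more HostCategory


@dataclass
class Site:
    name: str
    hosts: list = field(default_factory=list)

    def add_host(self, host):
        # Host names must be unique within a Site.
        if any(h.name == host.name for h in self.hosts):
            raise ValueError(f"duplicate host name in site: {host.name}")
        self.hosts.append(host)
```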

> You'll want to think about how you structure your content into
> Categories.  It works best if a Category is a distinct subtree, not
> overlapping with other Categories.

I guess something like Releases would be best here? 4, 5, 6, 7 ... We
don't have anything else.


> This is detected dynamically by MM, and exposed in the publiclist
> chooser.

Great, because entering and checking that can be a PITA :)

>> dvd-iso: Does it carry them (always yes since 6)
>> dvd*: The versions (6 is set to yes always)
>> dvd-iso-host,rsync-dvd-host: No idea. Not used
> 
> Again, dynamically detected.

When switching to mirrormanager everyone will get the dvds anyway. We'll
drop the double tree then.

> Not used.  MM uses GeoIP, and augments its mapping of countries to
> continents with a CountryContinentRedirect.  Little used, but it maps
> say Israel to Europe instead of Asia, because it has better network
> connectivity to Europe.

Okay. We had rather longish discussions on this list about those mappings :)
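
The redirect idea is simple enough to sketch: an override table is consulted first, and only then the default country-to-continent mapping. The tables below are tiny illustrative samples and the function name is made up; only the Israel-to-Europe example is taken from the description above.

```python
# Illustrative sample of a default country -> continent mapping (as GeoIP would give).
GEOIP_CONTINENT = {"IL": "AS", "DE": "EU", "US": "NA"}

# Overrides in the spirit of CountryContinentRedirect.
COUNTRY_CONTINENT_REDIRECT = {"IL": "EU"}


def continent_for(country):
    """Return the continent used for mirror selection, or None if the country is unknown."""
    country = country.upper()
    if country in COUNTRY_CONTINENT_REDIRECT:
        return COUNTRY_CONTINENT_REDIRECT[country]
    return GEOIP_CONTINENT.get(country)
```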


>> centos_code,priority: Not used
>> use-in-mirror-list: Used: We don't really put 10Mbit-machines in EU or
>> US or CA into the mirrorlist.txt which is handed out via yum
> 
> Ah.  We do, but they get listed at the top of mirrorlist.txt 1/100 as
> often as a 1Gb mirror would.  That's the weighted random sample based
> on bandwidth.

Yeah, that is fine.
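
To illustrate the weighting: a bandwidth-weighted random ordering can be sketched as below. This is not MM's code, just one way to get the described effect, and the function name and the (url, bandwidth) tuple layout are assumptions. With weights of 10 and 1000, the slow mirror ends up first roughly once for every hundred times the fast one does.

```python
import random


def weighted_order(mirrors, rng=random):
    """Order mirror URLs randomly, weighted by bandwidth.

    mirrors: list of (url, bandwidth_mbps) tuples with positive bandwidths.
    rng: the random module or a random.Random instance.
    """
    pool = list(mirrors)
    ordered = []
    while pool:
        weights = [bandwidth for _, bandwidth in pool]
        # Draw one index with probability proportional to bandwidth,
        # then remove that mirror so it cannot be picked again.
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        ordered.append(pool.pop(pick)[0])
    return ordered
```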

> I don't have the breakdown by continent in /publiclist.  Wouldn't be
> hard, but I hate mucking with that page - it took some major CSS
> hacking to get it as readable as it is. :-)

What Adrian pointed me to looks okay, unless one of the mirror sponsors
objects (anyone still with us here?).

>> Anything I actually overlooked?
> 
> Do you have private mirrors in your database now?  That maps to
> Site.private and Host.private.

No. And I am not sure if we want to allow them - but that is open to
discussion.

> Host.internet2  if a host is on Internet2 or related high-speed
> educational/research network. We can look that up in MM's private copy
> of the Internet2 route tables if needed - that's how I populated the
> field the first time for Fedora too.

Okay. I think I like that idea.

> Host.internet2_clients  if a host on I2, even if private, should be
> listed for other I2 clients in the same country.  By default set it
> false, let mirror admins update it themselves.

Yeah, that's fine, too.

> Host.asn = AS Number
> Host.asn_clients if a host should serve the whole ASN regardless of
> netblocks set.  Lets mirrors in places with many netblocks, but a
> single ASN, get away with a single value here.  Again, we can look
> this up in our private copy of the worldwide routing table.

Wonderful.
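
As a rough picture of how the netblock and ASN checks fit together, assuming a host is represented as a plain dict with "netblocks", "asn" and "asn_clients" keys (that layout and the function name are invented for this sketch, and the client's ASN is assumed to have been looked up already):

```python
import ipaddress


def host_serves_client(host, client_ip, client_asn):
    """Return True if the host should be preferred for this client."""
    ip = ipaddress.ip_address(client_ip)
    # First check the explicitly listed netblocks.
    for netblock in host.get("netblocks", []):
        if ip in ipaddress.ip_network(netblock):
            return True
    # Otherwise fall back to the ASN, but only if the host opted in.
    if host.get("asn_clients") and host.get("asn") is not None:
        return host["asn"] == client_asn
    return False
```

A mirror with many netblocks but a single ASN would then only need "asn" and "asn_clients" set, with the netblock list left empty.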

> Host.countries_allowed = list of countries allowed.  e.g. a mirror in
> .il may want to only serve users in .il.

Hmmm. Okay, I can understand that in countries with few mirrors.
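
The check itself would be tiny. A sketch, assuming the host is a dict with a "countries_allowed" list and that an empty list means "no restriction" (both are assumptions, not something stated above):

```python
def allowed_for(host, client_country):
    """Return True if the host may be handed out to a client in this country."""
    allowed = host.get("countries_allowed") or []
    if not allowed:
        return True  # assumption: no list means the host serves everyone
    return client_country.upper() in {country.upper() for country in allowed}
```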


> Host.acl_ips = list of IP addresses or hostnames that will get put
> into the /rsync_acl list.  Other mirrors may wget /rsync_acl to get
> that list, and use it in their own rsyncd.conf files.  Only real
> problem with it is anyone could sign up to be a private mirror, fill
> this in, and then get early access to a pre-bitflip mirror via the
> acl.  Oh well...

:)
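
On the consuming side, turning a fetched /rsync_acl list into an rsyncd.conf line could look like the sketch below. The format of the list is an assumption here (one IP address or hostname per line, "#" starting a comment), as is the function name; "hosts allow" is the regular rsyncd.conf parameter.

```python
def hosts_allow_line(acl_text):
    """Build a 'hosts allow' line for rsyncd.conf from the text of an ACL list."""
    entries = []
    for line in acl_text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if entry and entry not in entries:     # skip blanks and duplicates
            entries.append(entry)
    return "hosts allow = " + " ".join(entries)
```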

> I think that'll be enough to get going though.
> 
> Be thinking about categories.  At a glance, I think a single Category
> "CentOS" would be fine.  You could in theory do two Categories
> "CentOS" and "CentOS ISOs" and rig up update-master-directory-list to
> ignore /isos in your "CentOS" Category, and ignore everything but
> /isos in the "CentOS ISOs" Category.  But I don't think that will
> buy you much, and it buys exactly nothing with C6 and newer.

No, it probably won't. I was thinking of Releases, maybe; they don't
fluctuate as fast as in Fedoraland.

I am actually populating the machine now with a tree and will do some
toying around with mirrormanager during this week, to see what I am up
against.

I might drop some questions on IRC :)

Cheers and thanks,

Ralph