[CentOS-mirror] mirrormanager: Database conversion
Matt Domsch
Matt_Domsch at dell.com
Mon Nov 7 00:40:49 EST 2011
On Sat, Nov 05, 2011 at 04:51:43PM -0500, Ralph Angenendt wrote:
> Hi,
>
> while I am still struggling with that host (well, small technicalities,
> nothing major), I'd like to start a discussion on how to convert our
> mirror table to the layout which is needed for mirrormanager.
>
> You can find a description of the table at
> http://oerks.de/~ralph/mirrordb.txt - that probably is easier to look at
> than trying to get out a version via mail which doesn't break for
> everyone :)
>
> Let me explain that table (and let me explain the fields we probably
> still need and which we don't need anymore, afaics).
>
> Name: Is our primary key - so every mirror has to have a unique name in
> our DB
MM has a "Site" that matches your URL to the Sponsor's website. Each
Site has a list of mirror Hosts. Each Site's name must be unique, and
within each Site, the Host names must be unique.
> location-major: That's the continent
> locmajidx: numerical representation of the continent
> location-minor: Country (or in the case of US and Canada: State)
MM lets GeoIP handle this for us, to the country level. I haven't yet
added State-level - oftentimes it really doesn't match up with network
topology enough to care.
> http,ftp,rsync: The URLs the mirror is at
HostCategoryURL, two types: public (default), and private for other
downstream mirrors to use. Not sure these are actually used.
> speed: Used for the representation on the mirrorlist at www.centos.org.
> Mostly T1 anyway, not needed anymore, I guess.
Nope.
> bandwidth: Actual bandwidth. Not needed.
MM does need this: an integer value in Mbps (100 = 100Mbps uplink),
stored as Host.bandwidth_int.
> status: set by mirror-status (at least Dead, Disabled is for manual
> intervention)
Each Site and each Host has two flag bits: admin_active and
user_active. admin_active lets the MM database admin kill a mirror off
quickly; user_active lets the user do this for themselves, particularly
in preparation for a long outage.
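A minimal sketch of how these flags might combine, assuming a Host is
only handed out when all four bits (two on the Site, two on the Host)
are set. The class and function names here are hypothetical, not MM's
actual code:

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    admin_active: bool = True  # cleared by the MM database admin
    user_active: bool = True   # cleared by the mirror admin


@dataclass
class Host:
    name: str
    site: Site
    admin_active: bool = True
    user_active: bool = True


def is_served(host: Host) -> bool:
    """Assumed rule: any cleared flag on the Host or its Site hides the Host."""
    return (host.site.admin_active and host.site.user_active
            and host.admin_active and host.user_active)
```

Under that assumption, a mirror admin clearing user_active before a
planned outage has the same effect as the MM admin clearing
admin_active.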
> state: more detailed state
?
> contact-name: Name of the person running the mirror. Internal use for us.
> contact-tel: I cannot remember calling a mirroradmin :)
> contact-email: Our second unique field. I guess that will be used for login
MM only knows about a user account name we list as the mirror admin.
In the Fedora world, this is the FAS account name. In RPMFusion, I
expect it's a local database built into TurboGears. Pretty thin on
info though, could add these other fields if we need to.
> comments: Free form, normally the request mail sent to the list. Nice to
> have, but not needed.
> access*: Not used
> Type: We only have direct mirrors.
> restructured: That must have happened before 2006 :)
> centostext: What to add to the mirror URLs (so mostly unused)
MM has content Categories (Fedora Linux, Fedora EPEL, and historical
categories). Each Host has one or more Categories = HostCategory.
Each HostCategory has one or more HostCategoryURLs. The Categories
can be rooted at any arbitrary URL, but from the top of the Category
on down, a mirror has to maintain the upstream master directory
structure.
You'll want to think about how you structure your content into
Categories. It works best if a Category is a distinct subtree, not
overlapping with other Categories.
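To illustrate the "distinct subtree" point, here is a small sketch
that flags Categories whose top directories overlap. The function name
and the example paths are illustrative assumptions, not part of MM:

```python
from pathlib import PurePosixPath


def overlapping_categories(roots):
    """Return pairs of category names whose top directories overlap.

    `roots` maps a category name to its top-level directory on the master.
    """
    items = [(name, PurePosixPath(path)) for name, path in roots.items()]
    clashes = []
    for i, (name_a, path_a) in enumerate(items):
        for name_b, path_b in items[i + 1:]:
            # Overlap means one root equals, or is an ancestor of, the other.
            if (path_a == path_b
                    or path_a in path_b.parents
                    or path_b in path_a.parents):
                clashes.append((name_a, name_b))
    return clashes
```

An empty result means every Category is its own subtree; a pair such
as ("CentOS", "CentOS ISOs") would show up if one were rooted inside
the other.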
> url: The URL to the sponsor's website
Site.orgUrl
> info_note, notes_private,infoblock,graphic_url: Not used.
> centos*: Which versions does the mirror carry?
> arch_all: Yes, if not, then:
> arches: Free form - only used for the mirror list on www
This is detected dynamically by MM, and exposed in the publiclist
chooser.
> dvd-iso: Does it carry them (always yes since 6)
> dvd*: The versions (6 is set to yes always)
> dvd-iso-host,rsync-dvd-host: No idea. Not used
Again, dynamically detected.
> cc: The TLD the mirror is in. Actually used for generating
> mirrorlists.txt for that country
Host.country
> continent: Used for the mirrorlist on www
Not used. MM uses GeoIP, and augments its mapping of countries to
continents with a CountryContinentRedirect. Little used, but it maps
say Israel to Europe instead of Asia, because it has better network
connectivity to Europe.
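The redirect idea can be sketched as a lookup with an override table.
Both dictionaries below are made-up sample data, and the function name
is hypothetical; the real base mapping would come from GeoIP:

```python
# Hypothetical sample data standing in for the GeoIP country/continent map.
COUNTRY_TO_CONTINENT = {"IL": "AS", "DE": "EU", "US": "NA", "JP": "AS"}

# Overrides in the spirit of CountryContinentRedirect.
COUNTRY_CONTINENT_REDIRECT = {"IL": "EU"}


def effective_continent(country_code):
    """Return the continent to use for a country, honouring any redirect."""
    cc = country_code.upper()
    if cc in COUNTRY_CONTINENT_REDIRECT:
        return COUNTRY_CONTINENT_REDIRECT[cc]
    # Unknown countries yield None rather than a guess.
    return COUNTRY_TO_CONTINENT.get(cc)
```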
> centos_code,priority: Not used
> use-in-mirror-list: Used: We don't really put 10Mbit-machines in EU or
> US or CA into the mirrorlist.txt which is handed out via yum
Ah. We do, but they get listed at the top of mirrorlist.txt 1/100 as
often as a 1Gb mirror would. That's the weighted random sample based
on bandwidth.
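A rough sketch of a bandwidth-weighted ordering follows. This is one
common way to do a weighted random sample without replacement, not
necessarily the exact routine MM uses; the function name and the
(name, bandwidth) tuple layout are assumptions:

```python
import random


def weighted_shuffle(mirrors, rng=random):
    """Order (name, bandwidth_mbps) pairs by repeated weighted draws.

    Each position is filled by picking one remaining mirror with
    probability proportional to its bandwidth. Bandwidths are assumed
    to be positive.
    """
    pool = list(mirrors)
    ordered = []
    while pool:
        weights = [bandwidth for _, bandwidth in pool]
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        ordered.append(pool.pop(pick))
    return ordered
```

With one 10Mbps and one 1000Mbps mirror in the pool, the slow mirror
lands in first place about 10 times in every 1010 draws, i.e. 1/100 as
often as the fast one.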
> I guess we can drop many of those when going over to mirrormanager. But:
> What I don't see on the Fedora pages is a list of all the mirrors (by
> country/continent/whatever) - I know that this is one thing we actually
> do need and want.
I don't have the breakdown by continent in /publiclist. Wouldn't be
hard, but I hate mucking with that page - it took some major CSS
hacking to get it as readable as it is. :-)
> Anything I actually overlooked?
Do you have private mirrors in your database now? That maps to
Site.private and Host.private.
Host.internet2 if a host is on Internet2 or a related high-speed
educational/research network. We can look that up in MM's private copy
of the Internet2 route tables if needed - that's how I populated the
field the first time for Fedora too.
Host.internet2_clients if a host on I2, even if private, should be
listed for other I2 clients in the same country. By default set it
false, let mirror admins update it themselves.
Host.asn = AS Number
Host.asn_clients if a host should serve the whole ASN regardless of
netblocks set. Lets mirrors in places with many netblocks, but a
single ASN, get away with a single value here. Again, we can look
this up in our private copy of the worldwide routing table.
Host.countries_allowed = list of countries allowed. e.g. a mirror in
.il may want to only serve users in .il.
Host.netblocks is a list of netblocks that Host should be primary
mirror for. This is required for private mirrors.
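A sketch of how countries_allowed and netblocks could be evaluated for
a client, using Python's standard ipaddress module. The function
names, and the rule that an empty countries_allowed list means "no
restriction", are assumptions for illustration:

```python
import ipaddress


def host_may_serve(client_country, countries_allowed):
    """True if the client's country passes the host's allow list.

    An empty list is assumed to mean the host serves everyone.
    """
    if not countries_allowed:
        return True
    return client_country.upper() in {c.upper() for c in countries_allowed}


def is_primary_for(client_ip, netblocks):
    """True if the client address falls inside any of the host's netblocks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(block, strict=False)
               for block in netblocks)
```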
Host.acl_ips = list of IP addresses or hostnames that will get put
into the /rsync_acl list. Other mirrors may wget /rsync_acl to get
that list, and use it in their own rsyncd.conf files. Only real
problem with it is anyone could sign up to be a private mirror, fill
this in, and then get early access to a pre-bitflip mirror via the
acl. Oh well...
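On the consuming side, a downstream mirror could turn the fetched list
into an rsyncd.conf "hosts allow" line. The sketch below assumes the
/rsync_acl body is plain text with one address or hostname per line,
which is a guess about the format; the function name is hypothetical:

```python
def hosts_allow_line(acl_text):
    """Build an rsyncd.conf 'hosts allow' line from an ACL listing.

    Assumes one IP address or hostname per line; blank lines, comment
    lines and duplicates are skipped.
    """
    entries = []
    for line in acl_text.splitlines():
        entry = line.strip()
        if entry and not entry.startswith("#") and entry not in entries:
            entries.append(entry)
    return "hosts allow = " + " ".join(entries)
```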
I think that'll be enough to get going though.
Be thinking about categories. At a glance, I think a single Category
"CentOS" would be fine. You could in theory do two Categories
"CentOS" and "CentOS ISOs" and rig up update-master-directory-list to
ignore /isos in your "CentOS" Category, and ignore everything but
/isos in the "CentOS ISOs" Category, but I don't think that will
buy you much, and it buys exactly nothing with C6 and newer.
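For what the two-Category split would mean in practice, here is a tiny
sketch that sorts a master-tree path into one Category or the other
based on whether it sits under an "isos" directory. This only
illustrates the idea; it is not how update-master-directory-list is
configured, and the function name and sample paths are hypothetical:

```python
from pathlib import PurePosixPath


def category_for(path):
    """Assign a master-tree path to one of the two hypothetical Categories."""
    parts = PurePosixPath(path).parts
    # Anything below a directory named "isos" goes to the ISO Category.
    return "CentOS ISOs" if "isos" in parts else "CentOS"
```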
--
Matt Domsch
Technology Strategist
Dell | Office of the CTO