On Sat, Nov 05, 2011 at 04:51:43PM -0500, Ralph Angenendt wrote:
Hi,
while I am still struggling with that host (well, small technicalities, nothing major), I'd like to start a discussion on how to convert our mirror table to the layout which is needed for mirrormanager.
You can find a description of the table at http://oerks.de/~ralph/mirrordb.txt - that probably is easier to look at than trying to get out a version via mail which doesn't break for everyone :)
Let me explain that table (and let me explain the fields we probably still need and which we don't need anymore, afaics).
Name: Is our primary key - so every mirror has to have a unique name in our DB
MM has a "Site" that corresponds to the sponsor (your URL to the sponsor's website). Each Site has a list of mirror Hosts. Each Site's name must be unique, and within each Site, the Host names must be unique.
location-major: That's the continent
locmajidx: numerical representation of the continent
location-minor: Country (or in the case of US and Canada: State)
MM lets GeoIP handle this for us, to the country level. I haven't yet added State-level - oftentimes it really doesn't match up with network topology enough to care.
http,ftp,rsync: The URLs the mirror is at
HostCategoryURL, two types: public (default), and private for other downstream mirrors to use. Not sure these are actually used.
speed: Used for the representation on the mirrorlist at www.centos.org. Mostly T1 anyway, not needed anymore, I guess.
Nope.
bandwidth: Actual bandwidth. Not needed.
MM does need this, an integer value in Mbps (100 = 100Mbps uplink). Host.bandwidth_int.
status: set by mirror-status (at least Dead, Disabled is for manual intervention)
Each Site and each Host has two flag bits: admin_active and user_active. admin_active lets the MM database admin kill a mirror off quickly; user_active lets the user do this for themselves, particularly in preparation for a long outage.
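To make the two flags concrete, here is a minimal sketch. It assumes a Host is handed out only when both flags are set on the Host and on its Site; that combination rule is an assumption, and `Flags` and `is_served` are hypothetical names, not MM code.

```python
from dataclasses import dataclass


@dataclass
class Flags:
    """The two switches described above, as they might sit on a Site or a Host."""
    admin_active: bool = True   # flipped by the MM database admin
    user_active: bool = True    # flipped by the mirror admin, e.g. before an outage


def is_served(site: Flags, host: Flags) -> bool:
    # Assumption: any single flag set to False takes the host out of rotation.
    return all((site.admin_active, site.user_active,
                host.admin_active, host.user_active))
```

With this rule, a mirror admin clearing user_active on one Host removes just that Host, while clearing it on the Site removes every Host under it.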
state: more detailed state
?
contact-name: Name of the person running the mirror. Internal use for us.
contact-tel: I cannot remember calling a mirroradmin :)
contact-email: Our second unique field. I guess that will be used for login
MM only knows about a user account name we list as the mirror admin. In the Fedora world, this is the FAS account name. In RPMFusion, I expect it's a local database built into TurboGears. Pretty thin on info though, could add these other fields if we need to.
comments: Free form, normally the request mail sent to the list. Nice to have, but not needed.
access*: Not used
Type: We only have direct mirrors.
restructured: That must have happened before 2006 :)
centostext: What to add to the mirror URLs (so mostly unused)
MM has content Categories (Fedora Linux, Fedora EPEL, and historical categories). Each Host has one or more Categories = HostCategory. Each HostCategory has one or more HostCategoryURLs. The Categories can be rooted at any arbitrary URL, but from the top of the Category on down, a mirror has to maintain the upstream master directory structure.
You'll want to think about how you structure your content into Categories. It works best if a Category is a distinct subtree, not overlapping with other Categories.
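The Site / Host / HostCategory / HostCategoryURL nesting can be sketched roughly as below. This is an illustration of the relationships described here, not MM's actual schema: the class layout is an assumption, `org_url` stands in for Site.orgUrl, and the example names and URLs are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HostCategoryURL:
    url: str
    private: bool = False   # private URLs are meant for other downstream mirrors


@dataclass
class HostCategory:
    category: str           # e.g. "CentOS"
    urls: List[HostCategoryURL] = field(default_factory=list)


@dataclass
class Host:
    name: str
    bandwidth_int: int      # uplink in Mbps, e.g. 100 = 100Mbps
    country: str
    categories: List[HostCategory] = field(default_factory=list)


@dataclass
class Site:
    name: str
    org_url: str            # stand-in for Site.orgUrl
    hosts: List[Host] = field(default_factory=list)

    def add_host(self, host: Host) -> None:
        # Host names must be unique within a Site.
        if any(h.name == host.name for h in self.hosts):
            raise ValueError(f"duplicate host name in site {self.name!r}: {host.name!r}")
        self.hosts.append(host)
```

A row of the current table would then become one Site (sponsor name and url) holding one Host, whose single "CentOS" HostCategory carries the http, ftp and rsync URLs.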
url: The URL to the sponsor's website
Site.orgUrl
info_note, notes_private, infoblock, graphic_url: Not used.
centos*: Which versions does the mirror carry?
arch_all: Yes, if not, then:
arches: Free form - only used for the mirror list on www
This is detected dynamically by MM, and exposed in the publiclist chooser.
dvd-iso: Does it carry them (always yes since 6)
dvd*: The versions (6 is set to yes always)
dvd-iso-host, rsync-dvd-host: No idea. Not used
Again, dynamically detected.
cc: The TLD the mirror is in. Actually used for generating mirrorlists.txt for that country
Host.country
continent: Used for the mirrorlist on www
Not used. MM uses GeoIP, and augments its mapping of countries to continents with a CountryContinentRedirect. Little used, but it maps say Israel to Europe instead of Asia, because it has better network connectivity to Europe.
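The redirect idea amounts to an override table consulted before the default country-to-continent mapping. A rough sketch, with a tiny hand-written dictionary standing in for the GeoIP data (the dictionary contents and function name are illustrative assumptions, not MM's tables):

```python
from typing import Optional

# Illustrative subset only; the real default mapping comes from GeoIP data.
COUNTRY_TO_CONTINENT = {"IL": "AS", "DE": "EU", "US": "NA"}

# Overrides in the spirit of CountryContinentRedirect: Israel is treated as
# Europe because its network connectivity to Europe is better.
COUNTRY_CONTINENT_REDIRECT = {"IL": "EU"}


def continent_for(country: str) -> Optional[str]:
    """Return the continent to use for mirror selection, or None if unknown."""
    code = country.upper()
    if code in COUNTRY_CONTINENT_REDIRECT:
        return COUNTRY_CONTINENT_REDIRECT[code]
    return COUNTRY_TO_CONTINENT.get(code)
```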
centos_code, priority: Not used
use-in-mirror-list: Used: We don't really put 10Mbit-machines in EU or US or CA into the mirrorlist.txt which is handed out via yum
Ah. We do, but they get listed at the top of mirrorlist.txt 1/100 as often as a 1Gb mirror would. That's the weighted random sample based on bandwidth.
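One way to get that behaviour is a weighted random ordering where each mirror's weight is its bandwidth. The sketch below uses the Efraimidis-Spirakis key trick (sort by u ** (1/weight) for uniform random u); this is a swapped-in technique for illustration and not necessarily how MM implements it. The function name and mirror names are hypothetical.

```python
import random
from typing import Dict, List, Optional


def weighted_shuffle(bandwidth_mbps: Dict[str, int],
                     rng: Optional[random.Random] = None) -> List[str]:
    """Return mirror names in weighted random order.

    The chance that a mirror comes first is its bandwidth divided by the
    total bandwidth, so a 10 Mbps host tops the list about 1/100 as often
    as a 1000 Mbps host. Bandwidths must be positive.
    """
    rng = rng or random.Random()
    keyed = [(rng.random() ** (1.0 / bw), name)
             for name, bw in bandwidth_mbps.items()]
    keyed.sort(reverse=True)          # largest key first
    return [name for _, name in keyed]
```

Calling `weighted_shuffle({"slow.example.org": 10, "fast.example.org": 1000})` repeatedly will put the fast host first in the large majority of responses, while the slow one still shows up at the top occasionally.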
I guess we can drop many of those when going over to mirrormanager. But: What I don't see on the Fedora pages is a list of all the mirrors (by country/continent/whatever) - I know that this is one thing we actually do need and want.
I don't have the breakdown by continent in /publiclist. Wouldn't be hard, but I hate mucking with that page - it took some major CSS hacking to get it as readable as it is. :-)
Anything I actually overlooked?
Do you have private mirrors in your database now? That maps to Site.private and Host.private.
Host.internet2 if a host is on Internet2 or a related high-speed educational/research network. We can look that up in MM's private copy of the Internet2 route tables if needed - that's how I populated the field the first time for Fedora too.
Host.internet2_clients if a host on I2, even if private, should be listed for other I2 clients in the same country. By default set it false, let mirror admins update it themselves.
Host.asn = AS Number
Host.asn_clients if a host should serve the whole ASN regardless of netblocks set. Lets mirrors in places with many netblocks, but a single ASN, get away with a single value here. Again, we can look this up in our private copy of the worldwide routing table.
Host.countries_allowed = list of countries allowed. e.g. a mirror in .il may want to only serve users in .il.
Host.netblocks is a list of netblocks that Host should be primary mirror for. This is required for private mirrors.
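As a rough illustration of how netblocks and countries_allowed could be checked against a client, here is a sketch using Python's standard `ipaddress` module. The function names are hypothetical, the addresses are documentation ranges, and treating an empty countries_allowed list as "no restriction" is an assumption.

```python
import ipaddress
from typing import Iterable, List


def in_netblocks(client_ip: str, netblocks: Iterable[str]) -> bool:
    """True if the client address falls inside any of the host's netblocks."""
    addr = ipaddress.ip_address(client_ip)
    for block in netblocks:
        net = ipaddress.ip_network(block, strict=False)
        # Compare only like with like (IPv4 vs IPv6).
        if addr.version == net.version and addr in net:
            return True
    return False


def may_serve(client_country: str, countries_allowed: List[str]) -> bool:
    """True if the host is willing to serve clients from this country."""
    # Assumption: an empty list means the host serves everyone.
    if not countries_allowed:
        return True
    return client_country.upper() in {c.upper() for c in countries_allowed}
```

For a private mirror, a client would only be pointed at the host when `in_netblocks` is true, which is why the netblock list is required there.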
Host.acl_ips = list of IP addresses or hostnames that will get put into the /rsync_acl list. Other mirrors may wget /rsync_acl to get that list, and use it in their own rsyncd.conf files. Only real problem with it is anyone could sign up to be a private mirror, fill this in, and then get early access to a pre-bitflip mirror via the acl. Oh well...
I think that'll be enough to get going though.
Be thinking about categories. At a glance, I think a single Category "CentOS" would be fine. You could in theory do two Categories "CentOS" and "CentOS ISOs" and rig up update-master-directory-list to ignore /isos in your "CentOS" Category, and ignore everything but /isos in the "CentOS ISOs" Category, but I don't think that will buy you much, and it buys exactly nothing with C6 and newer.
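For what the two-Category split would mean in practice, here is a small sketch that assigns a path to one Category or the other depending on whether it lies under an isos directory. It is only an illustration of the partitioning rule; it is not update-master-directory-list configuration, and the function name and example paths are made up.

```python
from pathlib import PurePosixPath


def category_for(path: str) -> str:
    """Assign a path (relative to the top of the tree) to exactly one Category."""
    parts = PurePosixPath(path).parts
    # Anything beneath an "isos" directory goes to the ISO Category,
    # everything else stays in the main one, so the two never overlap.
    return "CentOS ISOs" if "isos" in parts else "CentOS"
```

For example, `category_for("5.7/isos/x86_64/disc1.iso")` lands in "CentOS ISOs", while `category_for("5.7/os/x86_64/repodata/repomd.xml")` stays in "CentOS".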