[CentOS-mirror] Mirrormanager (now for real)

Tue Oct 11 21:38:44 UTC 2011
Ralph Angenendt <ralph.angenendt at gmail.com>

On 10/10/2011 06:48 AM, Matt Domsch wrote:
> On Fri, Oct 07, 2011 at 03:55:30PM -0500, Ralph Angenendt wrote:
>> As several people have offered to help moving our selfbotched system to
>> mirrormanager (sorry Peter, but there were some offers - I am still not
>> sure which one really is better technically, but the more helping hands,
>> the better), I'd like to start this now.
> Excellent.  I'm happy to help in any way I can.

Thank you (and thank you Adrian), I'm just going to reply to this post.

>> Can anyone running a mirrormanager instance tell me, what kind of specs
>> that machine needs to have?
> There are three primary applications: 1) the web pages for mirror admins to
> login and change their mirror data, which on Fedora is using ~160MB RAM;
> 2) various cronjobs (~50MB); 3) the mirrorlist request handler
> (125MB).  1 and 3 can run on one, or many, machines; 2, by the nature
> of the jobs, only runs on a single system.
> MM maintains a local cache in /var/lib/mirrormanager.  Fedora's copy
> thereof is 28MB.

Okay. And as we need to have the database running on the same host (our
machines are spread around the world rather than sitting together in one
nice and cozy data center), the machine I just "found", which has 512MB
RAM, seems to be too small to do that.

So first thing: Look for another machine :)
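
For the record, the arithmetic behind that, as a rough sketch: the three
figures are Matt's numbers for Fedora quoted above, while the 256MB for
the database is purely an assumed placeholder, not something measured on
any of our machines. The names components_mb and headroom_mb are made up
for illustration.

```python
# Resident memory figures quoted above for Fedora's MirrorManager, in MB.
components_mb = {
    "admin web pages": 160,
    "cronjobs": 50,
    "mirrorlist request handler": 125,
}


def headroom_mb(total_ram_mb, components, database_mb):
    """Return the RAM left after MirrorManager and the database.

    A negative result means the machine is short by that many MB.
    """
    return total_ram_mb - sum(components.values()) - database_mb


mm_total = sum(components_mb.values())

# ASSUMPTION: 256 MB for the database server is a placeholder value,
# not a measurement; the OS itself is not accounted for at all.
print("MirrorManager alone:", mm_total, "MB")
print("Headroom on a 512MB box:", headroom_mb(512, components_mb, 256), "MB")
```

Even before the database is counted, the three applications take up
roughly two thirds of the box.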

>> Does it need to hold a copy of the mirror?
> Machine 2 above needs a copy of the mirror, yes, either local or
> NFS-mounted.  Some aspects, like update-master-directory-list and the
> crawler, can simply grab an rsync listing from another system, rather
> than have the full mirror be mounted; however, the metalink generator
> does need to be able to read files, so having a full mirror nearby
> would be good.  Fedora has the directory tree on an NFS-mountable
> volume, which the MM cronjob server mounts read-only.

Okay. Finding a machine which can hold the complete tree is trivial, all
of ours can do that.

>> On the software side, I guess httpd and mysql-server (and mirrormanager)
>> - anything else? Or is the sqlite variant fast enough for the amount of
>> mirrors we have?
> Because there are several actors updating the database simultaneously
> (particularly the update-master-directory-list cronjob and the
> crawlers), it's preferred to use postgres or mysql rather than
> sqlite.  I haven't tried running with sqlite in production.  MM uses
> TurboGears, which uses SQLObject currently (soonish SQLAlchemy) so you
> are free to choose whichever database backend you want.

Hmmm, good. If MySQL is an option - we run other instances of that - then
there's no need to run a postgres instance. MirrorManager doesn't
take advantage of the inetnum (or whatever it is called) data type in
postgres, which is able to store IP addresses and CIDR data?
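
The PostgreSQL types in question are inet and cidr. Should MirrorManager
not rely on them, the same kind of check can be done outside the
database. A minimal sketch, assuming netblocks are stored either as plain
CIDR strings or as integer ranges; the helper names netblock_to_range and
ip_in_netblock are hypothetical, and this is not a claim about how
MirrorManager actually does it. It uses Python's standard ipaddress
module:

```python
import ipaddress


def netblock_to_range(cidr):
    """Convert a CIDR string into (first, last) integer addresses.

    The two integers fit into ordinary integer columns, so no
    database-specific network type is needed (assumed storage scheme).
    """
    net = ipaddress.ip_network(cidr, strict=False)
    return int(net.network_address), int(net.broadcast_address)


def ip_in_netblock(ip, cidr):
    """Return True if the client address falls inside the netblock."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr, strict=False)


# 192.0.2.0/24 and 198.51.100.0/24 are documentation-only example ranges.
print(netblock_to_range("192.0.2.0/24"))
print(ip_in_netblock("192.0.2.17", "192.0.2.0/24"))
print(ip_in_netblock("198.51.100.1", "192.0.2.0/24"))
```

With integer ranges, a lookup becomes a simple "first <= ip AND ip <=
last" comparison that any of the database backends can do.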

>> At the moment we run quite a few instances of mirrorlist.centos.org -
>> the machines which hand out the urllist to machines - is that possible
>> with mirrormanager, too? Or will one machine be able to handle the load?
> As Adrian noted, this is not only possible, but recommended.  Fedora
> has ~8 servers handing out the mirrorlist by request, and they don't
> even blink at the load. 

Fine. I think we can reuse the machines we have. How can that info be
copied? At the moment I copy a tree which has the mirrorlist info for
any given country/release/repo and a perl module which then grabs that
info and gives it out. How does that work in mirrormanager?
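
To make the current setup concrete, here is a rough sketch of that
lookup, written in Python rather than being the actual perl module. The
directory layout (base/country/release/repo, one URL per line) and the
function name read_mirrorlist are assumptions for illustration only; the
real tree may well look different.

```python
import tempfile
from pathlib import Path


def read_mirrorlist(base, country, release, repo):
    """Return the mirror URLs stored for one country/release/repo.

    ASSUMPTION: one plain-text file per combination, one URL per line.
    Returns an empty list if no file exists for the combination.
    """
    path = Path(base) / country / release / repo
    if not path.is_file():
        return []
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


if __name__ == "__main__":
    # Build a throwaway tree with made-up values to show the lookup.
    with tempfile.TemporaryDirectory() as base:
        repo_dir = Path(base) / "de" / "6"
        repo_dir.mkdir(parents=True)
        (repo_dir / "os").write_text("http://mirror.example.org/centos/6/os/\n")
        print(read_mirrorlist(base, "de", "6", "os"))
        print(read_mirrorlist(base, "fr", "6", "os"))
```

Copying the info to another machine then only means copying the tree,
which is what I would like to have an equivalent for.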

> I would also recommend writing a script to convert your existing
> database information into MM, so as to not make people re-enter it.

Yes, that actually is the plan :) I need to put them side by side
though, so I can transform them. Is there a "picture" of the DB schema?

> You'll want to figure out how you want to handle updating mirror info
> - if you're going to centralize that as you do today, or if you want
> each mirror to be able to edit their own information (the Fedora
> model).  MM can work in either mode.  If mirror admins can do it
> themselves, you'll need some form of account system, either local to
> MM (which is present), or something that the TurboGears framework can
> hook into (such as the Fedora Account System).

Yeah, I'd like people to be able to edit their own info. That would be a
good test to see if contact info is up to date, too :)

So let me try to find a different machine :/

Cheers, and thank you and Adrian again,