On Fri, Oct 07, 2011 at 03:55:30PM -0500, Ralph Angenendt wrote:
Hey,
long time no infrastructure changes (well, on the CentOS side, quite a few on my side).
As several people have offered to help moving our selfbotched system to mirrormanager (sorry Peter, but there were some offers - I am still not sure which one really is better technically, but the more helping hands, the better), I'd like to start this now.
Excellent. I'm happy to help in any way I can.
First question: I guess as a first move we need a machine to host that on.
Can anyone running a mirrormanager instance tell me, what kind of specs that machine needs to have?
There are three primary applications: 1) the web pages for mirror admins to log in and change their mirror data, which on Fedora uses ~160MB RAM; 2) various cronjobs (~50MB); 3) the mirrorlist request handler (125MB). 1) and 3) can each run on one or many machines. 2), by the nature of the jobs, runs on only a single system.
MM maintains a local cache in /var/lib/mirrormanager. Fedora's copy thereof is 28MB.
Does it need to hold a copy of the mirror?
The machine running 2) above needs a copy of the mirror, yes, either local or NFS-mounted. Some pieces, like update-master-directory-list and the crawler, can simply grab an rsync listing from another system rather than have the full mirror mounted; however, the metalink generator does need to be able to read files, so having a full mirror nearby would be good. Fedora has the directory tree on an NFS-mountable volume, which the MM cronjob server mounts read-only.
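To illustrate the "rsync listing instead of a mounted tree" idea, here is a rough Python sketch that pulls directory names out of text in the format printed by `rsync --list-only`. This is not MM's actual code; the function name and the sample listing are made up for illustration, and I'm assuming the usual five-column output (permissions, size, date, time, path).

```python
def parse_rsync_listing(text):
    """Return the set of directory paths found in `rsync --list-only` style output."""
    directories = set()
    for line in text.splitlines():
        # Each entry has five fields: permissions, size, date, time, path.
        # maxsplit=4 keeps paths that contain spaces in one piece.
        fields = line.split(None, 4)
        if len(fields) != 5:
            continue  # blank lines and short banner lines
        perms, _size, _date, _time, path = fields
        # A real entry starts with a 10-character mode string such as drwxr-xr-x.
        if len(perms) != 10 or not perms.startswith("d"):
            continue
        if path != ".":
            directories.add(path.rstrip("/"))
    return directories


# Hypothetical sample listing, for illustration only.
sample = """\
drwxr-xr-x        4096 2011/10/07 15:55:30 .
drwxr-xr-x        4096 2011/10/07 15:55:30 6.0
drwxr-xr-x        4096 2011/10/07 15:55:30 6.0/os
-rw-r--r--        1234 2011/10/07 15:55:30 6.0/os/RELEASE-NOTES
"""
print(sorted(parse_rsync_listing(sample)))
```

A listing like this only tells you which paths exist. Anything that has to look inside files, as the metalink generator does, still needs the files themselves.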
On the software side, I guess httpd and mysql-server (and mirrormanager) - anything else? Or is the sqlite variant fast enough for the amount of mirrors we have?
Because there are several actors updating the database simultaneously (particularly the update-master-directory-list cronjob and the crawlers), it's preferred to use postgres or mysql rather than sqlite. I haven't tried running with sqlite in production. MM uses TurboGears, which currently uses SQLObject (soonish SQLAlchemy), so you are free to choose whichever database backend you want.
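For what it's worth, the concern with simultaneous writers can be shown with plain Python and the standard sqlite3 module: SQLite lets only one connection write at a time, so a second writer gets "database is locked" unless it waits. This is a standalone sketch, not MM code; the `host` table and the hostnames are invented, and `timeout=0` is set only so the second writer fails immediately instead of retrying.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")

# isolation_level=None puts the connections in autocommit mode, so the
# transaction boundaries are exactly the BEGIN/COMMIT statements below.
writer_a = sqlite3.connect(path, timeout=0, isolation_level=None)
writer_b = sqlite3.connect(path, timeout=0, isolation_level=None)
writer_a.execute("CREATE TABLE host (name TEXT)")

# Writer A (think: a long-running cronjob) opens a write transaction.
writer_a.execute("BEGIN IMMEDIATE")
writer_a.execute("INSERT INTO host VALUES ('mirror-a.example.org')")

# Writer B (think: a crawler) tries to write while A still holds the lock.
try:
    writer_b.execute("INSERT INTO host VALUES ('mirror-b.example.org')")
    second_write_blocked = False
except sqlite3.OperationalError as exc:
    second_write_blocked = True
    print("second writer refused:", exc)

# Once A has committed, B's write goes through.
writer_a.execute("COMMIT")
writer_b.execute("INSERT INTO host VALUES ('mirror-b.example.org')")
```

With a nonzero timeout the second writer would wait and retry instead of failing right away, so this shows contention, not breakage. Whether that contention matters at your mirror count is something I can't tell you without having tried it.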
At the moment we run quite a few instances of mirrorlist.centos.org - the machines which hand out the urllist to machines - is that possible with mirrormanager, too? Or will one machine be able to handle the load?
As Adrian noted, this is not only possible, but recommended. Fedora has ~8 servers handing out the mirrorlist by request, and they don't even blink at the load. In general, each mirrorlist request is answered within 0.3s, of which <0.1s is spent in the mirrorlist code; the rest is just setting up TCP connections and getting through the load balancers.
I would also recommend writing a script to convert your existing database information into MM, so as to not make people re-enter it.
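A conversion script could start out something like the sketch below. Everything specific in it is an assumption: I don't know what your current database looks like, so the CSV columns (`sponsor`, `contact`, `country`, `urls`) are invented, and the output is a neutral list of records rather than anything MM-specific. Loading those records into MM's own tables would be a separate step written against its actual schema.

```python
import csv
import io
import json


def convert_rows(csv_text):
    """Map rows of a hypothetical legacy export to neutral per-mirror records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed convention: several URLs per mirror, separated by ';'.
        urls = [u.strip() for u in row["urls"].split(";") if u.strip()]
        records.append({
            "site_name": row["sponsor"].strip(),
            "admin_email": row["contact"].strip(),
            "country": row["country"].strip().upper(),
            "urls": urls,
        })
    return records


# Hypothetical export of the existing mirror list, for illustration only.
legacy_export = """\
sponsor,contact,country,urls
Example University,mirror-admin@example.edu,us,http://mirror.example.edu/centos/;rsync://mirror.example.edu/centos/
"""
print(json.dumps(convert_rows(legacy_export), indent=2))
```

Going through an intermediate format like this lets you eyeball the converted data before anything touches the new database.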
You'll want to figure out how you want to handle updating mirror info: whether you're going to centralize that as you do today, or whether you want each mirror admin to be able to edit their own information (the Fedora model). MM can work in either mode. If mirror admins can do it themselves, you'll need some form of account system, either local to MM (which is present), or something that the TurboGears framework can hook into (such as the Fedora Account System).
Thanks, Matt