On 10/10/2011 06:48 AM, Matt Domsch wrote:
On Fri, Oct 07, 2011 at 03:55:30PM -0500, Ralph Angenendt wrote:
As several people have offered to help moving our selfbotched system to mirrormanager (sorry Peter, but there were some offers - I am still not sure which one really is better technically, but the more helping hands, the better), I'd like to start this now.
Excellent. I'm happy to help in any way I can.
Thank you (and thank you Adrian), I'm just going to reply to this post.
Can anyone running a mirrormanager instance tell me what kind of specs that machine needs to have?
There are three primary applications: 1) the web pages for mirror admins to log in and change their mirror data, which on Fedora is using ~160MB RAM; 2) various cronjobs (~50MB); 3) the mirrorlist request handler (125MB). Applications 1 and 3 can run on one, or many, machines. Application 2, by the nature of the jobs, only runs on a single system.
MM maintains a local cache in /var/lib/mirrormanager. Fedora's copy thereof is 28MB.
Okay. And as we need to have the database running on the same host (as the machines we have are spread around the world rather than sitting together in one nice and cozy data center), the machine I just "found", which has 512MB RAM, seems to be too small to do that.
So first thing: Look for another machine :)
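As a rough back-of-the-envelope check on the figures above, the sizing can be sketched as follows. The three component numbers are the Fedora figures quoted in this thread; the database and operating-system footprints are placeholder assumptions, not measurements, and should be replaced with real values for the host in question.

```python
# RAM figures (MB) quoted in this thread for Fedora's MirrorManager setup.
COMPONENT_RAM_MB = {
    "web_admin_ui": 160,
    "cronjobs": 50,
    "mirrorlist_handler": 125,
}


def ram_budget(total_mb, db_mb, os_mb):
    """Return (needed_mb, headroom_mb) for running everything on one host."""
    needed = sum(COMPONENT_RAM_MB.values()) + db_mb + os_mb
    return needed, total_mb - needed


# ASSUMPTION: 256 MB for the database server and 128 MB for the OS are
# illustrative placeholders only; neither number comes from the thread.
needed, headroom = ram_budget(total_mb=512, db_mb=256, os_mb=128)
print(f"needed ~{needed} MB, headroom {headroom} MB")
```

The three MirrorManager pieces alone add up to about 335MB, so on a 512MB host almost any database footprint pushes the total past what is available.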
Does it need to hold a copy of the mirror?
Machine 2 above needs a copy of the mirror, yes, either local or NFS-mounted. Some aspects, like update-master-directory-list and the crawler, can simply grab an rsync listing from another system, rather than have the full mirror be mounted; however, the metalink generator does need to be able to read files, so having a full mirror nearby would be good. Fedora has the directory tree on an NFS-mountable volume, which the MM cronjob server mounts read-only.
Okay. Finding a machine which can hold the complete tree is trivial; all of ours can do that.
On the software side, I guess httpd and mysql-server (and mirrormanager) - anything else? Or is the sqlite variant fast enough for the number of mirrors we have?
Because there are several actors updating the database simultaneously (particularly the update-master-directory-list cronjob and the crawlers), it's preferred to use postgres or mysql rather than sqlite. I haven't tried running with sqlite in production. MM uses TurboGears, which uses SQLObject currently (soonish SQLAlchemy) so you are free to choose whichever database backend you want.
Hmmm, good. If MySQL is an option (we run other instances of that), then there's no need to run a postgres instance. MirrorManager doesn't take advantage of the inetnum (or whatever it is called) data type in postgres, which is able to store IP addresses and CIDR data?
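For comparison, address-in-netblock matching does not have to rely on a database-specific type: netblocks can be stored as plain strings and compared in application code, which works with any backend. The following is a minimal sketch using Python's standard `ipaddress` module. It is not MirrorManager's actual implementation, and the host names and netblocks are made up for illustration.

```python
import ipaddress


def hosts_for_client(client_ip, netblocks):
    """Return host names whose netblock contains client_ip, most specific first."""
    addr = ipaddress.ip_address(client_ip)
    matches = []
    for cidr, host in netblocks:
        net = ipaddress.ip_network(cidr, strict=False)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            matches.append((net.prefixlen, host))
    # A longer prefix means a more specific netblock.
    matches.sort(key=lambda m: m[0], reverse=True)
    return [host for _, host in matches]


# Hypothetical data: CIDR strings as they might be stored in a text column.
NETBLOCKS = [
    ("192.0.2.0/24", "mirror-a.example.org"),
    ("192.0.2.128/25", "mirror-b.example.org"),
    ("2001:db8::/32", "mirror-c.example.org"),
]

print(hosts_for_client("192.0.2.200", NETBLOCKS))
```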
At the moment we run quite a few instances of mirrorlist.centos.org - the machines which hand out the urllist to machines - is that possible with mirrormanager, too? Or will one machine be able to handle the load?
As Adrian noted, this is not only possible, but recommended. Fedora has ~8 servers handing out the mirrorlist by request, and they don't even blink at the load.
Fine. I think we can reuse the machines we have. How can that info be copied? At the moment I copy a tree which has the mirrorlist info for any given country/release/repo, plus a Perl module which then grabs that info and hands it out. How does that work in mirrormanager?
I would also recommend writing a script to convert your existing database information into MM, so as to not make people re-enter it.
Yes, that actually is the plan :) I need to put them side by side though, so I can transform them. Is there a "picture" of the DB schema?
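A first step for such a conversion script could be to flatten the existing country/release/repo tree into rows that can later be mapped onto whatever the MM schema turns out to require. The sketch below assumes a hypothetical on-disk layout of `<root>/<country>/<release>/<repo>` with one URL per line; the real layout and the target MM tables are not described in this thread, so both would need adjusting.

```python
from pathlib import Path


def read_mirrorlist_tree(root):
    """Yield (country, release, repo, url) for every URL found under root.

    ASSUMPTION: files live at <root>/<country>/<release>/<repo> and hold
    one mirror URL per line; blank lines and '#' comments are skipped.
    """
    root = Path(root)
    for path in sorted(root.glob("*/*/*")):
        if not path.is_file():
            continue
        country, release, repo = path.relative_to(root).parts
        for line in path.read_text().splitlines():
            url = line.strip()
            if url and not url.startswith("#"):
                yield country, release, repo, url


# Example use (the path is a placeholder):
#   for row in read_mirrorlist_tree("/path/to/mirrorlist-tree"):
#       print(row)
```

Having both data sets as flat rows makes it easier to put them side by side before writing anything into the new database.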
You'll want to figure out how you want to handle updating mirror info - if you're going to centralize that as you do today, or if you want each mirror to be able to edit their own information (the Fedora model). MM can work in either mode. If mirror admins can do it themselves, you'll need some form of account system, either local to MM (which is present), or something that the TurboGears framework can hook into (such as the Fedora Account System).
Yeah, I'd like people to be able to edit their own info. Would be a good test to see if contact info is up to date, too :)
So let me try to find a different machine :/
Cheers, and thank you and Adrian again,
Ralph