Hi everybody,
[resending, after realizing that I was subscribed with an old address]
On Wed, Oct 27, 2010 at 11:31:56PM +0200, Ralph Angenendt wrote:
> There is a wiki page for that process now. I put down the notes I took at the meeting for now. There's also a log of the IRC meeting, which I want to redact a bit first, as there is some off topic chatting in there (and several joins/leaves during the meeting). I won't have time for that before friday, though.
>
> Here's the page, which will fill up with more information:
>
> I like to thank the people who were there and gave us input about other solutions (and questioned why we do things like we do).
I would also like to thank you for the good meeting, and also for considering MirrorBrain.
This mail is very long (too long), for which I apologize, but I thought it would be good to provide a comprehensive overview of the options that I see.
First off, I think you can't go wrong if you go with MirrorManager, because it works for Fedora, and it already supports the somewhat special requirement that you have, which is Yum mirror lists. The similarity of Fedora and CentOS might make many things easier. MirrorBrain doesn't have this yet, because none of its users has needed it so far. As MirrorBrain tries to be a generic solution, it is generally agnostic of project or metadata structure, and does everything at the file level. That doesn't mean that support for "special" features is unwanted, of course, especially if it can be implemented in a way that fits the concept and doesn't make deployment more difficult for other users. Yum mirror list support is certainly a nice option to have; there are many Yum-based distros, after all.
(Background: being usable not only by Linux distros is a declared goal of the MirrorBrain project, in order to get as many users (and potential developers) on board to collaborate.
For a mirroring infrastructure, I believe that only collaboration across organizational borders can yield a mature, flexible and long-lived solution. And there are not really many people working on this, only a handful. It would be cool to merge MirrorBrain and MirrorManager somehow. It might be a lot of work, but useful in the long term.)
Having said all that, I thought that Yum mirror lists in MirrorBrain should not be hard to implement. I spent some time on it today and got quite far: configuring the mapping of URL query arguments to directories/files is done, and the actual mapping works. I chose Apache config as the vehicle for that, and the following is a working config:
    MirrorBrainYumDir release=(5.5) \
        repo=(os|extras|addons|updates|centosplus|contrib) \
        arch=x86_64 \
        $1/$2/x86_64 repodata/repomd.xml
In this example, $1/$2/x86_64 is the base URL of a repository, and the match groups can optionally be replaced with what the client specified in the query arguments. ($1 is the first group from the configuration line, $2 the second, and so on. The names and number of query args are all arbitrary.) The last argument is a relative path: the file that must be present on eligible mirrors. The resulting path here would be e.g. 5.5/os/x86_64/repodata/repomd.xml, and the client would get a list of mirrors in the form of http://mirror.example.com/path/to/centos/5.5/os/x86_64/ (That's the part that still needs to be implemented, but it's the easiest part :-)

So I'm confident that I can promise Yum mirror lists soon. Maybe I can finish it this week, maybe the week after, I don't know.
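To make the mapping a bit more concrete, here is a minimal Python sketch of the idea. It is not MirrorBrain code: the YUM_DIR dict is a hypothetical in-memory form of the config line, resolve() and mirrorlist() are made-up names, and the per-mirror file sets stand in for whatever the database knows about each mirror.

```python
import re

# Hypothetical in-memory form of the MirrorBrainYumDir line above.
# Each entry is (query argument name, regex its value must match).
# The dot in "5.5" is escaped here so that it only matches a literal dot.
YUM_DIR = {
    "args": [
        ("release", r"(5\.5)"),
        ("repo", r"(os|extras|addons|updates|centosplus|contrib)"),
        ("arch", r"x86_64"),
    ],
    "base": "$1/$2/x86_64",
    "marker": "repodata/repomd.xml",
}


def resolve(query, rule=YUM_DIR):
    """Map query arguments to (base directory, marker file path).

    Returns None if an argument is missing or does not match its pattern.
    """
    groups = []
    for name, pattern in rule["args"]:
        value = query.get(name)
        if value is None:
            return None
        match = re.fullmatch(pattern, value)
        if match is None:
            return None
        groups.extend(match.groups())
    # Replace $1, $2, ... with the captured groups, in config order.
    base = re.sub(r"\$(\d+)", lambda m: groups[int(m.group(1)) - 1], rule["base"])
    return base, base + "/" + rule["marker"]


def mirrorlist(query, mirrors, rule=YUM_DIR):
    """Return repository URLs on mirrors known to have the marker file.

    mirrors is a list of (base_url, set_of_known_file_paths) pairs.
    """
    resolved = resolve(query, rule)
    if resolved is None:
        return []
    base, marker = resolved
    return [
        url.rstrip("/") + "/" + base + "/"
        for url, files in mirrors
        if marker in files
    ]
```

A request like ?release=5.5&repo=os&arch=x86_64 would then resolve to 5.5/os/x86_64, and only mirrors that have 5.5/os/x86_64/repodata/repomd.xml would be listed. Sorting the result by suitability is left out of the sketch.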
Meanwhile, I would appreciate input from you: is this reasonable? Would it serve your needs?
If it does, I think the only feature missing in MirrorBrain for you would be sorted out.
(Needless to say, the mirror list that Yum gets will be sorted by suitability of the mirrors.)
So, on to the other issues that were raised in the meeting.
Summarizing what I heard, the following are the problems that you would like to solve:
1) scalability
2) cleaning up the historic DVD/nonDVD setup
3) partial mirroring
4) finer mirror selection (by prefix, autonomous system, state/region, in addition to country/continent)
5) consistency problems
6) content verification
7) (presumably) backwards compatibility to existing installations
8) (maybe) satellite setups
1) scalability
The dimensions are:
- 70,000 files in 500 directories
- >400 mirrors
- 40 requests per second
Sounds fine from my point of view. MB has handled more files and more requests. The number of mirrors I have run it with was smaller, 150 at most, but I wouldn't expect big problems. The little mirrorprobe that runs every minute might hit a system limit when starting 400 threads to check all mirrors at the same time, so it may need to be tweaked or changed to a different model, e.g. using a pool of threads or starting several processes as well.
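As an illustration of the "pool of threads" model (a sketch only, not how the mirrorprobe is actually written): the function names, the pool size of 32 and the use of a HEAD request are all assumptions made for the example.

```python
import http.client
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_probe(url, timeout=10):
    """Return True if a HEAD request to the mirror succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # Covers connection errors, timeouts, HTTP errors, broken responses.
        return False


def probe_all(urls, probe=http_probe, max_workers=32):
    """Probe all mirrors with at most max_workers threads running at once.

    Returns a dict mapping each URL to True (alive) or False.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(probe, urls)
        return dict(zip(urls, results))
```

With 400 mirrors and a pool of 32, a run takes roughly 13 "rounds" of the slowest probes instead of needing 400 threads at once, so the timeout has to be chosen with the one-minute interval in mind.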
2) cleaning up the historic DVD/nonDVD setup
Sounds like a good idea :-)
3) partial mirroring
Supported well by MirrorBrain.
4) finer mirror selection (by prefix, autonomous system, state/region, in addition to country/continent)
MirrorBrain uses BGP/routing data to find out the network prefix and AS of clients and mirrors, and matches them. Other criteria are GeoIP country and continent. The closest match is used for mirror selection. If there are several mirrors to choose from, a weighted randomization is also applied, so that some mirrors can be given more requests and others fewer.

We talked in our meeting about the need for a smarter selection in e.g. the US, where one doesn't want to be sent from one coast to the other. GeoIP regions were discussed for this. I considered going that route, but decided to implement a different concept, which I believe is more widely useful because it also works when no mirror within the same state/region is found: using the geographical distance between the client and the mirrors. I just released this new feature into the wild: http://mirrorbrain.org/news/2140-takes-geographical-distances-account/ You can try it out at http://download.services.openoffice.org/files/stable/3.2.1/OOo-SDK_3.2.1_Lin... and feedback is appreciated.
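Roughly, the selection criteria can be pictured like this. The sketch below is an illustration under assumptions, not the actual implementation: the Host fields, the tier numbering, the way distance is used as a secondary sort key and the weight field are all made up for the example. The distance is the standard haversine great-circle formula.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    prefix: str       # network prefix from routing data
    asn: int          # autonomous system number
    country: str
    continent: str
    lat: float
    lon: float
    weight: int = 100  # hypothetical per-mirror priority


def match_tier(client, mirror):
    """Smaller is closer: same prefix, AS, country, continent, else world."""
    if mirror.prefix == client.prefix:
        return 0
    if mirror.asn == client.asn:
        return 1
    if mirror.country == client.country:
        return 2
    if mirror.continent == client.continent:
        return 3
    return 4


def distance_km(a, b):
    """Great-circle distance (haversine), Earth radius taken as 6371 km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a.lat, a.lon, b.lat, b.lon))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def rank(client, mirrors):
    """Order mirrors by match tier first, then by distance to the client."""
    return sorted(mirrors, key=lambda m: (match_tier(client, m),
                                          distance_km(client, m)))


def choose(client, mirrors, rng=random):
    """Weighted random pick among the mirrors in the best matching tier."""
    best = min(match_tier(client, m) for m in mirrors)
    candidates = [m for m in mirrors if match_tier(client, m) == best]
    weights = [m.weight for m in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

For a client in New York with one US mirror in Boston and one in Los Angeles, both fall into the "same country" tier, and rank() puts Boston first because of the distance.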
5) consistency problems
Regarding problems with consistency of trees on mirrors / clients accessing them, this is indeed a hard problem to solve. From discussions with Fedora people I know that they also have/had major fights with that. It took me a long time to finally get this sorted out when I still worked on the openSUSE infrastructure. The following have proved useful for me in the past:
- Always take care of setting appropriate cache headers. must-revalidate is the key, because it doesn't prevent caching, but causes clients (and intermediaries) to always validate that a resource is still fresh.
It is hopeless to get all mirrors to run the same configuration in this regard, and there are also some FTP mirrors (and FTP doesn't have a feature to control caching at all), so for certain content, there is no other option than delivering it from defined places _with_ proper headers. Luckily, this concerns mostly small metadata files.
This guards against the inconsistency that arises when things come from different places (and are of different age). If cache control is not exerted by the server (or client), intermediaries (web caches) commonly "guess" how long they should deliver stuff from their cache without revalidating freshness. Typically, a squid assumes freshness for 4-18 hours by default, and the exact time is hard to predict, because cache pruning is complex and may take file size into account. Thus, it is inevitable that clients see an inconsistent picture.
- The second (and even more important) measure is to version metadata. Actually, any data. Always and everywhere. With RPMs, one is in the lucky situation that this is usually done anyway (reliably increasing version/release numbers with each rebuild). Exceptions like "MD5SUMS" files definitely need to be treated separately and should never be redirected to a mirror, and not only for security reasons. repo-md metadata, as used by Yum, exists in various incarnations. Unfortunately, the ones I dealt with in the past were not versioned, and files had names like "filelists.xml.gz", which leaves non-redirection as the only 100% solution. (So I did that.) Nowadays, at least the repo-md metadata that the Fedora and openSUSE people build is versioned, as can be seen in this example: http://download.opensuse.org/repositories/Apache:/MirrorBrain/Apache_openSUS... I suppose that createrepo does that these days. Anyway, this is certainly a point where tight cooperation with (and appropriate input from) the build system folks is very important.
- A third line of "defense" can be a client that double-checks itself that it doesn't get old metadata, by checking with cryptographic hashes whether the download is the expected one, _and_ falls back to a different mirror if it isn't. That's what Yum does, since MirrorManager sends hashes/timestamps via Metalinks, and what Zypper does, since it uses a Metalink client for all downloads, which allows it to fall back to other mirrors until it gets the expected data. You won't be able to do something fancy like that with CentOS 5, I guess, but maybe with the next version. (Actually, it's not that difficult to teach Yum to use a Metalink client -- I once tried it out, and it was a one-liner to replace its usage of python-urlgrabber with a call to aria2c (a powerful Metalink client) for all downloads. Another great option would be to extend python-urlgrabber to be a Metalink client.)
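The client-side fallback from the third point boils down to something like the following sketch. It is not taken from Yum, Zypper or aria2c; fetch_verified() and the caller-supplied fetch function are hypothetical, and SHA-256 is just picked as an example hash.

```python
import hashlib


def fetch_verified(path, mirrors, expected_sha256, fetch):
    """Return (mirror, body) from the first mirror that delivers good data.

    fetch(mirror, path) must return the file content as bytes and raise
    OSError on failure. Mirrors are tried in the given order, which is
    assumed to be sorted by suitability already.
    """
    for mirror in mirrors:
        try:
            body = fetch(mirror, path)
        except OSError:
            continue  # mirror unreachable, try the next one
        if hashlib.sha256(body).hexdigest() == expected_sha256:
            return mirror, body
        # Hash mismatch: stale or broken copy, fall through to next mirror.
    raise LookupError("no mirror delivered the expected content for " + path)
```

The important part is that the expected hash comes from a trusted place (e.g. the Metalink sent by the redirector), and not from the mirror itself.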
That's what I learnt anyway... maybe some of it can be useful to you. Verifying 400 mirrors in realtime is no option, with our limited means, IMO -- simply not doable. Of course, if anyone knows how to do that, I am *very* interested :-)
6) content verification
Regarding content verification: I don't know exactly how you currently check, but what can be done with MirrorBrain is:
- There currently is a tool for downloading a file from one or all mirrors and displaying a hash of it.
- This obviously doesn't work well for huge files (DVDs), unless it's a close, fast mirror.
- Since recently, MB can keep hashes of all files in a database. The hashes include block (piece-wise) hashes. It would be fairly easy to fetch the hash of a random block (or a defined one) and download just that piece from all mirrors. (Since the hashes in the database are retrievable from everywhere, such checkers could in fact also run _very_ distributed.)

If you look at http://download.documentfoundation.org/libreoffice/testing/3.3.0-beta2/rpm/x... there is various metadata, including the block hashes inside the linked IETF Metalink in the form of XML.
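The random-block check could look roughly like the sketch below. This is an idea, not an existing MirrorBrain tool: the piece size, the use of SHA-1 for the pieces and the fetch_range callback (which in practice would be an HTTP Range request) are assumptions.

```python
import hashlib
import random

PIECE_SIZE = 262144  # hypothetical piece length in bytes


def piece_hashes(data, piece_size=PIECE_SIZE):
    """Hash a file piece by piece, as the hash database would store it."""
    return [
        hashlib.sha1(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]


def check_random_piece(mirrors, hashes, total_size, fetch_range,
                       piece_size=PIECE_SIZE, rng=random):
    """Download one randomly chosen piece from each mirror and verify it.

    fetch_range(mirror, start, end) returns bytes start..end-1 of the file
    and raises OSError on failure. Returns (piece index, {mirror: ok}).
    """
    index = rng.randrange(len(hashes))
    start = index * piece_size
    end = min(start + piece_size, total_size)  # the last piece may be short
    results = {}
    for mirror in mirrors:
        try:
            chunk = fetch_range(mirror, start, end)
        except OSError:
            results[mirror] = False
            continue
        results[mirror] = hashlib.sha1(chunk).hexdigest() == hashes[index]
    return index, results
```

This way a DVD image can be spot-checked on all mirrors by transferring a few hundred kilobytes per mirror instead of several gigabytes.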
I'm open (and happy) to implement more means of content verification. So far, I either didn't have more need for it, or time was lacking. But it would be very useful. I just would like to point out that I see a need for it mainly for debugging purposes, when something goes wrong, and not as a security measure. Content verification is too easy to spoof to be trusted much. It is much more important to give clients the top hash from a trusted source, maybe even over a TLS-encrypted web server, and rely on cryptographic signatures for the rest (which is easy with RPM, luckily).
In the context of file-tree consistency and content verification, I should note that verifying only certain critical files might not catch a mirror that is "half synced", and thus inconsistent. I think that running something after syncing is a smart way to discover the moment when the mirror is "ready". That's where MirrorManager is very clever.
I wondered if there is a crucial file that can be used as a "marker" to determine whether a mirror is up to date or not. A timestamp file might work, but maybe there need to be several of them, in different parts of the tree, if some setups are complex and sync parts of the tree with different scripts. MirrorBrain can also download files from mirrors (to look at the timestamp content), but that said, one wouldn't necessarily want to disable a mirror just because it hasn't synced for a day, if it is still up to date (because no new content has arrived, only new timestamps). Or how do you handle this?
I was tossing around the idea of whether the mirror scanner should integrate such a timestamp check, maybe comparing the timestamp of a certain "marker" file on the mirror with the known timestamp in its database. But I'm not clear yet where this would lead and how it could be made useful.
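One way to think about such a marker check, purely as a thought experiment (nothing like this exists yet, and all names are made up): a mirror only counts as stale for a subtree if content there changed on the master after the mirror's last sync of that subtree. A mirror that hasn't synced for a day is then still fine as long as nothing new has arrived.

```python
from datetime import datetime, timezone


def stale_subtrees(mirror_markers, last_changes):
    """Return the subtrees for which a mirror is behind the master.

    mirror_markers: subtree -> datetime read from the mirror's marker file
                    (time of the mirror's last completed sync of that part)
    last_changes:   subtree -> datetime of the last content change on the
                    master, as known from the database
    """
    stale = []
    for subtree, changed in last_changes.items():
        synced = mirror_markers.get(subtree)
        # A missing marker, or a sync older than the last change, is stale.
        if synced is None or synced < changed:
            stale.append(subtree)
    return sorted(stale)
```

An empty result would mean the mirror can stay enabled, regardless of how old its timestamps are.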
...Maybe the mirror scanner should simply check all repodata/repomd.xml files in the tree frequently, comparing them with the current version. With the Yum mirror list implementation described above, it would be easy to ensure that only mirrors known to have the current file end up on the lists.
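Sketched in Python, such a repomd.xml comparison could be as simple as this (an assumption-laden illustration: the function name and the data layout are invented, and the scanner would of course fetch the files itself):

```python
import hashlib


def mirrors_with_current_repomd(master, scanned):
    """Find, per repomd.xml path, the mirrors that carry the current file.

    master:  path -> bytes of the current repomd.xml on the master
    scanned: mirror -> {path: bytes of that mirror's copy}; a path that is
             missing means the mirror does not have the file at all
    Returns a dict path -> sorted list of eligible mirrors.
    """
    current = {path: hashlib.sha256(body).digest()
               for path, body in master.items()}
    eligible = {path: [] for path in master}
    for mirror, files in scanned.items():
        for path, digest in current.items():
            body = files.get(path)
            if body is not None and hashlib.sha256(body).digest() == digest:
                eligible[path].append(mirror)
    return {path: sorted(names) for path, names in eligible.items()}
```

The mirror list for a repository would then be built only from the mirrors listed for its repomd.xml path.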
7) backwards compatibility to existing installations
I don't see an issue, once mirror lists work. However, I know far too little about CentOS. :-)
BTW, one idea for the future that I would like to at least mention is that you could change Yum to contact a/the redirector for each request, instead of only in the beginning. I cannot judge if that would be better or worse -- I have used Yum for many years, but always in that mode, and not with the mirror lists that you guys use. Anyway, that would give you more control over what Yum downloads from where, not least because of the ability to exert proper cache control. It's also good for security if critical hash files (those containing the top hash) are downloaded from a trusted server only.
8) (maybe) satellite setups
Here I didn't get the details.
Curious what you think about all this.
Again, sorry sorry sorry for the long mail.
Thanks, Peter