Jeff Johnson wrote:
On Jul 24, 2011, at 4:35 PM, Ljubomir Ljubojevic wrote:
Oh, yeah, yum reads and processes xml files, not the actual package files, so searches are fast because of it.
Here's something that might help you:
Using xml is a significant performance hit: see recent patches to yum/createrepo to use sqlite instead of xml … lemme find the check-in claim … here is the claim http://lists.baseurl.org/pipermail/rpm-metadata/2011-July/001353.html and quoting
Tested locally on repodata of 9000 pkgs.
Goes from 1.8-> 2GB of memory in use with the old createrepo code to 325MB of memory in use - same operation - performance-wise it is not considerably different. More testing will bear that out, though.
So -- if I believe those numbers -- there's *lots* of room for improvement in yum by ripping out xml and replacing it with a sqlite database. Note that createrepo != yum, but some of the use cases are similar. The general problem in yum (and smart and apt) is the high cost of the cache load, and the amount of xml that must be parsed/read in order to be cached. Adding a sqlite backing store which can just be used, not loaded, is a win.
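To illustrate the "used, not loaded" point, here is a minimal sketch in Python. It is not yum or createrepo code: the package list, the `packages` table, and the function names are all hypothetical, and real repodata has a much richer schema. The xml lookup has to parse the whole document before it can answer one query, while the sqlite lookup opens the file and asks for a single row.

```python
import os
import sqlite3
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical toy metadata; real repodata carries far more per package.
PKGS = [("pkg%d" % i, "1.%d" % i) for i in range(1000)]


def lookup_via_xml(xml_text, name):
    # The whole document is parsed before a single package can be found.
    root = ET.fromstring(xml_text)
    for pkg in root.iter("package"):
        if pkg.get("name") == name:
            return pkg.get("version")
    return None


def build_sqlite(path):
    # Done once, ahead of time, in this sketch.
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE packages (name TEXT PRIMARY KEY, version TEXT)")
    con.executemany("INSERT INTO packages VALUES (?, ?)", PKGS)
    con.commit()
    con.close()


def lookup_via_sqlite(path, name):
    # The file is opened and queried in place; nothing is loaded up front.
    con = sqlite3.connect(path)
    try:
        row = con.execute(
            "SELECT version FROM packages WHERE name = ?", (name,)
        ).fetchone()
    finally:
        con.close()
    return row[0] if row else None


xml_text = (
    "<metadata>"
    + "".join('<package name="%s" version="%s"/>' % p for p in PKGS)
    + "</metadata>"
)
db_path = os.path.join(tempfile.mkdtemp(), "primary.sqlite")
build_sqlite(db_path)

print(lookup_via_xml(xml_text, "pkg42"), lookup_via_sqlite(db_path, "pkg42"))
```

Both lookups return the same version string; the difference is how much work happens before the answer is available, which is the cache-load cost described above.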
You have confused createrepo with yum repomd data. Createrepo is for creating the actual repository (I use mrepo).
Yum data (repomd, repoview) is a different story. Every repository stores its data in compressed xml files. They are unpacked in memory, and the xml data is parsed and put into an internal database (and cache). It is quite possible that yum uses a sqlite database internally (for the cache); I haven't had the need to research it. Using "yum -C <command>" will use the yum cache rather than download the repomd data again.
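The unpack, parse, and cache steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how yum is actually implemented: the xml layout is a heavily simplified stand-in for the real metadata, gzip is assumed as the compression, and `load_into_cache` and the `packages` table are hypothetical names.

```python
import gzip
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified metadata; not the real repomd schema.
RAW_XML = (
    b"<metadata>"
    b'<package name="bash" version="4.1"/>'
    b'<package name="yum" version="3.2"/>'
    b"</metadata>"
)


def load_into_cache(blob):
    # Step 1: unpack the compressed metadata in memory (gzip assumed).
    xml_bytes = gzip.decompress(blob)
    # Step 2: parse the xml.
    root = ET.fromstring(xml_bytes)
    # Step 3: put the parsed data into an internal database.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE packages (name TEXT PRIMARY KEY, version TEXT)")
    con.executemany(
        "INSERT INTO packages VALUES (?, ?)",
        [(p.get("name"), p.get("version")) for p in root.iter("package")],
    )
    con.commit()
    return con


cache = load_into_cache(gzip.compress(RAW_XML))
row = cache.execute(
    "SELECT version FROM packages WHERE name = ?", ("yum",)
).fetchone()
print(row[0])
```

Once the cache exists, later queries hit the database rather than the xml, which is the idea behind running with the cache only.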