On Sat, 11 Jul 2009, Sascha Thomas Spreitzer wrote:

> i have been asked from some companies,
>
> 1. how long it would take to rebuild a centos from scratch if I would
> have a whole CP (common processor) for it.

On a rather slow subinstance without much RAM, it takes a bit over 30
hours to walk a single 'pass' through the build process. In a pass,
each package is either rebuilt, found to need a dependency that is not
yet present, or fails for some other reason.

As I am using a rather naive solution algorithm, I have no doubt that
it can be improved to reduce the number of passes needed before no
further progress is made. I do not have complete convergence yet, so I
do not know how many passes I will need.

I have written versions of this email before -- one to the RPM mailing
list in 2001 comes to mind. I have published scripts doing variations
of what I describe here on my ftp site at:

    ftp://ftp.owlriver.com/pub/mirror/ORC/buildfarm

but they are neither current nor maintained. Looking at them, the
datestamps largely predate CentOS; they were for internal purposes or
for cAos development work.

> 2. What the build process includes, meaning "steps to success"

Many have outlined the process, and there is more than one way to do
it. One 'bootstraps' from a running distribution into the minimal
subset needed to self-host a build chroot. Then, from that subset, one
builds the build chroot again. Then one builds out toward the desired
leaf nodes, satisfying intermediate dependencies along the way.

Consider trying an experiment and watching the process which the GNU
folks use for GCC and GLIBC -- they bootstrap into a self-building
environment, then build again with the new tools, and diff the results
to make sure the build is deterministic and capable of self-hosting.
_Then_ you build out toward the leaf node packages. Lather, rinse, and
repeat. It is similar here.

CentOS is fortunate that it rebuilds a known, finite package set,
rather than having to stabilize a package set into a distribution.
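
The pass-by-pass process described above can be sketched roughly as
follows. This is only an illustration in Python, not the scripts on
the ftp site: the package names and dependency map are made up, and
the function name build_in_passes is hypothetical. It also simplifies
by treating a missing build dependency as the only way a package can
fail, and by judging each pass against what was available when the
pass started.

```python
def build_in_passes(packages, build_requires, preinstalled):
    """Naive multi-pass build ordering (illustrative sketch only).

    packages:       names of the packages we want to rebuild
    build_requires: dict mapping a package name to the set of names
                    it needs before it can be built
    preinstalled:   names already provided by the bootstrap environment
    Returns (passes, stuck): a list of per-pass build lists, and the
    sorted list of packages that never became buildable.
    """
    available = set(preinstalled)
    pending = set(packages) - available
    passes = []
    while pending:
        # A package builds in this pass only if everything it needs
        # was already available when the pass started.
        built = sorted(p for p in pending
                       if build_requires.get(p, set()) <= available)
        if not built:
            break  # a full pass made no progress: no further solution
        passes.append(built)
        available.update(built)
        pending.difference_update(built)
    return passes, sorted(pending)


# Hypothetical data, for illustration only.
requires = {
    "zlib": {"gcc", "glibc", "make"},
    "openssl": {"zlib", "gcc"},
    "openssh": {"openssl", "zlib"},
    "broken": {"missing-lib"},
}
passes, stuck = build_in_passes(requires, requires, {"gcc", "glibc", "make"})
for number, names in enumerate(passes, start=1):
    print("pass", number, names)
print("never buildable:", stuck)
```

The loop stops as soon as one whole pass builds nothing new, which is
the "no further solution" condition. At 30 hours a pass, the number of
passes is what matters, so ordering the work by known dependencies up
front is the obvious improvement over this naive version.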

RPM and YUM also make it possible to easily query and build a map of
all Requires/Depends mappings, so one can know that one has a package
collection (a 'repository') which has closure, in the sense that all
such Requires and Depends are satisfied.

How a package is built often determines what it will include -- the
autotools and ./configure process are designed to 'inventory' what
library headers are present and conditionally add features to a
package. That is: was 'tcp_wrappers' present? If so, we see:

    /usr/include/tcpd.h

This means that 'wrappers' support _can_ be added by a given package.
That package might also be willing to build without wrappers support.
[I chose this example because occasionally I have seen a distribution
stabilizer omit this particular package from the build environment of
a candidate and, one assumes inadvertently, cause such support to be
omitted.]

This presence or omission can be spotted a couple of ways -- by
reading and analyzing build logs (which is a mind-numbing task, and
requires a specific awareness of what is 'right'); -or- by using 'ldd'
and other tools to examine a binary file to see what libraries it
calls for.

Again, CentOS has an easier task, as we can compare the ldd results
for each binary from a 'real' upstream product to our rebuild effort's
candidate. This mailing list has pointed to tools which the CentOS
project has released to do just that.

CentOS developers can take an easier route, because we have a well
defined CentOS goal: reproduce the upstream binaries, warts and all
(without encumbered trademarks, and adding our own art, trademarks,
and other copyrighted matter), while attending to the changes needed
for the updater. [Recall that CentOS 2.1, 3, and early 4 did not have
the packages needed for the 'yum' approach the project uses; there has
been a bit of a stink by an EPEL ignorant of the timing and package
entry process which 'yum' and 'sqlite' followed into RHEL in recent
days].
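
Going back to the ldd comparison mentioned above: a rough sketch of
the idea in Python follows. This is not the tooling the CentOS project
released. The function names ldd_libraries and compare_ldd are
hypothetical, and the two sample outputs are made-up text in the usual
ldd layout, standing in for a real upstream binary and a rebuild that
was built without tcp_wrappers present.

```python
def ldd_libraries(ldd_output):
    """Return the set of shared library names listed in ldd output text."""
    libs = set()
    for line in ldd_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first field is the library name, or a full path for the
        # dynamic loader; keep only the file name part.
        name = fields[0].rsplit("/", 1)[-1]
        if ".so" not in name:
            continue  # e.g. "not a dynamic executable"
        libs.add(name)
    return libs


def compare_ldd(upstream_output, rebuild_output):
    """Return (missing_in_rebuild, extra_in_rebuild) as sorted lists."""
    upstream = ldd_libraries(upstream_output)
    rebuild = ldd_libraries(rebuild_output)
    return sorted(upstream - rebuild), sorted(rebuild - upstream)


# Made-up sample text, for illustration only.
UPSTREAM = """\
\tlibwrap.so.0 => /usr/lib64/libwrap.so.0 (0x00002b0000001000)
\tlibc.so.6 => /lib64/libc.so.6 (0x00002b0000002000)
\t/lib64/ld-linux-x86-64.so.2 (0x00002b0000000000)
"""
REBUILD = """\
\tlibc.so.6 => /lib64/libc.so.6 (0x00002b0000002000)
\t/lib64/ld-linux-x86-64.so.2 (0x00002b0000000000)
"""
missing, extra = compare_ldd(UPSTREAM, REBUILD)
print("missing in rebuild:", missing)
print("extra in rebuild:", extra)
```

A library that shows up as missing in the rebuild is the signature of
a build environment that lacked the corresponding -devel package, as
in the tcpd.h example.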

CentOS used yum in part because of the people involved at the time and
the active development it was under in the RHL 8 and 9 days; also, the
sources for the server side required by 'up2date' were unavailable,
etc.

> 3. whether the licenses force a maintainer to push the resulting
> distribution upstream.

So far as I know, CentOS has never been _asked_ to do so, and is under
no obligation to provide its binary product to anyone. That said,
clearly many upstream use CentOS' product.

> 4. In which cases the maintainer is liable for the resulting
> distribution rpms, binaries, etc...

You will have to consult competent counsel for your jurisdiction, as
matters of liability are outside the scope of what I will opine on
here.

> It would be VERY helpful if I can clarify those questions. It might
> spend us 1 CP.

I have requested access to such 'real hardware' builders here, on the
Marist s390 list, and by private communication to potential donors,
and been greeted with a deafening silence as to offers. As I noted in
my earlier post, IBM came through for me, and I restarted another pass
before composing this reply.

With my best regards,

-- Russ herrold