On 2/24/2011 8:59 AM, Lamar Owen wrote:
> On Wednesday, February 23, 2011 11:11:57 am Les Mikesell wrote:
>> Aren't there some tools that are designed to find binary similarities
>> (think anti-virus or things that try to detect copied code sections)?
>>
>> Can't whatever they are comparing here be used directly in some clever
>> test to predict what would be needed instead of a trial-and-error
>> approach?
>>
>> Even if no one can come up with a way to extrapolate the dependencies
>> from the RHEL binary, the task seems like something that would scale
>> with more build machines and more people to stage the trial-and-error
>> builds.
>
> Please see if what you're talking about is found in
> http://mirror.centos.org/centos/build/ (and note the dates).
>
> Note that the built results of some of those SRPMs will be needed for
> later SRPM builds, and those may need to be done in a deterministic
> order. IOW, not something easily parallelized.

So you want me to build gcc starting from nothing but SRPMs? I think
the premise is wrong here.

> And do note that some of the packages take a very long time to build;
> things like KDE, for instance. So if multiple iterations have to happen
> on a package that takes a long time to build.....
>
> Here's you some homework, Les: build a 'builddep solver' that will tell
> you in what order the packages need to be built, starting from the bare
> minimum buildroot. Of course, you don't necessarily know the bare
> minimum buildroot, and the upstream isn't telling you.

Again, your premise is wrong. I'm not claiming _I_ can solve this
problem, any more than I could have added all the parts to Linux that
Linus didn't write himself. My contention is that if Linus had made
people prove they were worthy before letting them have a copy of his
work, we wouldn't be talking about Linux as something useful today.
Likewise, if enough smart people have access to each other's work on
this problem, someone will improve the state of the art. But I don't
see the point of a bare minimum buildroot anyway. (The ordering half
of that homework is mechanical, in any case; see the first sketch at
the end of this message.)

> And make it match the released upstream at the binary dependency level.

It's not enough just to get the packages to build; the goal is binary
and bug-for-bug compatibility. That means you need access to all of the
already-built binary RPMs to test against, and since you aren't going
to change the SRPMs to fix any missing dependencies, you might as well
throw them all into your build environment from the start, then repeat
the run later in an environment built from your first-run output to
purge any possible static inclusions of the trademarked parts. Checking
the match at the binary dependency level is at least scriptable; see
the second sketch at the end of this message.

The hard part comes when a shipping RHEL package won't actually rebuild
to match its binary, and you have to guess whether some Fedora or RHEL
5.x version of a needed library happened to be available to the
original build without the matching dependency being noted in the SRPM
or that library being included in the binary RPMs.

If I've gotten that wrong, it still doesn't change my belief that
either someone, somewhere is smart enough to come up with an automated
approach, or that a lot of people using trial and error across a larger
number of machines could do it faster. But I'm willing to admit that is
approaching religion, even though there is a lot of history to back up
the idea of getting unexpected improvements by opening projects.

-- 
Les Mikesell
lesmikesell at gmail.com
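Sketch 1: the mechanical half of the 'builddep solver' homework above
is just a topological sort (Kahn's algorithm) over a BuildRequires
graph. This is only a rough illustration, not a working solver: the
deps mapping is hypothetical input, and in practice you would have to
extract BuildRequires from each SRPM and resolve Provides back to
source packages, which is exactly where the unknown buildroot bites.

    # Minimal build-order sketch: topological sort over BuildRequires.
    # `deps` maps each source package to the set of source packages
    # whose built results it needs. This input is hypothetical; real
    # data would come from parsing the SRPMs themselves.

    from collections import deque

    def build_order(deps):
        """Return a build order; packages with no unbuilt deps first."""
        # Track unmet in-tree dependencies for each package (requires
        # on things outside the set are assumed already in the root).
        pending = {pkg: set(d for d in reqs if d in deps)
                   for pkg, reqs in deps.items()}
        # Reverse index: which packages are waiting on each package.
        waiting = {}
        for pkg, reqs in pending.items():
            for d in reqs:
                waiting.setdefault(d, set()).add(pkg)
        ready = deque(sorted(p for p, reqs in pending.items() if not reqs))
        order = []
        while ready:
            pkg = ready.popleft()
            order.append(pkg)
            for w in sorted(waiting.get(pkg, ())):
                pending[w].discard(pkg)
                if not pending[w]:
                    ready.append(w)
        if len(order) < len(deps):
            # Circular BuildRequires (gcc needing gcc, say) have to be
            # bootstrapped by hand or seeded from existing binaries.
            cycle = sorted(set(deps) - set(order))
            raise ValueError("circular BuildRequires among: %s" % cycle)
        return order

    if __name__ == "__main__":
        # Toy example with made-up packages:
        deps = {
            "glibc": set(),
            "zlib": {"glibc"},
            "openssl": {"zlib"},
            "curl": {"openssl", "zlib"},
        }
        print(build_order(deps))  # ['glibc', 'zlib', 'openssl', 'curl']

Note the ordering itself is the easy part; the solver can't tell you
what was silently present in the original buildroot.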
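Sketch 2: a rough check of 'match the upstream at the binary dependency
level' is to diff the Requires and Provides that rpm records in the
upstream binary RPM against your rebuilt one. The file names below are
made-up examples, and this assumes an rpm binary on the PATH; it is a
first-pass check, not proof of bug-for-bug compatibility.

    # Compare the Requires/Provides recorded in an upstream binary RPM
    # against a rebuilt one. File paths are hypothetical examples.

    import subprocess

    def rpm_query(rpmfile, tag):
        """Return the set of entries rpm reports for the given tag
        ('requires' or 'provides') on an RPM file."""
        out = subprocess.check_output(["rpm", "-qp", "--" + tag, rpmfile])
        return set(line.strip()
                   for line in out.decode().splitlines()
                   if line.strip())

    def compare(upstream, rebuilt):
        """Print dependency-level differences between the two RPMs."""
        for tag in ("requires", "provides"):
            theirs = rpm_query(upstream, tag)
            ours = rpm_query(rebuilt, tag)
            for missing in sorted(theirs - ours):
                print("missing %s: %s" % (tag, missing))
            for extra in sorted(ours - theirs):
                print("extra   %s: %s" % (tag, extra))

    if __name__ == "__main__":
        # Hypothetical file names:
        compare("upstream/foo-1.0-1.el6.x86_64.rpm",
                "rebuilt/foo-1.0-1.el6.x86_64.rpm")

A "missing" requires line usually means something extra was visible in
your buildroot; an "extra" one usually means something was missing and
the build fell back to bundling or a different code path.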