On 2/24/2011 8:59 AM, Lamar Owen wrote:
> On Wednesday, February 23, 2011 11:11:57 am Les Mikesell wrote:
>> Aren't there some tools that are designed to find binary similarities
>> (think anti-virus or things that try to detect copied code sections)?
>>
>> Can't whatever they are comparing here be used directly in some clever
>> test to predict what would be needed instead of a trial-and-error
>> approach?
>>
>> Even if no one can come up with a way to extrapolate the dependencies
>> from the RHEL binary, the task seems like something that would scale
>> with more build machines and more people to stage the trial-and-error
>> builds.
>
> Please see if what you're talking about is found in
> http://mirror.centos.org/centos/build/ (and note the dates).
>
> Note that the built results of some of those SRPMs will be needed for
> later SRPM builds, and those may need to be done in a deterministic
> order. IOW, not something easily parallelized.

So you want me to build gcc starting from nothing but SRPMs? I think
the premise is wrong here.

> And do note that some of the packages take a very long time to build;
> things like KDE, for instance. So if multiple iterations have to happen
> on a package that takes a long time to build.....
>
> Here's you some homework, Les: build a 'builddep solver' that will tell
> you in what order the packages need to be built, starting from the bare
> minimum buildroot. Of course, you don't necessarily know the bare
> minimum buildroot, and the upstream isn't telling you.

Again, your premise is wrong. I'm not claiming _I_ can solve this
problem, any more than I could have added all the parts to Linux that
Linus didn't write himself. My contention is that if Linus had made
people prove they were worthy before letting them have a copy of his
work, we wouldn't be talking about Linux as something useful today.
Likewise, if enough smart people have access to each other's work on
this problem, someone will improve the state of the art. But I don't
see the point of a bare minimum buildroot anyway. (The ordering half
of that homework is mechanical, in any case; see the first sketch at
the end of this message.)

> And make it match the released upstream at the binary dependency level.

It's not enough just to get the packages to build; the goal is binary
and bug-for-bug compatibility. That means you need access to all of the
already-built binary RPMs to test against, and since you aren't going
to change the SRPMs to fix any missing dependencies, you might as well
throw them all into your build environment from the start, then repeat
the run later in an environment built from your first-run output to
purge any possible static inclusions of the trademarked parts. Checking
the match at the binary dependency level is at least scriptable; see
the second sketch at the end of this message.

The hard part comes when a shipping RHEL package won't actually rebuild
to match its binary, and you have to guess whether some Fedora or RHEL
5.x version of a needed library happened to be available to the
original build without the matching dependency being noted in the SRPM
or that library being included in the binary RPMs.

If I've gotten that wrong, it still doesn't change my belief that
either someone, somewhere is smart enough to come up with an automated
approach, or that a lot of people using trial and error across a larger
number of machines could do it faster. But I'm willing to admit that is
approaching religion, even though there is a lot of history to back up
the idea of getting unexpected improvements by opening projects.

-- 
Les Mikesell
lesmikesell at gmail.com
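Sketch 1: the mechanical half of the 'builddep solver' homework above
is just a topological sort (Kahn's algorithm) over a BuildRequires
graph. This is only a rough illustration, not a working solver: the
deps mapping is hypothetical input, and in practice you would have to
extract BuildRequires from each SRPM and resolve Provides back to
source packages, which is exactly where the unknown buildroot bites.

    # Minimal build-order sketch: topological sort over BuildRequires.
    # `deps` maps each source package to the set of source packages
    # whose built results it needs. This input is hypothetical; real
    # data would come from parsing the SRPMs themselves.

    from collections import deque

    def build_order(deps):
        """Return a build order; packages with no unbuilt deps first."""
        # Track unmet in-tree dependencies for each package (requires
        # on things outside the set are assumed already in the root).
        pending = {pkg: set(d for d in reqs if d in deps)
                   for pkg, reqs in deps.items()}
        # Reverse index: which packages are waiting on each package.
        waiting = {}
        for pkg, reqs in pending.items():
            for d in reqs:
                waiting.setdefault(d, set()).add(pkg)
        ready = deque(sorted(p for p, reqs in pending.items() if not reqs))
        order = []
        while ready:
            pkg = ready.popleft()
            order.append(pkg)
            for w in sorted(waiting.get(pkg, ())):
                pending[w].discard(pkg)
                if not pending[w]:
                    ready.append(w)
        if len(order) < len(deps):
            # Circular BuildRequires (gcc needing gcc, say) have to be
            # bootstrapped by hand or seeded from existing binaries.
            cycle = sorted(set(deps) - set(order))
            raise ValueError("circular BuildRequires among: %s" % cycle)
        return order

    if __name__ == "__main__":
        # Toy example with made-up packages:
        deps = {
            "glibc": set(),
            "zlib": {"glibc"},
            "openssl": {"zlib"},
            "curl": {"openssl", "zlib"},
        }
        print(build_order(deps))  # ['glibc', 'zlib', 'openssl', 'curl']

Note the ordering itself is the easy part; the solver can't tell you
what was silently present in the original buildroot.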
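Sketch 2: a rough check of 'match the upstream at the binary dependency
level' is to diff the Requires and Provides that rpm records in the
upstream binary RPM against your rebuilt one. The file names below are
made-up examples, and this assumes an rpm binary on the PATH; it is a
first-pass check, not proof of bug-for-bug compatibility.

    # Compare the Requires/Provides recorded in an upstream binary RPM
    # against a rebuilt one. File paths are hypothetical examples.

    import subprocess

    def rpm_query(rpmfile, tag):
        """Return the set of entries rpm reports for the given tag
        ('requires' or 'provides') on an RPM file."""
        out = subprocess.check_output(["rpm", "-qp", "--" + tag, rpmfile])
        return set(line.strip()
                   for line in out.decode().splitlines()
                   if line.strip())

    def compare(upstream, rebuilt):
        """Print dependency-level differences between the two RPMs."""
        for tag in ("requires", "provides"):
            theirs = rpm_query(upstream, tag)
            ours = rpm_query(rebuilt, tag)
            for missing in sorted(theirs - ours):
                print("missing %s: %s" % (tag, missing))
            for extra in sorted(ours - theirs):
                print("extra   %s: %s" % (tag, extra))

    if __name__ == "__main__":
        # Hypothetical file names:
        compare("upstream/foo-1.0-1.el6.x86_64.rpm",
                "rebuilt/foo-1.0-1.el6.x86_64.rpm")

A "missing" requires line usually means something extra was visible in
your buildroot; an "extra" one usually means something was missing and
the build fell back to bundling or a different code path.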