[CentOS-devel] Updates from today

On Fri, Mar 11, 2011 at 3:47 AM, Johnny Hughes <johnny at centos.org> wrote:
> On 03/10/2011 07:55 PM, Nico Kadel-Garcia wrote:
>> On Thu, Mar 10, 2011 at 7:18 PM, Johnny Hughes <johnny at centos.org> wrote:
>>
>>> Why do you keep talking about a SCM system.  Everything you want to know
>>> is in the SRPMS.  If you want to create a git repo of them, have at it.
>>>  You like SVN better, use it.  CVS your thing, use that.  Look for
>>> .centos files and pull them in (that and the kernel is all we change).
>>>
>>> I work with SRPMS, not with an SCM system.  I like SRPMS, they are a SCM
>>> system of their own.
>>
>> Because they're really not. Patches can be altered, and .spec files
>> altered, without any logging or notification of the change. Release
>> numbers and revision numbers are hard-coded, not trackable.
>>
>>> We do not change what upstream has in their SRPMS (except when we have
>>> to) ... we don't even unpack them unless we need to change them.  We
>>> submit them to mock to build.  Every patch we create, every change we
>>> make, it is in the SRPM.
>>
>> That's..... a pretty odd approach. Not inconceivable, but *exactly*
>> the sort of information not in the "do it yourself, it's easy"
>> approach.
>>
>>> Why is this so hard to understand?
>>
>> Because it's amazingly poor software management. SRPM's are binaries
>> and make change tracking quite awkward, and rely entirely on the
>> developer to consistently report changes in the %changelog.
>> That's..... really awkward.
>
> It is not awkward at all and it does not require anything.
>
> diff -uNrp <original>.spec <modified>.spec > spec.diff
>
> diff -uNrp SOURCES.old SOURCES > sources.diff

No, extracting the contents of the SRPM's into distinguishable
directories is quite inefficient and involves a great deal of
unnecessary overlap. Normally, I would use "rpm -U foo.src.rpm" to
extract the contents, but doing that with several SRPM's drops them
all into the same SOURCES directory, where files that share a name
but have changed, such as README and init scripts, overwrite each
other.

I'm going to explain this one for the folks new to comparing SRPM's.
For this case, you can use something like:

      mkdir foo1 foo2
      (cd foo1 && rpm2cpio ../foo1.src.rpm | cpio -id)
      (cd foo2 && rpm2cpio ../foo2.src.rpm | cpio -id)
      diff -u foo1/*.spec foo2/*.spec
      diff -ur --exclude='*.spec' foo1 foo2

This is workable, but when you start looking at the minor changes
between releases, it gets awkward, and to check the history of the
changes, you have to read the "%changelog" in the .spec files. I
don't know *anyone* else who maintains those by hand on a big project,
rather than pulling them from the source control logs.
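
For the changelog comparison specifically, here is a minimal sketch.
The helper name "spec_changelog" is my own invention, not part of rpm,
and it assumes the "%changelog" section runs to the end of the .spec
file, which is the usual layout but not guaranteed.

```shell
# Print everything that follows the %changelog line of a .spec file.
# Assumes %changelog is the last section in the file.
spec_changelog() {
    awk '
        /^%changelog/ { found = 1; next }   # skip the marker line itself
        found         { print }             # print every line after it
    ' "$1"
}
```

With the two directories unpacked as above, the entries can be saved
and compared with something like
"spec_changelog foo1/foo.spec > old.log", the same for foo2, and then
"diff -u old.log new.log". It still only shows what the packager chose
to write down.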

> Now you have everything.

*NOW* you have everything cleanly accessed and separated, but you
don't have a log for each change, nor any indication of which changes
belong together as a set.
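
To show what I mean by logs and sets, here is a rough sketch of
recording each unpacked SRPM as one commit in a git repository. This
is not how CentOS or upstream does anything, the function name
"import_srpm_tree" is hypothetical, and it assumes git is installed
and a commit identity is configured.

```shell
# Record one unpacked SRPM directory as a single commit.
# usage: import_srpm_tree <repo-dir> <unpacked-srpm-dir> <commit-message>
import_srpm_tree() {
    repo=$1
    src=$2
    msg=$3
    mkdir -p "$repo"
    # Create the repository on first use.
    [ -d "$repo/.git" ] || git init -q "$repo"
    # Remove the previous release's files so dropped patches show up
    # as deletions in the history.
    find "$repo" -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
    # Copy in the new release, including any dotfiles.
    cp -R "$src"/. "$repo"/
    # One commit per release: the spec change and its patches stay together.
    (cd "$repo" && git add -A && git commit -q -m "$msg")
}
```

Running it once for foo1 and once for foo2 leaves two commits, and
"git log -p" then shows the .spec and patch changes for each release
as one set. Note that the commit step fails if two releases are
byte-for-byte identical, since there is nothing to record.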

> People making the SRPM don't have to REPORT anything, you see it right
> there.  The vast majority of our changes get rolled in over and over
> exactly as they are after the first time we create them.  We are not
> making technical changes.  PatchA just gets moved from the old one to
> the new one and reapplied, etc.  The whole goal is NOT to change anything.

But since you don't have source control, you don't actually know that
except by manually going back and disassembling the SRPM's.

> Red Hat does not give us an SCM to look at, yet we seem to be able to
> build the software.

Well, yes. But they're not inviting people into the OS building
process, nor saying "where is all our help!!!????"

>>> If we were maintaining changes in 2500 SRPMS per distribution (times 3
>>> or 4 distributions), we would do it in an SCM program, but since we just
>>> BUILD the vast majority of these packages without changes, maintaining
>>> an SCM of 10,000 packages when we change less than 1% of them does not
>>> make much sense.
>>
>> No, no, you'd just SCM the ones you alter, and the build system
>> (which needed design to provide a bootstrappable environment.)

> We build software, we are not in the business of teaching you how to
> build software or producing something that makes it so everyone can
> rebuild their own distribution.  Our goal is not a reproducible system
> so YOU can build software, it is for US to produce software.  If you are
> looking for a distribution that teaches YOU to build things, get Gentoo
> or Linux From Scratch.

Johnny, I *teach* people how to build software in RPM structures, in
source control, and to stabilize their build environments. CentOS has
been a great teaching environment, it's partly why I work with it. But
now you've got me nervous about what's going on upstream.

>>> You have the SRPMS, you have example config files, you have the mock
>>> that we use, you have the script that we build the software tree with,
>>> you have the file that we use to compare RPMs with upstream.  Those are
>>> what we use.
>>>
>>>> I've really been hoping for public access to the build structure. "You
>>>> can do it yourself" is not as helpful as the kind of public access to
>>>> build structures that Dag publishes, and has been suggesting.
>>>
>>> The build structure is NOT necessarily a public machine.  The machines
>>> that get built on do not necessarily belong to CentOS.  My company, for
>>> example, provides some resources that I build on.  You can not have
>>> access to my company's internal network or their machines.
>>
>> Excuse me, I didn't say it should be. But access to the /etc/mock
>> files, *in the SCM I just described*, would be helpful.

> The mock files are default and point to the default CentOS trees.  I
> gave you an example mock file.

Oh, dear. Now I see a problem.

Take a good look at the "extras" repository in the default "mock"
setups. Those point to the EPEL repository, and are enabled for the
entire EPEL repository. This means that you may be introducing
dependencies that have *nothing* to do with your primary or local
codebase of CentOS. EPEL avoids overwriting RHEL dependencies, but
that's not the same thing as introducing parallel components that
would provide the same functionality and create separate dependency
trees.

That's been a problem with centosplus, epel, and rpmforge when they
have overlapping contents but differing dependencies. This way lies
madness.
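
A quick way to check a given mock config for this, as a sketch only.
The function name "epel_sections" is hypothetical, and it assumes the
.cfg embeds an ordinary yum.conf with "[section]" headers followed by
"baseurl=" or "mirrorlist=" lines, so adjust it to whatever your
/etc/mock files actually contain.

```shell
# List the [section] names in a mock .cfg whose active baseurl or
# mirrorlist lines mention EPEL. Commented-out lines are ignored.
epel_sections() {
    awk '
        /^\[.*\]/ { section = $0 }              # remember the current repo section
        /^(baseurl|mirrorlist)[ \t]*=/ && tolower($0) ~ /epel/ && !seen[section]++ {
            print section                        # report each section once
        }
    ' "$1"
}
```

Point it at one of the files under /etc/mock. Any section it prints is
pulling packages from EPEL during the build, and is worth disabling or
replacing if you only want dependencies from the CentOS trees.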

>>> Dag changes SRPMS and source code ... we rebuild someone else's source
>>> code.  That is why we don't maintain an SCM.
>>
>> But you do change them! By your own admission above, you've altered
>> 100 packages. That's plenty to justify an SCM.
>
> People are already bitching that it TAKES TOO LONG to get the software
> and what you want to do is add more things to the process to make it
> easy for you to reproduce what we do.  This conversation is about what

I want it to be easier for us to help, to knock out three or four
SRPM's when we can pry time free. As it is, we've got to devote days
to building up our own bootstrapped binary repository, and trip over
the same failures you guys are encountering, instead of being able to
use your fixes from in-house SRPM's that *you haven't published*, so I
can't compare notes and avoid stepping on them.

> makes it easier for you to rebuild the upstream sources and has nothing
> to do with what the purpose of the CentOS Project is.  What you need to
> do is start your own project called "Enterprise Linux from Sources" and
> your goals need to be to design, maintain, and teach someone exactly how
> to rebuild the upstream sources ... those are not the goals of CentOS.
> It would be a good project, it is just not THIS project.

I don't have the physical resources in my home, and the start up costs
would be ridiculous.