[CentOS-devel] Importing CentOS-6 Sources into git.centos.org

Fri Aug 22 02:13:14 UTC 2014
Nico Kadel-Garcia <nkadel at gmail.com>

On Thu, Aug 21, 2014 at 1:54 PM, Stephen John Smoogen <smooge at gmail.com> wrote:

> On 20 August 2014 19:29, Nico Kadel-Garcia <nkadel at gmail.com> wrote:
>> On Wed, Aug 20, 2014 at 5:24 PM, Karanbir Singh <mail-lists at karan.org> wrote:

>> > I'd like to be able to offer rsync from there as well, but there are a
>> > few challenges that need resolved first. For the binary content cache,
>> > we can likely run the rsync instance from the backup machine so there is
>> > no network load on the production box. For the git repos it's a bit
>> > harder since there are private or working-in-progress repos in there as
>> > well, and we need to find a way to mask those out.
>> >
>> > Certainly worth trying to get to.
>> Use GPG signed git tags to assure provenance, and the repository can
>> be safely cloned. Rsyncing a git repo is like rsyncing a CVS or
>> Subversion repository. Even small changes in the midst of the rsync
>> operation can corrupt the underlying database.
> Could you please cut down the broken record please? There is no sign there
> will be GPG signed git tags in the near future and your constant harping on
> it is not going to make it happen any faster.

Sorry it bothers you. I didn't bring up using rsync to make mirrors.
I'm trying to get across the provenance and propagation problem.

Besides the potential corrupt snapshot problem, there are the
inevitable discrepancies between the mirrors and git.centos.org
itself. Content is likely to differ in small ways among the mirrors,
because each rsync-based snapshot reflects some point in the past. I
assume that some of the individual repos are changing during the
overall rsync update period, unless they're all done in parallel,
which would be *really* nasty.
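
To make the discrepancy problem concrete, here is a rough sketch of
how one might compare the refs a mirror advertises against the
origin. It parses the output of "git ls-remote" and reports refs that
are missing, extra, or pointing at a different object. The function
names are made up for illustration, and matching refs only show that
the mirror agreed with the origin at that moment. They say nothing
about who produced the content.

```python
import subprocess


def ls_remote(url):
    """Return the raw `git ls-remote` output for a repository URL or path."""
    result = subprocess.run(
        ["git", "ls-remote", url],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_ls_remote(text):
    """Parse `git ls-remote` output into {ref_name: object_id}."""
    refs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "<object id><whitespace><ref name>".
        object_id, ref_name = line.split(None, 1)
        refs[ref_name] = object_id
    return refs


def diff_refs(origin, mirror):
    """Report refs that are missing, extra, or different on the mirror."""
    shared = origin.keys() & mirror.keys()
    return {
        "missing": sorted(set(origin) - set(mirror)),
        "extra": sorted(set(mirror) - set(origin)),
        "changed": sorted(r for r in shared if origin[r] != mirror[r]),
    }
```

Something like diff_refs(parse_ls_remote(ls_remote(origin_url)),
parse_ls_remote(ls_remote(mirror_url))) would then have to be run for
every one of the mirrored repositories, which is part of why this gets
awkward.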

Unless... is there a top-level directory to use for an rsync mirror?
That's going to be a pretty bulky rsync operation, with over 6000
subdirectories and the amount of churn in any of the modified git

Anyway, verification of the consistency of all the mirrored
repositories becomes awkward. There's also the lack of site
verification in the unencrypted and unsigned rsync protocol, which I'd
not even thought about for git.centos.org. That puts it right into the
"people cloning from each other's unsecured repos locally" world, in
this case cloning from the rsync mirrors. And it directly brings up
the "verify the provenance of local repos" problem that was discounted
by some when I brought up the problem earlier.
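
For what "verify the provenance of local repos" could look like if
signed tags were ever available, here is a minimal sketch. It lists
the tags in a local clone and runs "git verify-tag" on each one,
collecting those that fail. The function names and the "run" parameter
are hypothetical, added only so the logic can be exercised without a
real repository. Note that "git verify-tag" passes for a good
signature from any key in the local keyring, so the keyring itself
would need to hold only the keys one actually trusts.

```python
import subprocess


def list_tags(repo, run=subprocess.run):
    """Return the tag names in the local clone at `repo`."""
    result = run(
        ["git", "-C", repo, "tag", "--list"],
        capture_output=True, text=True, check=True,
    )
    return [tag for tag in result.stdout.splitlines() if tag]


def unverified_tags(repo, run=subprocess.run):
    """Return tags for which `git verify-tag` exits non-zero.

    That covers unsigned tags, bad signatures, and signatures made
    with keys that are not in the local keyring.
    """
    failed = []
    for tag in list_tags(repo, run):
        result = run(
            ["git", "-C", repo, "verify-tag", tag],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            failed.append(tag)
    return failed
```

An empty list from unverified_tags("/path/to/clone") would mean every
tag in that clone carries a signature the local keyring accepts.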

Several folks did bring up the point of "git.centos.org has an SSL
key, what's not secure about it?"  If we're using rsync mirrors, we're
relying on someone else's mirror site to be secure, as well. And we're
probably relying on unencrypted rsync to git.centos.org, itself, to
support those mirrors. And we're once again open to someone polluting
the data stream with a fake repo.

Now, *if* our friends at git.centos.org want to help protect that data
stream for mirror clients, they can consider using something like
rsync with ssh keys and the old "validate-rsync.sh" script as a
ForceCommand. A site that wants to be a mirror would need a relevant
private SSH key, and unlocking it for rsync use is their problem. But
that would at least help assure provenance between the mirror sites
and git.centos.org.
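
As an illustration of the forced-command idea, here is a minimal
Python stand-in for that kind of check. It is not the original
"validate-rsync.sh", just a sketch of the same approach: sshd puts the
command the client asked for into SSH_ORIGINAL_COMMAND, and the
wrapper refuses anything that is not a plain rsync server invocation.
Requiring "--sender" so that mirrors can only pull is my own
assumption, as are the function names and the exact set of rejected
characters.

```python
import os
import shlex
import sys

# Shell metacharacters that have no business in an rsync server command.
FORBIDDEN = set("&;|<>`$(){}\n")


def is_allowed(command):
    """Accept only `rsync --server --sender ...` with no shell tricks."""
    if not command or any(ch in FORBIDDEN for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected.
        return False
    return argv[:3] == ["rsync", "--server", "--sender"]


def main():
    """Entry point for use as an SSH forced command."""
    command = os.environ.get("SSH_ORIGINAL_COMMAND", "")
    if not is_allowed(command):
        sys.stderr.write("Rejected\n")
        sys.exit(1)
    argv = shlex.split(command)
    # Run rsync directly, without going through a shell.
    os.execvp(argv[0], argv)
```

A script that calls main() would be wired in either with ForceCommand
in sshd_config or with a command="..." option on the mirror's key in
authorized_keys.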

The repercussions of using rsync for this start adding up pretty fast.