On Thu, Aug 21, 2014 at 1:54 PM, Stephen John Smoogen smooge@gmail.com wrote:
On 20 August 2014 19:29, Nico Kadel-Garcia nkadel@gmail.com wrote:
On Wed, Aug 20, 2014 at 5:24 PM, Karanbir Singh mail-lists@karan.org wrote:
I'd like to be able to offer rsync from there as well, but there are a few challenges that need to be resolved first. For the binary content cache, we can likely run the rsync instance from the backup machine so there is no network load on the production box. For the git repos it's a bit harder, since there are private or work-in-progress repos in there as well, and we need to find a way to mask those out.
Certainly worth trying to get to.
Use GPG signed git tags to assure provenance, and the repository can be safely cloned. Rsyncing a git repo is like rsyncing a CVS or Subversion repository. Even small changes in the midst of the rsync operation can corrupt the underlying database.
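For anyone unfamiliar with what checking a signed tag involves, here is a minimal sketch that wraps "git verify-tag". The function names and the repository path and tag in the usage note are made up for illustration. It assumes git and gpg are installed and that the signer's public key is already in the local keyring.

```python
import subprocess
from typing import List


def verify_tag_command(repo_dir: str, tag: str) -> List[str]:
    """Build the git command that checks the GPG signature on a tag."""
    # "git -C <dir>" runs git as if it had been started in <dir>.
    return ["git", "-C", repo_dir, "verify-tag", tag]


def tag_is_verified(repo_dir: str, tag: str) -> bool:
    """Return True if git reports a verifiable signature for the tag.

    git verify-tag exits non-zero when the tag is unsigned, the signature
    is bad, or the signing key is not available locally.
    """
    result = subprocess.run(
        verify_tag_command(repo_dir, tag),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

A call such as tag_is_verified("/srv/clones/somepkg", "some-tag") only says that the signature checks out against a key in the keyring. Whether that key deserves trust is a separate decision.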
Could you please cut down on the broken record? There is no sign there will be GPG signed git tags in the near future and your constant harping on it is not going to make it happen any faster.
Sorry it bothers you. I didn't bring up using rsync to make mirrors. I'm trying to get across the provenance and propagation problem.
Besides the potential corrupt snapshot problem, there are the inevitable discrepancies between the mirrors and git.centos.org itself. Content is likely to differ in small ways among the mirrors, because each rsync-based snapshot reflects some point in the past. I assume that some of the individual repos are changing during the overall rsync update period, unless they're all done in parallel, which would be *really* nasty.
Unless.... Is there a top level directory to use for an rsync mirror? That's going to be a pretty bulky rsync operation, with over 6000 subdirectories and the amount of churn in any of the modified git repos.
Anyway, verification of the consistency of all the mirrored repositories becomes awkward. There's also the lack of site verification in the unencrypted and unsigned rsync protocol, which I'd not even thought about for git.centos.org. That puts it right into the "people cloning from each other's unsecured repos locally" world, in this case cloning from the rsync mirrors. And it directly brings up the "verify the provenance of local repos" problem that was discounted by some when I brought up the problem earlier.
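To make the consistency check concrete, here is a rough sketch of comparing the refs of one repository on git.centos.org against the same repository on a mirror. It works on the text that "git ls-remote" prints (one "<sha><TAB><ref>" pair per line). The function names are made up for illustration, and it only detects drift. It says nothing about which side is right.

```python
from typing import Dict, Optional, Tuple


def parse_ls_remote(output: str) -> Dict[str, str]:
    """Map ref name -> commit id from `git ls-remote` style output."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Each line is "<sha>\t<refname>".
        sha, ref = line.split("\t", 1)
        refs[ref] = sha
    return refs


def diff_refs(
    origin: Dict[str, str], mirror: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return refs whose commit ids differ, as ref -> (origin_sha, mirror_sha).

    A ref missing on one side shows up with None for that side.
    """
    differences = {}
    for ref in sorted(set(origin) | set(mirror)):
        if origin.get(ref) != mirror.get(ref):
            differences[ref] = (origin.get(ref), mirror.get(ref))
    return differences
```

The inputs would come from running "git ls-remote" against each side. With over 6000 repositories, that is one such comparison per repository per mirror, which is part of why this gets awkward.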
Several folks did bring up the point of "git.centos.org has an SSL key, what's not secure about it?" If we're using rsync mirrors, we're relying on someone else's mirror site to be secure, as well. And we're probably relying on unencrypted rsync to git.centos.org, itself, to support those mirrors. And we're once again open to someone polluting the data stream with a fake repo.
Now, *if* our friends at git.centos.org want to help protect that data stream for mirror clients, they can consider using something like rsync with ssh keys and the old "validate-rsync.sh" script as a ForceCommand. A site that wants to be a mirror would need a relevant private SSH key, and unlocking it for rsync use is their problem. But that would at least help assure provenance between the mirror sites and git.centos.org.
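For those who haven't seen that kind of wrapper, here is a sketch of the idea in Python. It is not a copy of the old script, only the same approach: sshd puts whatever the client asked to run into SSH_ORIGINAL_COMMAND, and the forced command decides whether to run it. The function names are made up, and requiring "--server --sender" assumes the usual way an rsync client invokes the remote side when pulling.

```python
import os
import shlex
import sys

# Characters that would let a client chain or redirect shell commands.
FORBIDDEN = set("&(){};<>`|$\n")


def is_allowed_rsync_command(command: str) -> bool:
    """Accept only a read-only rsync server invocation."""
    if any(ch in FORBIDDEN for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar malformed input.
        return False
    # Assumption: a pulling client asks for "rsync --server --sender ...".
    return argv[:3] == ["rsync", "--server", "--sender"]


def main() -> int:
    """Entry point for the forced command; replaces itself with rsync."""
    command = os.environ.get("SSH_ORIGINAL_COMMAND", "")
    if not is_allowed_rsync_command(command):
        print("Rejected", file=sys.stderr)
        return 1
    # No shell is involved: rsync is executed directly with the parsed arguments.
    argv = shlex.split(command)
    os.execvp(argv[0], argv)
    return 0  # not reached if execvp succeeds
```

A script that calls main() would be installed as the ForceCommand, or as the command= option on each mirror's key in authorized_keys. Restricting which paths a given key may read would be a further check on top of this.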
The repercussions of using rsync for this start adding up pretty fast.