On 08/22/2014 03:13 AM, Nico Kadel-Garcia wrote:
Besides the potential corrupt snapshot problem, there are the inevitable discrepancies between the mirrors and git.centos.org itself. Content is likely to differ in small ways among the mirrors, due to the rsync-based snapshot being in the past. I assume that some of the individual repos are changing during the overall rsync update period, unless they're all done in parallel, which would be *really* nasty.
It does not matter... the content in the binary cache is hashed. Have you looked at how things are set up?
Unless.... Is there a top level directory to use for an rsync mirror? That's going to be a pretty bulky rsync operation, with over 6000 subdirectories and the amount of churn in any of the modified git repos.
I don't understand that statement. Are you questioning rsync's ability to handle 6k dirs?
Anyway, verification of the consistency of all the mirrored repositories becomes awkward. There's also the lack of site
To be clear, I don't think the aim here is to set up content mirrors for general consumption; the aim is to have an rsync target that lets people run their own mirrors. And we don't need any real sync between git and binary sources, since they are tracked in git as hashed objects. Something missing (or corrupt) will get flagged up right away.
I realise we have an issue where some of the hashes are sha1s and others are sha256s, and the checking code, client side, needs to check the length and use the right algo. But that's something which should get fixed as we all end up using the same tools and conventions.
verification in the unencrypted and unsigned rsync protocol, which I'd not even thought about for git.centos.org. That puts it right into the "people cloning from each other's unsecured repos locally" world, in this case cloning from the rsync mirrors. And it directly brings up the "verify the provenance of local repos" problem that was discounted by some when I brought up the problem earlier.
Several folks did bring up the point of "git.centos.org has an SSL key, what's not secure about it?" If we're using rsync mirrors, we're relying on someone else's mirror site to be secure, as well. And we're probably relying on unencrypted rsync to git.centos.org, itself, to support those mirrors. And we're once again open to someone polluting the data stream with a fake repo.
Right, so the confusion comes from other mirrors; that's certainly not the aim here. It's all for local consumption. And I don't know what is involved in getting rsync around an SSL wrapper. But the fact that the metadata in the git repos has the corresponding hashes should be good enough for validating per file. Doing this for the entire tree, every possible piece, would be quite hard, admittedly.
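
For illustration only, here is a rough sketch of the client-side check discussed above: pick the algorithm from the length of the recorded hash (40 hex characters for sha1, 64 for sha256), then validate each file and flag anything missing or corrupt. The function names and the "<hash> <relative path>" manifest layout are assumptions made for this example, not a description of the actual tooling.

```python
import hashlib
from pathlib import Path

# Hex digest length -> hashlib algorithm name.
ALGO_BY_HEX_LENGTH = {40: "sha1", 64: "sha256"}


def pick_algorithm(expected_hex):
    """Return the hashlib algorithm name implied by the digest length."""
    try:
        return ALGO_BY_HEX_LENGTH[len(expected_hex)]
    except KeyError:
        raise ValueError(f"unrecognised hash length: {len(expected_hex)}")


def check_file(path, expected_hex):
    """Return 'ok', 'missing' or 'corrupt' for a single file."""
    path = Path(path)
    if not path.is_file():
        return "missing"
    digest = hashlib.new(pick_algorithm(expected_hex))
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large source tarballs are not loaded whole.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return "ok" if digest.hexdigest() == expected_hex.lower() else "corrupt"


def check_manifest(manifest_text, root="."):
    """Check every '<hash> <relative path>' line (assumed layout).

    Returns a dict mapping each relative path to its status.
    """
    results = {}
    for line in manifest_text.splitlines():
        if not line.strip():
            continue
        expected_hex, rel_path = line.split(None, 1)
        rel_path = rel_path.strip()
        results[rel_path] = check_file(Path(root) / rel_path, expected_hex)
    return results
```

Calling `check_manifest(text, root=checkout_dir)` on the metadata for one repo would give a per-file status without needing any ordering between the git sync and the binary sync; this only covers the per-file case, not verification of the whole tree.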