On Fri, Apr 22, 2016 at 12:31:31AM -0400, Chuck Anderson wrote:
On Fri, Apr 22, 2016 at 04:04:17AM +0300, Anssi Johansson wrote:
22.4.2016, 0.48, Chuck Anderson wrote:
You could also try without --delay-updates, which also triggers this requirement to know the full file list in advance.
But not using --delay-updates means that the yum repo could be in an inconsistent/nonfunctional state until the sync finishes. That isn't good for a public mirror.
If also using --delete-delay, the files to be deleted will be deleted only at the end of sync, reducing the chances of anything breaking. Although I see your point, I believe the base repositories (os, updates, extras, fasttrack etc) would end up getting synced in such a sequence that yum won't get confused. The repodata directory tends to get synced last, and it is not harmful if there are .rpm files on the mirror that are not yet referenced in the repodata.
"tends to get synced last" isn't something to rely on.
Instead, you could sync everything except the repodata, then do a 2nd sync of just the repodata.
I should correct my statement above. The first sync should sync everything except the repodata/ directories, and not do any --delete. The second sync should then do --delay-updates on the full tree. There was a long discussion about this on the Red Hat mirror list in 2009. It is probably just easier to always use --delay-updates --delete-delay so you don't have to get things "just right" with the manual methods. You might also need to increase --timeout if you are having problems with timeouts...
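For reference, the corrected two-pass approach could be sketched roughly as below. This is only an illustration: the upstream URL, the local path, the -a flag, the --timeout value, and the use of --delete-delay in the second pass are all assumptions, not something stated in the thread. The script prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical upstream and local paths; substitute your own.
SRC="rsync://upstream.example.org/centos/"
DEST="/srv/mirror/centos/"

# Dry run by default: print each command. Set RUN=1 to execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Pass 1: everything except repodata/ directories, with no --delete.
sync_without_repodata() {
    run rsync -a --exclude='repodata/' "$SRC" "$DEST"
}

# Pass 2: the full tree. --delay-updates holds updated files back and
# renames them into place at the end of the transfer.
# --delete-delay and --timeout=600 are assumptions for this sketch.
sync_full_tree() {
    run rsync -a --delay-updates --delete-delay --timeout=600 "$SRC" "$DEST"
}

# Only start the second pass if the first one succeeded.
sync_without_repodata && sync_full_tree
```

The single-pass alternative mentioned above would simply be the second command run on its own.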
Some excerpts from the 2009 discussion:
rsync pulls files in sort order, so repodata comes before many packages. If you pull fast the time interval between repodata and all the following is short and the probability of mismatch is small. But if it takes longer, or there's a lot yet to pull after repodata, it may become a problem. Given the number of client updates, even a small fraction of misses becomes a big number over time, and users will complain.
A way around --delay-updates is to have a multi-pass rsync which first transfers rpms only w/o --delete*, then transfers everything w/o --delete* (repodata including rpms to avoid any racing between data and metadata) and finally does a full rsync w/ --delete* options (again full for avoiding racing problems). That's the way I used it before rsync (on my mirror) had the delay options.
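The multi-pass scheme described above might look something like this sketch. The upstream URL, the local path, the -a flag, the include/exclude filter used to select only .rpm files, and the choice of --delete-after for the final pass are assumptions for illustration; the original excerpt only says "--delete* options". The script prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical upstream and local paths; substitute your own.
SRC="rsync://upstream.example.org/centos/"
DEST="/srv/mirror/centos/"

# Dry run by default: print each command. Set RUN=1 to execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Pass 1: .rpm files only, no deletion. The filter descends into every
# directory, keeps *.rpm, and excludes everything else.
pass_rpms_only() {
    run rsync -a --include='*/' --include='*.rpm' --exclude='*' "$SRC" "$DEST"
}

# Pass 2: everything (repodata and any rpms added meanwhile), no deletion.
pass_everything() {
    run rsync -a "$SRC" "$DEST"
}

# Pass 3: full sync again, this time removing files gone from upstream.
pass_with_delete() {
    run rsync -a --delete-after "$SRC" "$DEST"
}

# Stop at the first failing pass.
pass_rpms_only && pass_everything && pass_with_delete
```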
A 2-pass is enough, just use --delay-updates in the second one. It's much smaller so won't be a big hit and will be short enough to minimize inconsistencies.