[CentOS-mirror] rsync lock files

Fri Mar 8 19:09:30 UTC 2019
David Richardson <david.richardson at utah.edu>

I'm quoted! I'm famous! :)

There are two parts to what was quoted below:

1) Use flock to make sure you only have one update running. The lock is
    released automatically when the process holding it exits, so nothing
    lingers if the job exits abnormally. Someone else has already written
    the tool, so use their work instead of reinventing it yourself.

2) Tune vm.vfs_cache_pressure. In my experience, this makes the operations
    where you care about interactive performance (ls, du, find, rsync
    updates from upstream) MUCH faster, and has little impact on
    bulk-transfer performance.
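
For what it's worth, here is a minimal sketch of both parts, assuming a
Linux box with the util-linux flock command. The lock path and the sleep
durations are placeholders I made up for illustration, and the sleep
stands in for a long-running rsync. The sysctl lines are left as comments
because changing the value needs root.

```shell
#!/bin/sh
# Part 1: flock. LOCK is a made-up path; any writable file will do.
LOCK=/tmp/lock.example-mirror

# Hold the lock in the background for a few seconds, standing in for a
# long-running rsync.
flock -n "$LOCK" sleep 3 &
sleep 1

# A second non-blocking attempt while the first is still running.
if flock -n "$LOCK" true; then second=acquired; else second=busy; fi

# Wait for the holder to exit; the lock goes away with it.
wait
if flock -n "$LOCK" true; then third=acquired; else third=busy; fi
echo "second=$second third=$third"

# Part 2: vm.vfs_cache_pressure. Reading needs no privileges.
current=$(cat /proc/sys/vm/vfs_cache_pressure 2>/dev/null || echo unknown)
echo "vm.vfs_cache_pressure is currently: $current"
# Setting it needs root, so it is left commented out here:
#   sysctl -w vm.vfs_cache_pressure=10
# To persist across reboots, a line like this in a file under /etc/sysctl.d/
# (file name is your choice):
#   vm.vfs_cache_pressure = 10
```

The second attempt should be refused while the background job holds the
lock, and the third should succeed once that job has exited. Nothing ever
has to delete the lock file. In a crontab the same idea is a one-liner:
flock -n, then the lock file, then the command to run.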

Thanks,
DR

-- 
David Richardson <david.richardson at utah.edu>
Center for High Performance Computing
University of Utah



On Fri, 8 Mar 2019, Dattatec Mirrors wrote:

> On Friday 08/03/2019 at 04:42, Patrick Shaw wrote:
>> Hi,
>>
>> Does anyone have some examples of lock scripts/files for rsync? Due to
>> slow international performance in this country I'm regularly seeing
>> overlapping rsyncs, and I need to get that locked down ASAP.
>>
>> Patrick
>
> Hi, I'd advise against using a lock file: if your script dies for whatever
> reason, the lock file might linger around and prevent future rsyncs from
> running.
>
> Let me quote a mail from David Richardson to Fedora's mirror mailing list with
> an alternative involving flock (last part) and some more general tips that
> might be useful for mirror admins:
>
> "I had the same problem you did with rsync taking forever (and find, and ls,
> and httpd).
>
> Changing the sysctl vm.vfs_cache_pressure made a night-and-day difference
> (default is 100, I set it to 10).
>
> vm.vfs_cache_pressure controls caching of inode data versus file contents.
>
> The default (and centerpoint) is 100. Values less than 100 favor inode data,
> values greater than 100 favor file contents. Do NOT set it to zero.
> My understanding is that if you set it to zero, bad things will happen and
> you will eventually OOM.
>
> With this change, all my metadata stays in cache. I have two million inodes in
> use, and this setting costs me about 4GB of RAM. A no-change Fedora rsync
> takes 20 seconds for 425GB of content in 950k files (I exclude development
> and SRPMS).
>
> I use --delete to handle the .~tmp~ directories. If one of my runs aborts, the
> next run will clean up after it.
>
> My script is basically rsync wrapped with flock (rather than trying to cobble
> together a lock-file system).
>
> The 200 in the flock command (and again at the end) is just a filehandle
> number; it doesn't really matter what it is, as long as nothing else uses it.
> The file name at the end also doesn't much matter. The file needs to be
> writeable (or creatable if it doesn't exist), but nothing is written to it.
> There's also no need to remove it afterwards.
>
>
> ### SCRIPT BEGINS ###
> (
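> # Take an exclusive, non-blocking lock on file descriptor 200, which the
> # redirection after the closing parenthesis opens on the lock file.
> flock -n 200 || { echo "Script is already running. Aborting." ; exit 1 ; }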
> # ... commands executed under lock ...
>
> /usr/bin/rsync --progress -aHv --update --delete --delete-excluded \
> --delete-after --delay-updates rsync://your/source /your/dest/path
>
> /usr/bin/report_mirror
>
> ) 200>/tmp/lock.update-fedora
> ### SCRIPT ENDS ###
>
> Hope you find this useful!"
>
>
> BR,