[CentOS] Rsync/SSH automation problem?

Fri May 1 16:22:00 UTC 2009
nate <centos at linuxpowered.net>

Gordon Messmer wrote:
> Kai Schaetzl wrote:
>>> The second thing you will notice, eventually, is that rsync over ssh
>>> under Cygwin is unreliable.
>>
>> You mean *starting* an rsync operation on that side? Using rsync over ssh
>> essentially uses rsync on *both* ends. So, it's running under Cygwin,
>> anyway, which makes your statement a bit confusing.
>
> What I mean is that if you launch rsync with something like:
>
> rsync -e ssh server:/path /path
>
> then rsync uses a non-blocking (I said blocking earlier, which was a
> mistake) socket pair to communicate with ssh.  This may trigger a bug in
> cygwin which can cause the application to hang.

It's been 7 years since I used rsync over ssh to back up Windows boxes,
though it worked pretty well for me back then.

One thing to try if rsync hangs on you is the --timeout option, which
should cause rsync to abort if no data is transferred within X seconds.
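
For scripted use, a timed-out run can be told apart from other failures
by its exit status: the rsync man page lists 30 as "Timeout in data
send/receive". A minimal sketch, assuming a helper name
(describe_rsync_status) that is made up for illustration, with the host
and paths as placeholders:

```shell
# describe_rsync_status STATUS
# Hypothetical helper: maps an rsync exit status to a short label.
describe_rsync_status() {
    case "$1" in
        0)  echo "ok" ;;
        30) echo "timeout" ;;   # rsync: "Timeout in data send/receive"
        *)  echo "error" ;;
    esac
}

# Example (placeholder host and paths, 10 minute I/O timeout):
# rsync -a --timeout=600 -e ssh server:/path /path
# describe_rsync_status $?
```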

I wrote a fairly big rsync log retrieval system that has about 90
systems uploading more than a TB of data a day to an NFS cluster. Sometimes
the system is really busy, so rather than have rsync hang for a really
long period of time I just have it abort after 10 minutes of no
activity. I also put retry logic in the rsync scripts themselves, so
they attempt to send data up to 20 times per hour per system (new data
is made available to upload once an hour). Of course this is entirely
Linux based, and I am using rsync over HPN-SSH with encryption disabled
for higher performance.

Sample rsync command line that I use:
rsync -ae "/usr/bin/hpnssh -v -o TcpRcvBufPoll=yes -o NoneEnabled=yes -o NoneSwitch=yes" \
    --timeout=600 \
    --log-format="[%p] %t %o %f (%l/%b)" \
    --files-from=/home/logrsync/conf/rsync_log_file_list.20090501_090201 \
    /local_dir/ 10.254.213.203:/remote/dir/ \
    1>>/home/logrsync/logs/server_name_rsync_log_transfer_20090501_090201.log 2>&1
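
The retry logic mentioned above is not shown in the post, so the
following is only a sketch of one way such a wrapper could look. The
function name run_with_retry, the 180-second pause, and the host and
paths in the example are all assumptions. The pause was picked so that
20 attempts roughly fit in an hour, ignoring the time each attempt
takes.

```shell
# run_with_retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND until it exits 0 or MAX_ATTEMPTS
# is reached. Returns 0 on success, otherwise the exit status of the
# last failed attempt.
run_with_retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        else
            status=$?
        fi
        echo "attempt $attempt/$max_attempts failed with status $status" >&2
        if [ "$attempt" -ge "$max_attempts" ]; then
            return "$status"
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example: up to 20 attempts with a 180-second pause between them
# (placeholder host and paths).
# run_with_retry 20 180 rsync -a --timeout=600 -e ssh /local_dir/ server:/remote/dir/
```

Combined with --timeout, a hung transfer is aborted by rsync itself and
then simply counts as one failed attempt.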

I just finished another rsync deployment system that downloads data to
those same servers, with built-in parallelism for increased throughput
over the WAN.

I currently have 6 rsync/ssh systems that do the file serving, which
are load balanced behind a BigIP. The main bottleneck is the Cisco
firewall, which can only do 1.2Gbps of throughput.

nate