[CentOS] remote backup

Thu Jun 16 17:50:31 UTC 2016
Benjamin Smith <lists at benjamindsmith.com>

On Thursday, June 09, 2016 05:18:03 PM Alessandro Baggi wrote:
> Thank you for your reply, and sorry for the late response.
> 
> My only need is to get a copy of a large dataset and make sure that it is
> not broken after the transfer. After the transfer, this data will be stored
> on a local backup server where there is a Bacula installation.
> 
> For the file transfer, to save time and bandwidth, I will use rsync, but I
> don't know how to check whether those files have been corrupted.
> 
> How can I perform this check?
> I can compute an md5 for each file, but for a great number of files this
> can be a problem.

Is there any chance you could switch to running ZFS or maybe BTRFS? They are
ridiculously more efficient than rsync at sending reliable, incremental
updates to a file system over a low-bandwidth link, and the load of rsync
"at scale" can be enormous.

In our case, with about half a billion files, doing rsync over a local gigabit
LAN took well over 24 hours. The discovery stage of rsync alone accounted for
nearly all of the overhead, due to IOPS limitations.

Switching our primary backup method to ZFS, using send/receive of incremental
snapshots, cut the time to back up and replicate to under 30 minutes, with no
significant change in server load.
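
To give a rough idea of what one replication cycle looks like, here is a
minimal Python sketch that only builds the command lines; it does not run
them. The dataset, snapshot and host names (tank/data, monday, tuesday,
backup1, backup/data) are made up for illustration and are not our real
setup. It assumes the receiving side already holds the previous snapshot; the
very first transfer would need a full "zfs send" without -i.

```python
import shlex


def incremental_send_commands(dataset, prev_snap, new_snap,
                              remote_host, remote_dataset):
    """Build the shell commands for one incremental ZFS replication cycle.

    All names passed in are hypothetical examples. Nothing is executed here;
    the function only returns the command strings so they can be inspected.
    """
    # Step 1: take a new snapshot of the local dataset.
    snapshot = ["zfs", "snapshot", f"{dataset}@{new_snap}"]

    # Step 2: send only the blocks that changed between the two snapshots.
    send = ["zfs", "send", "-i",
            f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]

    # Step 3: the backup server applies the stream to its own copy.
    receive = ["ssh", remote_host, "zfs", "receive", remote_dataset]

    snapshot_cmd = shlex.join(snapshot)
    pipeline_cmd = shlex.join(send) + " | " + shlex.join(receive)
    return snapshot_cmd, pipeline_cmd


if __name__ == "__main__":
    for cmd in incremental_send_commands(
            "tank/data", "monday", "tuesday", "backup1", "backup/data"):
        print(cmd)
```

Because only changed blocks are sent, there is no per-file discovery pass,
which is where rsync spent nearly all of its time for us.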

And don't let the name "incremental snapshots" fool you - the end result is 
identical to doing a full backup / copy, with all files being verified as binary 
perfect as of the moment the snapshot was made. 

Really, if you can do this and you care about your data, you want to do this, 
even if you don't know it yet. The learning curve is significant but the 
results are well worth it.
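
If switching file systems is not an option, the per-file check you asked
about can be sketched in Python as below. This is only an illustration, not
something we run: it assumes you can build a manifest on both the source and
the backup server and compare the two. The function names are my own, and it
uses SHA-256 rather than md5. With a very large number of files it will hit
the same IOPS problem described above, since every file has to be read in
full. rsync also has a --checksum (-c) option that compares files by checksum
instead of by size and modification time, at a similar cost.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            manifest[os.path.relpath(full_path, root)] = file_digest(full_path)
    return manifest


def compare_manifests(source, copy):
    """Return (missing, changed): paths absent from the copy, and paths
    present in both whose digests differ."""
    missing = sorted(set(source) - set(copy))
    changed = sorted(path for path in source
                     if path in copy and source[path] != copy[path])
    return missing, changed
```

Build the manifest once on each machine, ship the smaller one across, and
anything listed in "changed" was damaged or modified in transit.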