On Thursday, June 09, 2016 05:18:03 PM Alessandro Baggi wrote:
> Thank you for your reply, and sorry for the late response.
> My need is simply to get a copy of a large dataset and make sure it is not corrupted after the transfer. Once transferred, the data will be stored on a local backup server running a Bacula installation.
> For the file transfer I will use rsync to save time and bandwidth, but I don't know how to check whether the files were corrupted in transit.
> How can I perform this check? I could compute an MD5 for each file, but for a large number of files that becomes a problem.
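To answer the direct question first: rsync itself can do a verification pass. Here is a minimal sketch, assuming the backup server is reachable over SSH as "backuphost" (a placeholder name):

    # initial transfer
    rsync -a /data/ backuphost:/backup/data/

    # verification pass: -c compares full-file checksums on both ends,
    # -n makes it a dry run, -i itemizes anything that still differs
    rsync -acni /data/ backuphost:/backup/data/

If the second command prints nothing, every file's content matched. But be warned: -c reads every byte of every file on both sides, so at large scale it runs straight into the IOPS wall described below.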
Is there any chance you could switch to running ZFS or maybe BTRFS? They are ridiculously more efficient at sending reliable, incremental updates to a file system over a low-bandwidth link, and the load of rsync "at scale" can be enormous.
In our case, with about half a billion files, an rsync over a local Gb LAN took well over 24 hours; the discovery stage alone accounted for nearly all of that time due to IOPS limitations.
Switching our primary backup method to ZFS with send/receive of incremental snapshots cut the backup/replication time to under 30 minutes, with no significant change in server load.
And don't let the name "incremental snapshots" fool you: the end result is identical to a full backup/copy, with every file verified as binary-perfect as of the moment the snapshot was taken.
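To make that concrete, here is a minimal sketch of the snapshot workflow, assuming a source dataset named tank/data and a destination named backup/data (both placeholders), with the same "backuphost" reachable over SSH:

    # one-time full send: snapshot the dataset and replicate it
    zfs snapshot tank/data@snap1
    zfs send tank/data@snap1 | ssh backuphost zfs receive backup/data

    # every run after that: new snapshot, send only the changed blocks
    zfs snapshot tank/data@snap2
    zfs send -i tank/data@snap1 tank/data@snap2 | ssh backuphost zfs receive backup/data

The incremental stream carries only the blocks that changed between the two snapshots, yet once the receive completes, backup/data is byte-identical to tank/data as of snap2; if the stream is damaged in transit, the receive fails rather than leaving silent corruption behind. (If the destination has been modified between runs, zfs receive -F rolls it back to the last common snapshot first.)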
Really, if you can do this and you care about your data, you want to do it, even if you don't know it yet. The learning curve is significant, but the results are well worth it.