On 02/04/20 21:14, Karl Vogel wrote:
> [Replying privately because my messages aren't making it to the list]
>
>>> In a previous message, Alessandro Baggi said:
> A> Bacula works without any problem, well tested, solid but complex to
> A> configure. Tested on a single server (with volumes on disk) and a
> A> full backup of 810gb (~150000 files) took 6,30 hours (too much).
>
> For a full backup, I'd use something like "scp -rp". Anything else
> has overhead you don't need for the first copy.
>
> Also, pick a good cipher (-c) for the ssh/scp commands -- it can improve
> your speed by an order of magnitude. Here's an example where I copy
> my current directory to /tmp/bkup on my backup server:
>
>   Running on: Linux x86_64
>   Thu Apr 2 14:48:45 2020
>
>   me% scp -rp -c aes128-gcm@openssh.com -i $HOME/.ssh/bkuphost_ecdsa \
>         . bkuphost:/tmp/bkup
>
>   Authenticated to remote-host ([remote-ip]:22).
>   ansible-intro         100%   16KB  11.3MB/s   00:00 ETA
>   nextgov.xml           100%   27KB  21.9MB/s   00:00 ETA
>   building-VM-images    100% 1087     1.6MB/s   00:00 ETA
>   sort-array-of-hashes  100% 1660     2.5MB/s   00:00 ETA
>   ...
>   ex1                   100%  910     1.9MB/s   00:00 ETA
>   sitemap.m4            100% 1241     2.3MB/s   00:00 ETA
>   contents              100% 3585     5.5MB/s   00:00 ETA
>   ini2site              100%  489   926.1KB/s   00:00 ETA
>   mkcontents            100% 1485     2.2MB/s   00:00 ETA
>
>   Transferred: sent 6465548, received 11724 bytes, in 0.4 seconds
>   Bytes per second: sent 18002613.2, received 32644.2
>
>   Thu Apr 02 14:48:54 2020
>
> A> scripted rsync. Simple, through ssh protocol and private key. No agent
> A> required on target. I use file level deduplication using hardlinks.
>
> I avoid block-level deduplication as a general rule -- ZFS memory
> use goes through the roof if you turn that on.
>
> rsync can do the hardlinks, but for me it's been much faster to create
> a list of SHA1 hashes and use a perl script to link the duplicates.
> I can send you the script if you're interested.
>
> This way, you're not relying on the network for anything other than the
> copies; everything else takes place on the local or backup system.
>
> A> Using a scripted rsync is the simpler way but there is something that
> A> could be leaved out by me (or undiscovered error). Simple to restore.
>
> I've never had a problem with rsync, and I've used it to back up Linux
> workstations with ~600Gb or so. One caveat -- if you give it a really
> big directory tree, it can get lost in the weeds. You might want to do
> something like this:
>
>   1. Make your original backup using scp.
>
>   2. Get a complete list of file hashes on your production systems
>      using SHA1 or whatever you like.
>
>   3. Whenever you do a backup, get a (smaller) list of modified files
>      using something like "find ./something -newer /some/timestamp/file"
>      or just making a new list of file hashes and comparing that to the
>      original list.
>
>   4. Pass the list of modified files to rsync using the "--files-from"
>      option so it doesn't have to walk the entire tree again.
>
> Good luck!
>
> --
> Karl Vogel / vogelke at pobox.com / I don't speak for the USAF or my company
>
> The best setup is having a wife and a mistress. Each of them will assume
> you're with the other, leaving you free to get some work done.
>                --programmer with serious work-life balance issues

Hi Karl,

thank you for your answer. I'm trying scripted rsync over ssh with a faster
cipher as you suggested, and the transfer of a 10 GB test set does seem
faster than with the default cipher (129 sec vs. 116 sec using aes128-gcm;
I tested this multiple times).
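For reference, this is roughly the command I'm timing; the host name, key
and paths are just placeholders from my test setup:

  # only the cipher passed to ssh with -c changes between runs
  rsync -a -e "ssh -c aes128-gcm@openssh.com -i $HOME/.ssh/backup_key" \
      /data/ backuphost:/backups/data/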
Now I will run it against the entire dataset and see how much benefit I gain.
In the meantime, what do you think about Bacula as a backup solution?

Thank you in advance.
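P.S. For the incremental step you describe in points 3 and 4, I was thinking
of something along these lines (an untested sketch; /data, the timestamp file
and the destination are made-up names):

  # list the files changed since the last run
  find /data -newer /var/backups/last-run -type f > /tmp/changed.list

  # send only those files, then record the new reference time
  rsync -a --files-from=/tmp/changed.list / backuphost:/backups/ &&
      touch /var/backups/last-run

If I understand --files-from correctly, the paths in the list are taken
relative to the source argument ("/" here), so the directory layout is
preserved on the backup side.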