[CentOS] C8 and backup solution

Fri Apr 3 08:44:03 UTC 2020
Alessandro Baggi <alessandro.baggi at gmail.com>

On 02/04/20 21:14, Karl Vogel wrote:
> [Replying privately because my messages aren't making it to the list]
>
>>> In a previous message, Alessandro Baggi said:
> A> Bacula works without any problems: well tested, solid, but complex to
> A> configure. Tested on a single server (with volumes on disk), a full
> A> backup of 810 GB (~150,000 files) took 6,30 hours (too much).
>
> For a full backup, I'd use something like "scp -rp". Anything else
> has overhead you don't need for the first copy.
>
> Also, pick a good cipher (-c) for the ssh/scp commands -- it can improve
> your speed by an order of magnitude. Here's an example where I copy
> my current directory to /tmp/bkup on my backup server:
>
> Running on: Linux x86_64
> Thu Apr 2 14:48:45 2020
>
> me% scp -rp -c aes128-gcm@openssh.com -i $HOME/.ssh/bkuphost_ecdsa \
> . bkuphost:/tmp/bkup
>
> Authenticated to remote-host ([remote-ip]:22).
> ansible-intro 100% 16KB 11.3MB/s 00:00 ETA
> nextgov.xml 100% 27KB 21.9MB/s 00:00 ETA
> building-VM-images 100% 1087 1.6MB/s 00:00 ETA
> sort-array-of-hashes 100% 1660 2.5MB/s 00:00 ETA
> ...
> ex1 100% 910 1.9MB/s 00:00 ETA
> sitemap.m4 100% 1241 2.3MB/s 00:00 ETA
> contents 100% 3585 5.5MB/s 00:00 ETA
> ini2site 100% 489 926.1KB/s 00:00 ETA
> mkcontents 100% 1485 2.2MB/s 00:00 ETA
>
> Transferred: sent 6465548, received 11724 bytes, in 0.4 seconds
> Bytes per second: sent 18002613.2, received 32644.2
>
> Thu Apr 02 14:48:54 2020
>
> A> Scripted rsync. Simple, over the ssh protocol with a private key. No
> A> agent required on the target. I use file-level deduplication with
> A> hardlinks.
>
> I avoid block-level deduplication as a general rule -- ZFS memory
> use goes through the roof if you turn that on.
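>
> For what it's worth, dedup can be checked and switched off per dataset
> (the pool/dataset names here are only examples):
>
>   zfs get dedup tank
>   zfs set dedup=off tank/backups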
>
> rsync can do the hardlinks, but for me it's been much faster to create
> a list of SHA1 hashes and use a perl script to link the duplicates.
> I can send you the script if you're interested.
>
> This way, you're not relying on the network for anything other than the
> copies; everything else takes place on the local or backup system.
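>
> The idea, in shell rather than perl, is roughly this (quick untested
> sketch, not the exact script; it assumes filenames without spaces):
>
>   cd /backup/current
>   # one "hash  path" line per file
>   find . -type f -exec sha1sum {} + | sort > /tmp/hashes
>   # replace each later copy of a hash with a hard link to the first one
>   awk 'seen[$1] { printf "ln -f \"%s\" \"%s\"\n", seen[$1], $2; next }
>        { seen[$1] = $2 }' /tmp/hashes | sh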
>
> A> Using a scripted rsync is the simpler way, but there may be something
> A> I've left out (or an undiscovered error). Simple to restore.
>
> I've never had a problem with rsync, and I've used it to back up Linux
> workstations with ~600 GB or so. One caveat -- if you give it a really
> big directory tree, it can get lost in the weeds. You might want to do
> something like this:
>
> 1. Make your original backup using scp.
>
> 2. Get a complete list of file hashes on your production systems
> using SHA1 or whatever you like.
>
> 3. Whenever you do a backup, get a (smaller) list of modified files
> using something like "find ./something -newer /some/timestamp/file"
> or just making a new list of file hashes and comparing that to the
> original list.
>
> 4. Pass the list of modified files to rsync using the "--files-from"
> option so it doesn't have to walk the entire tree again (see the
> sketch below).
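>
> Steps 3 and 4 could look roughly like this (untested sketch; paths,
> host names and the marker file are just examples):
>
>   # mark the start of this run first, so files modified while the
>   # backup is running get picked up next time
>   touch /var/backups/last-run.new
>
>   # everything changed since the previous run
>   find /data -type f -newer /var/backups/last-run > /tmp/changed.list
>
>   # copy just those files; --files-from paths are read relative to "/"
>   rsync -a -e 'ssh -c aes128-gcm@openssh.com' \
>         --files-from=/tmp/changed.list / bkuphost:/backup/current/
>
>   # promote the marker for the next run
>   mv /var/backups/last-run.new /var/backups/last-run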
>
> Good luck!
>
> --
> Karl Vogel / vogelke at pobox.com / I don't speak for the USAF or my company
>
> The best setup is having a wife and a mistress. Each of them will assume
> you're with the other, leaving you free to get some work done.
> --programmer with serious work-life balance issues

Hi Karl,

thank you for your answer. I'm trying a scripted rsync over ssh with a
faster cipher as you suggested, and a 10 GB transfer does seem faster
than with the default cipher (129 sec with the default vs 116 sec with
aes128-gcm; I tested this multiple times). Next I'll run it against the
entire dataset and see how much benefit I gain.
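
For reference, the comparison is just timing the same 10 GB transfer,
once with the default cipher and once with the forced one (host and
paths are placeholders; emptying the target between runs keeps the two
passes comparable):

    time rsync -a /testset/ bkuphost:/backup/testset/
    time rsync -a -e 'ssh -c aes128-gcm@openssh.com' \
         /testset/ bkuphost:/backup/testset/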

In the meantime, what do you think about Bacula as a backup solution?

Thank you in advance.