[CentOS] KVM vs. incremental remote backups

Wed Mar 31 19:35:31 UTC 2021
Gionatan Danti <g.danti at assyoma.it>

On 2021-03-31 14:41, Nicolas Kovacs wrote:
> Hi,
> 
> Up until recently I've hosted all my stuff (web & mail) on a handful of 
> bare
> metal servers. Web applications (WordPress, OwnCloud, Dolibarr, GEPI,
> Roundcube) as well as mail and a few other things were hosted mostly on 
> one big
> machine.
> 
> Backups for this setup were done using Rsnapshot, a nifty utility that 
> combines
> Rsync over SSH and hard links to make incremental backups.
> 
> This approach has become problematic, for several reasons. First, web
> applications have increasingly specific and sometimes mutually 
> exclusive
> requirements. And second, last month I had a server crash, and even 
> though I
> had backups for everything, this meant quite some offline time.
> 
> So I've opted to go for KVM-based solutions, with everything split up 
> over a
> series of KVM guests. I wrapped my head around KVM, played around with 
> it (a
> lot) and now I'm more or less ready to go.
> 
> One detail is nagging me though: backups.
> 
> Let's say I have one VM that handles only DNS (base installation + 
> BIND) and
> one other VM that handles mail (base installation + Postfix + Dovecot).
> 
> Under the hood that's two QCOW2 images stored in 
> /var/lib/libvirt/images.
> 
> With the old "bare metal" approach I could perform remote backups using 
> Rsync,
> so only the difference between two backups would get transferred over 
> the
> network. Now with KVM images it looks like every day I have to transfer 
> the
> whole image again. As soon as some images have lots of data on them 
> (say, 100
> GB for a small OwnCloud server), this quickly becomes unmanageable.
> 
> I googled around for quite some time for "KVM backup best practices" and
> was a bit puzzled to find many folks asking the same question and getting
> no real answer, at least not without having to jump through flaming hoops.
> 
> Any suggestions ?
> 
> Niki

Hi Nicolas,
the simplest approach would be to use a filesystem which natively 
supports send/recv to another host.

You may be tempted to use btrfs, but having tested it I strongly advise 
against it: the image files will fragment horribly and performance will 
be bad even with CoW disabled (which, by the way, is automatically 
re-enabled by snapshots).
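
For context, CoW is normally disabled per-directory via chattr; a 
minimal sketch (the path is the default libvirt one, and note the 
attribute only applies to files created after it is set):

  # mark the image directory NoCOW; only newly created files inherit it
  chattr +C /var/lib/libvirt/images
  lsattr -d /var/lib/libvirt/images   # the 'C' flag should now be listed

Even so, any btrfs snapshot of those files makes their data CoW again, 
which is why the fragmentation comes back.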

I currently just use ZFS on Linux and it works very well. However, using 
it on CentOS is not trouble-free, and it has its own CLI and specific 
issues to be aware of; so I understand if you don't want to go down this 
rabbit hole.
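
To give an idea of what send/recv buys you, here is a minimal sketch of 
an incremental replication (pool, dataset and host names are examples, 
and it assumes an initial full send was already done):

  # snapshot the dataset holding the qcow2 images
  zfs snapshot tank/images@2021-03-31
  # send only the delta against yesterday's snapshot to the backup host
  zfs send -i tank/images@2021-03-30 tank/images@2021-03-31 | \
      ssh backup zfs recv tank/images

Only the blocks changed between the two snapshots cross the network, 
which is exactly the rsync-like behaviour you are after, but at the 
block level.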

The next best thing I can suggest is to use lvmthin and XFS, with 
efficient block-level copies done to another host via tools such as 
bdsync [1] or blocksync [2] (of which I forked an advanced version). On 
the receiving host, you should (again) use lvmthin and XFS with periodic 
snapshots.
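
A rough sketch of that setup (volume group, sizes and host names are 
placeholders; check the bdsync README for the exact invocation):

  # thin pool plus a thin volume backing the VM, on both hosts
  lvcreate --type thin-pool -L 500G -n pool0 vg0
  lvcreate --thin -V 100G -n vm-mail vg0/pool0

  # generate a binary diff against the remote copy, then apply it there
  bdsync "ssh root@backup bdsync --server" \
      /dev/vg0/vm-mail /dev/vg0/vm-mail | gzip > /tmp/vm-mail.bdsync.gz
  scp /tmp/vm-mail.bdsync.gz root@backup:/tmp/
  ssh root@backup \
      "gzip -d < /tmp/vm-mail.bdsync.gz | bdsync --patch=/dev/vg0/vm-mail"

  # on the backup host, keep periodic thin snapshots of the received copy
  lvcreate -s -n vm-mail_2021-03-31 vg0/vm-mail

bdsync checksums both devices block by block, so only the changed blocks 
end up in the diff file.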

Finally, I would leave the current rsnapshot backups in place: you will 
simply copy from a virtual machine rather than from a bare-metal host. I 
found rsnapshot really useful and reliable, so I suggest continuing to 
use it even if efficient block-level backups are taken.
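
For what it's worth, pointing rsnapshot at the guests only takes one 
backup line per VM in rsnapshot.conf (host names and paths are examples; 
fields must be tab-separated):

  retain  daily   7
  retain  weekly  4
  backup  root@vm-dns:/etc/         vm-dns/
  backup  root@vm-mail:/etc/        vm-mail/
  backup  root@vm-mail:/var/vmail/  vm-mail/

That way you keep easy file-level restores alongside the block-level 
image copies.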

Just my 2 cents.
Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8