On 3/31/21 12:50 PM, Nicolas Kovacs wrote:
> The problem with using Rsnapshot on the VM's filesystems rather than backing up the whole VM is the time it takes to restore all the mess.
All the same, backing up each VM's filesystem from within the VM is the best way to back them up with rsnapshot.
rsnapshot's approach of hard links and rsync necessarily means that whenever even a single byte of a file changes, the new copy in the backup set consumes the file's full size on disk. If you're backing up VM images, you're giving up all of the efficiency that rsnapshot was designed for.
I'd note that your original message said that you were transferring the entire VM image. That *shouldn't* be the case. rsync should be transferring only the changed blocks over the network, but on disk you'll end up with an entirely new file.
There are a few ways you can work around that with rsnapshot, but I'm not aware of an easy solution.
One option would be to use btrfs as your backup volume and write wrapper scripts for cmd_cp and cmd_rm. Rather than the default behavior, you'd want to create a snapshot (for cmd_cp) and remove snapshots (for cmd_rm).
The other option that comes to mind would be to use either XFS or btrfs as your backup volume and write a wrapper script for cmd_cp. This would be simpler. The script would just be:
#!/bin/sh
exec cp --reflink=always "$@"
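If you pursued either option, you'd want to modify the rsnapshot rsync_long_args setting and add --inplace, so that rsync rewrites only the changed blocks of the existing file instead of building a new copy and renaming it over the old one.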
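As a rough illustration, the rsnapshot.conf changes for the reflink approach might look like the excerpt below. The wrapper path is a hypothetical location, and the rsync_long_args value is rsnapshot's documented default with --inplace appended, so adjust it if you have already customized yours. Bear in mind that rsnapshot passes its own options to cmd_cp (it normally uses cp to create hard links), so I'd verify on a scratch directory that the rotated copies really are reflinks rather than hard links before relying on --inplace.

```
# rsnapshot.conf excerpt; fields must be separated by tabs, not spaces.
# /usr/local/bin/cp-reflink is a hypothetical path for the wrapper script.
cmd_cp	/usr/local/bin/cp-reflink
# rsnapshot's default long args, plus --inplace.
rsync_long_args	--delete --numeric-ids --relative --delete-excluded --inplace
```

Running "rsnapshot configtest" afterwards should catch syntax mistakes such as spaces where tabs are expected.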
Those two approaches would take advantage of CoW filesystem capabilities to conserve disk space. If you decide to pursue them, bear in mind that "du" will report that each of the resulting VM images is full size, even though that's not really the case. The only way (that I know of) to accurately measure disk use will be to run "df" before and after a backup, and compare the disk use of the filesystem.
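The before-and-after "df" comparison can be scripted. This is only a minimal sketch: the function names are my own, it assumes the POSIX "df -Pk" output layout (used space in the third column) and a device name without spaces, and anything else writing to the same filesystem during the backup will skew the number.

```sh
#!/bin/sh
# used_kb MOUNTPOINT
# Print the used space of the filesystem holding MOUNTPOINT, in 1 KiB blocks.
used_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# measure_growth MOUNTPOINT COMMAND [ARGS...]
# Run COMMAND and print how many KiB the filesystem grew (or shrank) by.
# The command's own output is sent to stderr so stdout holds only the number.
measure_growth() {
    mount_point=$1
    shift
    before=$(used_kb "$mount_point")
    "$@" >&2
    after=$(used_kb "$mount_point")
    echo "$((after - before))"
}
```

With the backup volume mounted at, say, /backup (a hypothetical mount point), "measure_growth /backup rsnapshot daily" would print the change in used space caused by that run.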