[CentOS] Remote backup of server

Wed Sep 9 12:19:59 UTC 2009
Michael Kress <kress at hal.saar.de>

Hi, you're looking for a solution that makes snapshots with hardlinks:
1) use rsync --delete over ssh
2) use cp -al to create generations
3) rotate the generations daily, just with mv
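
The three steps could be scripted roughly like this. It's only a sketch:
the function name rotate_generations, the /backup path, the host
server.example.com and the count of 7 generations are placeholders I made
up for illustration. It rotates first and runs rsync afterwards, in the
same order as the day-by-day example further down.

```shell
#!/bin/sh
# rotate_generations ROOT KEEP   (hypothetical helper, not a standard tool)
# Keeps KEEP generations named daily.0 .. daily.(KEEP-1) below ROOT.
rotate_generations() {
    root=$1
    keep=$2
    # refuse to run with fewer than 2 generations, otherwise the
    # "oldest" generation would be daily.0 itself
    [ "$keep" -ge 2 ] || return 1
    oldest=$((keep - 1))

    # drop the oldest generation
    rm -rf "$root/daily.$oldest"

    # shift daily.1 -> daily.2, daily.2 -> daily.3, ... (highest first)
    i=$((oldest - 1))
    while [ "$i" -ge 1 ]; do
        if [ -d "$root/daily.$i" ]; then
            mv "$root/daily.$i" "$root/daily.$((i + 1))"
        fi
        i=$((i - 1))
    done

    # hardlink copy of the current generation (GNU cp)
    if [ -d "$root/daily.0" ]; then
        cp -al "$root/daily.0" "$root/daily.1"
    fi
}

# Typical daily run (host and paths are placeholders):
#   rotate_generations /backup 7
#   rsync -a --delete -e ssh root@server.example.com:/ /backup/daily.0/
```

Rotating before the rsync means daily.1 always holds yesterday's state,
even if today's transfer fails halfway.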

The generations use almost no additional disk space. Only changes in the
file system (i.e. added or modified files) consume space, because all
other files are just hardlinks. When a file has changed, rsync replaces
the hardlink in the current generation of your backup with a new file. The
hardlinks in the older generations stay intact, and so does the older
physical file. Remember, a file stays "alive" as long as there's at least
one hardlink pointing to it. This mechanism is the answer to your worries:
if data corruption occurs in one of your files, your backups from the past
'n' days will still contain a good version of that file, where 'n' is the
number of generations you keep.

An example:
day #1:
=======
* first rsync happens, lots of files will be created
daily.0/abc   (hardlink to file abc with inode 2235)
daily.0/def   (hardlink to file def with inode 2249)
daily.0/ghi   (hardlink to file ghi with inode 3456)

day #2:
=======
* do a 'cp -al daily.0 daily.1'
* do the new rsync on daily.0, modified file "abc" coming over
* the hardlink daily.1/abc stays untouched (and so does the old file)
* the hardlink daily.0/abc is a new one, as the file is a new one
daily.0/abc   (NOTE: hardlink to file abc with inode 8877 ! )
daily.0/def   (hardlink to file def with inode 2249)
daily.0/ghi   (hardlink to file ghi with inode 3456)
daily.1/abc   (hardlink to file abc with inode 2235)
daily.1/def   (hardlink to file def with inode 2249)
daily.1/ghi   (hardlink to file ghi with inode 3456)

Each of the files def and ghi consumes disk space only once, whereas abc
from daily.0 and abc from daily.1 are different files with different
inodes, so together they of course consume twice the disk space.
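
If you want to see this for yourself without touching a real backup, here
is a small local experiment. It doesn't call rsync at all. It only imitates
what rsync does by default, namely writing the new version to a temporary
file and renaming it over the old name. The temp directory and the file
contents are made up for the demo.

```shell
#!/bin/sh
# throwaway playground, nothing here touches a real backup
demo=$(mktemp -d)
mkdir "$demo/daily.0"
echo "version 1" > "$demo/daily.0/abc"
echo "unchanged" > "$demo/daily.0/def"

# day #2: create the generation with hardlinks
cp -al "$demo/daily.0" "$demo/daily.1"

# imitate rsync's default update: new temp file, then rename over abc
echo "version 2" > "$demo/daily.0/.abc.tmp"
mv "$demo/daily.0/.abc.tmp" "$demo/daily.0/abc"

# first column is the inode number: abc differs between the two
# generations, def is the same inode in both
ls -li "$demo/daily.0" "$demo/daily.1"

# remove the playground when done:
#   rm -rf "$demo"
```

Note that this only holds as long as rsync really replaces the file. With
rsync's --inplace option the existing file is rewritten directly, and then
the change would show up in every generation that links to it.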

You may secure your ssh connection even more by not logging in as root,
i.e. by using a non-privileged user. In that case you'd need a sudo entry
(via 'visudo') allowing the non-privileged user to run /usr/bin/rsync as
the superuser, i.e. on EVERY file in the system. The sudo line would be:
backupuser ALL=(root) NOPASSWD: /usr/bin/rsync
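
On the backup machine, the pull could then look something like the sketch
below. rsync's --rsync-path option tells it which command to start on the
remote side, so here it starts rsync through sudo. The function name, the
host name and the RSYNC variable (only there so you can do a dry run with
RSYNC=echo) are my own placeholders.

```shell
#!/bin/sh
# pull_as_backupuser SOURCE TARGET   (hypothetical helper)
# Logs in as the unprivileged user and runs the remote rsync via sudo.
pull_as_backupuser() {
    # RSYNC can be overridden (e.g. RSYNC=echo) to see the command
    # without running it; by default the real rsync is used
    ${RSYNC:-rsync} -a --delete -e ssh \
        --rsync-path="sudo /usr/bin/rsync" "$1" "$2"
}

# Example call (placeholders):
#   pull_as_backupuser backupuser@server.example.com:/ /backup/daily.0/
```

One thing to check: if the sudoers file on the server contains
"Defaults requiretty", sudo will probably refuse to run from a
non-interactive ssh session, so that setting may have to be relaxed for
backupuser.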

Of course you should mount the partition you're backing up to read-write
only when you have to. Otherwise it should stay unmounted or at least
mounted read-only. The machine you're backing up to should be a
single-user machine: user id 501 on machine A, named 'john', may belong to
a different user on machine B, there named 'bill'. So if bill logs into
machine B (and has user id 501), he'll be able to see the files of user
'john' if the backup partition is readable. (That's also why you should
keep it unmounted.) Data in backups may contain e.g. mysql passwords, smtp
passwords, etc.
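
For the "read-write only when you have to" part, a wrapper along these
lines might do. It assumes the backup partition is already mounted
read-only at some mount point (here /backup, a placeholder) and has to be
run as root. The function name and the MOUNT variable (for dry runs with
MOUNT=echo) are made up.

```shell
#!/bin/sh
# with_rw_mount MOUNTPOINT COMMAND [ARGS...]   (hypothetical helper)
# Remounts MOUNTPOINT read-write, runs COMMAND, then remounts read-only
# again, whether COMMAND succeeded or not.
with_rw_mount() {
    mnt=$1
    shift
    ${MOUNT:-mount} -o remount,rw "$mnt" || return 1
    "$@"
    status=$?
    ${MOUNT:-mount} -o remount,ro "$mnt"
    # hand back the exit status of COMMAND
    return $status
}

# Example call (placeholders):
#   with_rw_mount /backup rsync -a --delete -e ssh \
#       root@server.example.com:/ /backup/daily.0/
```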

That's not THE ULTIMATE solution, but it works for me and it seems to be
quite efficient. I think the main advantage of that solution is that
you're independent of any backup software except for cp and rsync.

Contact me in case you've got further questions.
Michael


happymaster23 wrote:
> Thank you for reply,
>
> because rsync is only synchronizing data (with all errors), this is
> not backup. If on main server will be some data corruption and backup
> server will connect and synchronize all data with errors, I have
> nothing :).
>
> For example - rdiff-backup is working with increments, so you can
> restore data a year back...
>
> 2009/9/4 Johnny Hughes <johnny at centos.org>:
>> On 09/04/2009 11:23 AM, happymaster23 wrote:
>>> I want mount directory of one server to another over internet. I was
>>> looking to NFS4, but there are no security mechanisms. I need
>>> encrypted connection using private key (something like SFTP).
>>>
>>> Or - if there is in CentOS repo (or EPEL) package, that can mount
>>> directory over internet using private key and make differential backup
>>> (like rdiff-backup).
>>>
>>> Thank you very much for links or other resources work up
>>
>> Why not just use rsync over ssh?