[CentOS] Rsync and differential Backups

Fri Nov 13 09:46:46 UTC 2015
J Martin Rushton <martinrushton56 at btinternet.com>

On 13/11/15 01:52, Benjamin Smith wrote:
> I did exactly this with ZFS on Linux and cut over 24 hours of
> backup lag to just minutes.
> If you're managing data at scale, ZFS just rocks...
> On Tuesday, November 10, 2015 01:16:28 PM Warren Young wrote:
>> On Nov 10, 2015, at 8:46 AM, Gordon Messmer <gordon.messmer at gmail.com> wrote:
>>> On 11/09/2015 09:22 PM, Arun Khan wrote:
>>>> You can use "newer" options of the find command and pass the
>>>> file list
>>> the process you described is likely to miss files that are
>>> modified while "find" runs.
>> Well, be fair, rsync can also miss files if files are changing
>> while the backup occurs.  Once rsync has passed through a given
>> section of the tree, it will not see any subsequent changes.
>> If you need guaranteed-complete filesystem-level snapshots, you
>> need to be using something at the kernel level that can
>> atomically collect the set of modified blocks/files, rather than
>> something that crawls the tree in user space.
>> On the BSD Now podcast, they recently told a war story about
>> moving one of the main FreeBSD servers to a new data center.
>> rsync was taking 21 hours in back-to-back runs purely due to the
>> amount of files on that server, which gave plenty of time for
>> files to change since the last run.
>> Solution?  ZFS send:
>> http://128bitstudios.com/2010/07/23/fun-with-zfs-send-and-receive/
If you really _need_ the guarantee of a snapshot, consider either an LVM
snapshot or RAID1.  With RAID1, break one mirror out of the RAID set,
back it up, then add it back and let it rebuild.  If you are paranoid you
might want to consider a 3-way RAID1, so that you still have full
mirroring during the backup.  Some commercial filesystems (such as IBM's
GPFS) also include a snapshot command, but you may need deep pockets.

Other than that, accept as harmless the fact that your backup takes a
finite time.  Provided that you record the time before starting the
sweep, and do the next incremental from that time, you will catch
all files eventually.  The time lag shouldn't be much, though: decent
backup systems scan the sources and generate a work list before
starting to move data.
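
A rough sketch of that bookkeeping, in Python using only the standard
library.  The function names and the stamp file are hypothetical, chosen
for illustration; no particular backup tool is claimed to work this way.
The point is only that the start time is recorded before the scan, the
work list is built up front, and the stamp is saved after the copy
succeeds.

```python
import os
import time
from pathlib import Path


def files_changed_since(root, since):
    """Return sorted paths under root whose mtime is at or after `since`."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime >= since:
                    changed.append(path)
            except FileNotFoundError:
                # The file vanished between listing and stat; skip it.
                continue
    return sorted(changed)


def incremental_sweep(root, stamp_file):
    """Build the work list for one run; return (work_list, sweep_start)."""
    # Record the time BEFORE scanning, so that anything modified during
    # the sweep is picked up again by the next incremental.
    sweep_start = time.time()
    try:
        last = float(Path(stamp_file).read_text())
    except FileNotFoundError:
        last = 0.0  # assumption: no stamp file means a first, full run
    return files_changed_since(root, last), sweep_start


def commit_sweep(stamp_file, sweep_start):
    """Save the sweep start time once the copy has succeeded."""
    Path(stamp_file).write_text(repr(sweep_start))
```

Typical use would be to call incremental_sweep(), copy the files in the
returned work list, and only then call commit_sweep().  A file modified
mid-sweep may be copied twice, once in each run, but it is not missed.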

OT - is ZFS part of the CentOS distro?  I did a quick
yum list | grep -i zfs and got nothing on a 7.1.1503 system.