[CentOS] Rsync and differential Backups

Sat Nov 14 11:04:29 UTC 2015
J Martin Rushton <martinrushton56 at btinternet.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Have a coffee or a beer, breathe deeply, then:

On 14/11/15 00:42, Gordon Messmer wrote:
> On 11/13/2015 12:59 PM, J Martin Rushton wrote:
>> Maybe I should have been clearer: use (LVM) OR (RAID1 and
>> break).
> 
> I took your meaning.  I'm saying that's a terrible backup strategy,
> for a list of reasons.
> 
> For instance, it only works if you mirror a single disk.  It
> doesn't work if you use RAID10 or RAID5, or RAID6, or RAIDZ, etc.

That of course is exactly why I said RAID1.

> Breaking RAID
> doesn't make the data consistent, so you might have corrupt files 
> (especially if the system runs any kind of database.  SQL, LDAP,
> etc). It doesn't make the filesystem consistent, so you might have
> a corrupt filesystem.

Possibly, but that is another problem altogether.  Any low level
backup will do the same.  You need to have an understanding of the
filesystem to handle filesystem problems.  Even if the utility
understands the filesystem you have problems with open files such as
databases.

More generally, for anything except a trivial database you should use
the database to dump itself; for instance using mysqldump.  Have a
look at the page
https://mariadb.com/kb/en/mariadb/backup-and-restore-overview/ for (as
it says) an overview.  Try running a database backup timed to complete
before your normal filesystem backups run, whatever method you use.
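
As a rough sketch of "dump first, then back up", here is how the dump step
might be wrapped in Python.  The assumptions are mine: the account name
"backup" and the output path are hypothetical, and credentials are expected
to come from an option file such as ~/.my.cnf rather than the command line.

```python
import subprocess
from pathlib import Path


def build_dump_command(user: str) -> list[str]:
    """Return a mysqldump command line that dumps every database."""
    return [
        "mysqldump",
        "--all-databases",
        # Dump transactional (InnoDB) tables from one consistent snapshot.
        "--single-transaction",
        f"--user={user}",
    ]


def dump_database(dump_file: Path, user: str = "backup") -> None:
    """Write the SQL dump to dump_file; raises CalledProcessError on failure."""
    with dump_file.open("wb") as out:
        subprocess.run(build_dump_command(user), stdout=out, check=True)


# Example call (hypothetical path, somewhere the filesystem backup will see):
#   dump_database(Path("/var/backups/all-databases.sql"))
```

Schedule something like this early enough that the dump file is complete
and closed before the filesystem backup starts reading it.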

> 
> Even if you ignore the potential for corruption, you have a backup 
> process that only works on some specific hardware configurations. 
> Everything else has to have a different backup solution.  That's 
> insane.  Use one backup process that works for everything.  You're
> much more likely to consistently back up your data that way.

Remember that this is a last resort if (1) the user can't accept more
sensible backups and handle (or let the backup handle) the dates
safely; (2) the user insists on a snapshot; (3) the user can't use a
filesystem snapshot (ZFS, GPFS etc.); and (4) the user can't/won't use
LVM.  You can't refuse to use "better solutions" and then complain that
the last resort is "not as good as the better solutions"!

>> I hope I'm wrong, but you wouldn't be thinking of mounting the
>> broken out copy on the same system would you?  You must never
>> do that, not even during disaster recovery.  Use dd or similar on
>> the disk, not the mounted partitions - isn't that obvious?  I
>> wasn't trying to give step by step instructions.
> 
> Well, that's *one* of the problems with your advice.  Even if we
> ignore the fact that it doesn't work reliably (and IMO, it
> therefore doesn't work), it's far more complicated than you pretend
> it is.
> 
> Because now you're talking about quiescing your services, breaking
> your RAID, physically removing the drive, connecting it to another
> system, fsck the filesystems, mount them, and backing up the data.
> For each backup.  Every day.

No need to remove the drive if you handle the whole disk.  When we used
this technique we only did it monthly - it would be pretty crazy to do
level 0 backups daily.
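
Handling the whole disk with dd or similar amounts to a block-by-block copy
of the detached device.  A minimal sketch in Python, assuming the broken-out
mirror half is readable as a single device node; the paths in the usage
comment are hypothetical, and the checksum is my addition so the image can
be verified later.

```python
import hashlib


def copy_image(source: str, target: str, chunk_size: int = 4 * 1024 * 1024) -> str:
    """Copy source to target chunk by chunk; return the SHA-256 of the data."""
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(target, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:          # end of device or file
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


# Example call (hypothetical device and image paths):
#   copy_image("/dev/sdb", "/backup/monthly-level0.img")
```

This copies the raw device, never the mounted partitions, so it knows
nothing about the filesystem inside the image.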

> Or using 'dd' and... backing up the whole image?  No incremental
> or differentials?

See my previous answer.

> Your process involves a human being doing physical tasks as part of
> the backup.  Maybe I'm the only one, but I want my backups fully
> automated. People make mistakes.  I don't want them involved in
> regular processes. In fact, the entire point of computing is that
> the computer should do the work so that I don't have to.

See the comments about using better solutions.  I'd be worried though
if you use a solution that doesn't remove the backup media from the
vicinity of the machine.  Fine if you have a remote site, but
otherwise you still need a person to physically take the tapes (or
whatever) out of the machine room to fireproof storage.  That's pretty
manual.

>> Way before LVM existed we used this technique to back up VAXes
>> (and later Alphas) under VMS using "volume shadowing" (ie RAID1).
>> It worked quite happily for several years with disks shared
>> across the cluster. IIRC it was actually recommended by DEC,
>> indeed a selling point, but I don't have any manuals to hand to
>> confirm that nowadays! One thing I did omit was you MUST sync
>> first
> 
> sync flushes the OS data buffers to disk, but it does not sync 
> application data buffers, it does not flush the journal, it doesn't
> make filesystems "clean", and even if you break the RAID volume
> immediately after "sync" there's no guarantee that there weren't
> cached writes from other processes in between those two steps.

The journal is a fair point if it is stored on a separate spindle, as
for instance is possible under XFS.
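
The quoted point about application buffers can be shown with a small
Python sketch (Unix only, since it relies on os.sync).  It is only an
illustration: a payload smaller than Python's write buffer stays inside
the process until the program itself flushes it, so a system-wide sync
beforehand cannot put it on disk.

```python
import os
import tempfile


def sizes_around_sync(payload: bytes) -> tuple[int, int]:
    """Return the file size after os.sync() alone, then after flush + fsync."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        with open(path, "wb") as f:     # buffered writer: data sits in the process
            f.write(payload)
            os.sync()                   # flushes kernel buffers only
            before = os.path.getsize(path)
            f.flush()                   # hand the application buffer to the kernel
            os.fsync(f.fileno())        # ask the kernel to commit this file to disk
            after = os.path.getsize(path)
    return before, after
```

For a small payload the first size is 0 and the second is the payload
length, which is why a database should be dumped or quiesced by its own
tools rather than trusted to a bare sync.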

> There is absolutely no way to make this a reliable process without
> a full shutdown.

Not IME.  At that time the preferred method for monthly backups was a
shutdown and a standalone utility for disk-to-disk copies, but that was
not always possible.  The technique worked.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJWRxU9AAoJEAF3yXsqtyBldK8P+gMocnEFL0d5ciFhl/QUj50V
z1GU4zMOhJeVZgS+KW2WM48/YYd9XdTX82G3352UEbnwOd7OmWkt3JhQ5QsmeZRP
F1AwmetHCt0+RtQli9uAPywGvPtnc7ROJPEznZa97YJU4G56/8sEqxA26On5G2h9
uCNUG69dyI4yhAH/liW76iJWRZt6TJQVKaHeMXUX9lqdTACZ64WCWAS+dJACmMiA
mrOYFUbey5EBRHcqlXYX4Az3O/9btD2++bTdqqMJ3BN8Q7NF3pbfrxVvqeghR8mV
kqkFs5W7kk4xbJS+yMgbMnPkE4LpCxgIDBpKg/7pLqYVBjs91TqzSXWWVGluAdM4
5I4mI5lbvqA+OZjV5sIfKhyv+SfrQJm0Y6+FXZjPq1ul9xVbi9DWYMVEJGeRc+Gj
bbU83nnK7L01i6yEANP6UIN07BKfciAwrDHy6VZBJsQn4cM2ce0YGgKMlobfgp1D
XFuR2RncDzcgpVEhz9r4nsc9vVt3WRQrk4KcxP1AA5VjMR6YD8wS47Ssox1nNnx8
T85DCupNZsXIlUp7AqWiSZTLYx9O9Ulkdhpt2uUx4/aC0GIUdNnyGEpcyHGlkI3K
FAYXSYF5nEukpU5km0iX67vAcJe9EjfiuEIwp0w25YdNIQYzOI/HnuPgSTEsO1au
J9hexRZOa30aSACVye8S
=r9jt
-----END PGP SIGNATURE-----