[CentOS] Rsync and differential Backups

Sat Nov 14 20:02:34 UTC 2015
Gordon Messmer <gordon.messmer at gmail.com>

On 11/14/2015 03:04 AM, J Martin Rushton wrote:
> On 14/11/15 00:42, Gordon Messmer wrote:
>> For instance, it only works if you mirror a single disk.  It
>> doesn't work if you use RAID10 or RAID5, or RAID6, or RAIDZ, etc.
> That of course is exactly why I said RAID1.

I know.  And I was trying to make the point that the process of breaking 
RAID1 for backup purposes is inflexible in addition to being 
unreliable.  Users should not have to re-engineer their backup system 
for every hardware configuration.

>> Breaking RAID
>> doesn't make the data consistent, so you might have corrupt files
>> (especially if the system runs any kind of database.  SQL, LDAP,
>> etc). It doesn't make the filesystem consistent, so you might have
>> a corrupt filesystem.
> Possibly, but that is another problem altogether.  Any low level
> backup will do the same.

If you were to attempt a block-level backup of the raw device, then yes, 
you would have similar problems.  But since that is insane, and no one 
is suggesting that process, I didn't feel the need to address it.

>    You need to have an understanding of the
> filesystem to handle filesystem problems.  Even if the utility
> understands the filesystem you have problems with open files such as
> databases.

There *are* tools that exist to dump filesystems, but they're not 
intended to be used for backup, and they won't operate on mounted 
filesystems.  For instance, clonezilla includes tools to dump ext4 and 
ntfs filesystems for the purpose of cloning a system.  You could treat 
that as a backup, but you have to shut down the host OS to boot clonezilla.

> More generally, for anything except a trivial database you should use
> the database to dump itself; for instance using mysqldump.

Uhh.... no.  I'd argue the opposite.  You should only use DB dump 
tools for trivial databases (or in some cases, such as PostgreSQL, 
upgrades).  Dumping a database is *slow*.  The only thing slower than 
dumping a database is restoring a database dump.  If you have a 
non-trivial database, you definitely want to quiesce, snapshot, resume, 
and back up the snapshot.
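
To make the ordering concrete, here is a minimal sketch of the quiesce, 
snapshot, resume, back up sequence in Python.  Every callable passed in 
(freeze, thaw, take_snapshot, copy_snapshot, drop_snapshot) is a 
hypothetical placeholder, not part of any real tool; on an LVM system 
take_snapshot might wrap `lvcreate --snapshot`, but that is an assumption 
about your setup.  The only point is the ordering, and that the database 
is always resumed even if the snapshot step fails.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(freeze, thaw):
    """Keep the application quiescent for the duration of the with-block."""
    freeze()
    try:
        yield
    finally:
        thaw()  # always resume, even if taking the snapshot raised


def snapshot_backup(freeze, thaw, take_snapshot, copy_snapshot, drop_snapshot):
    """Quiesce, snapshot, resume, then back up from the snapshot.

    All five arguments are caller-supplied callables (hypothetical names).
    """
    # The application is only paused for as long as the snapshot takes.
    with quiesced(freeze, thaw):
        snap = take_snapshot()
    # The slow copy runs against the snapshot while the application is live.
    try:
        copy_snapshot(snap)
    finally:
        drop_snapshot(snap)  # do not leave stale snapshots behind
    return snap
```

The slow part (copying) happens after the thaw, so the database is held 
still only for the moment it takes to create the snapshot.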

> Have a
> look at the page
> https://mariadb.com/kb/en/mariadb/backup-and-restore-overview/ for (as
> it says) an overview.  Try running a database backup timed to complete
> before your normal filesystem backups run, whatever method you use.

Again, you seem entirely too willing to accept unreliable processes.  
Timing?  You should absolutely, under no circumstances, trust the timing 
of two processes to not overlap.  If you're dumping data, you should 
either trigger the backup from the dump job, after it completes, or you 
should employ a locking system so that only one of the two processes can 
run at a time.
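
A minimal sketch of both options in Python, assuming a POSIX system where 
flock() is available.  The names exclusive_lock and dump_then_backup, and 
the dump and backup callables, are hypothetical.  The backup is triggered 
from the dump job only after the dump returns, and a shared lock file 
keeps a second job from starting while the first still holds the lock.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive flock on `path` for the duration of the with-block.

    Raises BlockingIOError immediately if another holder has the lock.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def dump_then_backup(dump, backup, lock_path):
    """Run the dump, then the backup, under a single lock.

    `dump` and `backup` are caller-supplied callables (hypothetical).
    """
    with exclusive_lock(lock_path):
        dump()    # if this raises, the backup never runs
        backup()  # starts only after the dump has completed
```

Any other job that touches the same data would take the same lock file, 
so it either waits its turn or fails loudly instead of overlapping.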

> Remember that this is a last resort if (1) the user can't accept more
> sensible backups and handle (or let the backup handle) the dates
> safely; (2) the user insists on a snapshot; (3) the user can't use a
> filesystem snapshot (ZFS, GPFS etc) and (4) the user can't/won't use
> LVM.  You can't refuse to use better solutions and then complain that
> last resort is not as good as the better solutions!

No one is refusing better solutions.  You are tilting at windmills.

> See the comments about using better solutions.  I'd be worried though
> if you use a solution that doesn't remove the backup media from the
> vicinity of the machine.  Fine if you have a remote site

We agree, there.  You should have backups in a physically separate location.