On 12/18/2013 04:00, lists@benjamindsmith.com wrote:
I may be being presumptuous, and if so, I apologize in advance...
It sounds to me like you might consider a disk-to-disk backup solution. I could suggest dirvish, BackupPC, or our own home-rolled rsync-based solution that works rather well: http://www.effortlessis.com/backupbuddy/
Note that with these solutions you get multiple save points that are deduplicated with hardlinks, so you can (usually) keep dozens of save points in perhaps 2x the disk space of a single copy. Also, because of this, you can go back a few days / weeks / whatever when somebody deletes a file. In our case, we make the backed-up directories available via read-only FTP so that end users can recover their files.
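Roughly, the core of the trick is rsync's --link-dest option; something like this (paths invented for illustration) is all it takes per save point:

    # Each run lands in a dated directory. Files unchanged since the last
    # save point are hardlinked against it via --link-dest, so they take
    # no extra space. SRC/DEST are placeholders.
    SRC=/home
    DEST=/backups
    TODAY=$(date +%Y-%m-%d)
    LAST=$(ls -1d "$DEST"/????-??-?? 2>/dev/null | tail -n 1)
    rsync -a --delete ${LAST:+--link-dest="$LAST"} "$SRC/" "$DEST/$TODAY/"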
I don't know if dirvish offers this, but backupbuddy also allows you to run pre and post backup shell scripts, which we use (for example) for off-site archiving to permanent storage since backup save points expire.
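A post-backup hook for the off-site step can be as simple as something like this (host and paths invented):

    # Push the newest save point to permanent off-site storage before it
    # expires out of the local rotation.
    NEWEST=$(ls -1d /backups/????-??-?? | tail -n 1)
    rsync -a "$NEWEST/" archive@offsite.example.com:"/archive/$(basename "$NEWEST")/"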
-Ben
Not presumptuous at all! I have not heard of backupbuddy (or dirvish), so I should investigate. Your description makes it sound somewhat like OS X Time Machine, which I like a lot. I did try backuppc but it got a bit complex to manage, IMHO.
Thanks for the tip!
Chuck
On Wed, Dec 18, 2013 at 9:13 AM, Chuck Munro chuckm@seafoam.net wrote:
Not presumptuous at all! I have not heard of backupbuddy (or dirvish), so I should investigate. Your description makes it sound somewhat like OS X Time Machine, which I like a lot. I did try backuppc but it got a bit complex to manage, IMHO.
I've always considered backuppc to be one of those rare things that you set up once and it takes care of itself for years. If you have problems with it, someone on the backuppc mailing list might be able to help. It does tend to be slower than native rsync and especially bad at handling huge directories, but sometimes you can split up large filesystems into smaller subdirectory runs, and if necessary you can use the ClientNameAlias feature to make a single large host look like several different systems, so you can skew the full and incremental runs of different areas to different days.
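As a rough sketch of the split (names invented, BackupPC 3.x-style config):

    # Two "virtual" clients that both alias back to the same real host,
    # each covering one subtree, so their runs can be scheduled separately:
    cat > /etc/BackupPC/pc/bigserver-home.pl <<'EOF'
    $Conf{ClientNameAlias} = 'bigserver.example.com';
    $Conf{RsyncShareName}  = ['/home'];
    EOF
    cat > /etc/BackupPC/pc/bigserver-var.pl <<'EOF'
    $Conf{ClientNameAlias} = 'bigserver.example.com';
    $Conf{RsyncShareName}  = ['/var'];
    EOF

Both names also need entries in the hosts file, and each virtual client can carry its own full/incremental schedule settings to skew the runs.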
On 12/18/2013 07:50 AM, Les Mikesell wrote:
I've always considered backuppc to be one of those rare things that you set up once and it takes care of itself for years. If you have problems with it, someone on the backuppc mailing list might be able to help. It does tend to be slower than native rsync and especially bad at handling huge directories, but sometimes you can split up large filesystems into smaller subdirectory runs, and if necessary you can use the ClientNameAlias feature to make a single large host look like several different systems, so you can skew the full and incremental runs of different areas to different days.
BackupPC is a great product, and if I had known of it and it had been available when I started, I would likely have used it instead of cutting code. Now that we've got BackupBuddy working and integrated, we aren't going to switch, as it has worked wonderfully for a decade with very few issues and little oversight.
I would differentiate BackupBuddy in that there is no "incremental" and "full" distinction. All backups are "full" in the truest sense of the word, and all backups are stored as native files on the backup server. This works using rsync's hard-link option to minimize wasted disk space. This means that the recovery process is just copying the files you need. Also, it recovers gracefully from downtime and uses disk space optimistically: it will try to keep as many backup save points as the available disk space allows.
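You can see both properties directly on the backup server; something like this (paths invented):

    # Every save point looks like a complete tree, but a file that never
    # changed shares one inode across all of them:
    ls -li /backups/*/home/chuck/report.txt
    # du counts each inode once per invocation, so later save points
    # show only what actually changed:
    du -sh /backups/*
    # And recovery really is just a copy:
    cp -a /backups/2013-12-16/home/chuck/report.txt /home/chuck/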
I'm evaluating ZFS and will likely incorporate some ZFS features into BBuddy as we integrate these capabilities into our backup processes. We're free to do this in part because we have redundant backup sets, so a single failure wouldn't be catastrophic in the short/medium term.
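The appeal is that ZFS gives the same "many save points, pay only for changes" property at the filesystem level; the save-point mechanics collapse into something like this (pool name invented):

    # One snapshot per backup run instead of one hardlink forest:
    zfs snapshot tank/backups@2013-12-18
    zfs list -t snapshot                        # per-snapshot space usage
    ls /tank/backups/.zfs/snapshot/2013-12-18/  # read-only browsing for recovery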
-Ben
On Wed, Dec 18, 2013 at 3:13 PM, Lists lists@benjamindsmith.com wrote:
I would differentiate BackupBuddy in that there is no "incremental" and "full" distinction. All backups are "full" in the truest sense of the word,
For the people who don't know, backuppc builds a directory tree for each backup run where the full runs are complete and the incrementals normally only contain the changed files. However, when you access the incremental backups through the web interface or the command line tools, the backing full is automatically merged so you don't have to deal with the difference - and when using rsync as the xfer method, deletions are tracked correctly. As far as the rsync-based xfer goes, the difference between a full and incremental run is that the fulls add the --ignore-times option to force a full block-checksum compare of the file data, while the incrementals quickly skip files where the directory timestamp and length match.
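At the plain-rsync level, the distinction amounts to roughly this:

    # "Incremental": rsync's quick check skips any file whose length and
    # timestamp already match on both ends.
    rsync -a /src/ /dest/
    # "Full": --ignore-times defeats the quick check, so every file is
    # run through the block-checksum delta algorithm even if it looks
    # unchanged.
    rsync -a --ignore-times /src/ /dest/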
and all backups are stored as native files on the backup server. This works using rsync's hard-link option to minimize wasted disk space.
Backuppc normally compresses the files for even more disk savings, and it hard-links all files with identical content with its hash-based pooling mechanism. This works across targets, not just for the unchanged files in a single run, so it is great where you have copies of the same files on many hosts.
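You can see the pooling at work on the server; a pooled file carries one hardlink per place it appears in any backup of any host, so popular files accumulate large link counts (pool path varies by distro):

    # List some heavily shared pool files; the link count is the second
    # column of ls -l output:
    find /var/lib/BackupPC/cpool -type f -links +100 -exec ls -l {} + | head -n 5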
I'm evaluating ZFS and will likely include some features of ZFS into BBuddy as we integrate these capabilities into our backup processes. We're free to do this in part because we have redundant backup sets, so a single failure wouldn't be catastrophic in the short/medium term.
By the way, there is a new version of backuppc (4.0) in alpha testing that does not use hardlinks for the pooling, plus some other changes that will help make it easier to rsync the whole archive to an offsite mirror. I haven't tried it myself yet and am not sure off the top of my head whether it chunks up large files for better pooling of the unchanged portions.
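That matters because mirroring a 3.x archive means preserving all those hardlinks, e.g.:

    # -H preserves hardlinks but forces rsync to track every inode in
    # memory, which gets painful with millions of pooled files; a
    # link-free pool avoids exactly this. (Paths and host are illustrative.)
    rsync -aH /var/lib/BackupPC/ mirror.example.com:/var/lib/BackupPC/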
On 12/18/2013 03:04 PM, Les Mikesell wrote:
For the people who don't know, backuppc builds a directory tree for each backup run where the full runs are complete and the incrementals normally only contain the changed files. However, when you access the incremental backups through the web interface or the command line tools, the backing full is automatically merged so you don't have to deal with the difference - and when using rsync as the xfer method, deletions are tracked correctly.
Should I read this as "BackupPC now has its own filesystem driver"? If so, wow. Or do you mean that there are command line tools to read/copy BackupPC save points?
On 12/18/2013 3:41 PM, Lists wrote:
Should I read this as "BackupPC now has its own filesystem driver"? If so, wow. Or do you mean that there are command line tools to read/copy BackupPC save points?
Web interface, primarily. You can restore any portion of any version of any backup to the original system, or download it as a .zip or .tar file.
On Wed, Dec 18, 2013 at 5:41 PM, Lists lists@benjamindsmith.com wrote:
On 12/18/2013 03:04 PM, Les Mikesell wrote:
For the people who don't know, backuppc builds a directory tree for each backup run where the full runs are complete and the incrementals normally only contain the changed files. However, when you access the incremental backups through the web interface or the command line tools, the backing full is automatically merged so you don't have to deal with the difference - and when using rsync as the xfer method, deletions are tracked correctly.
Should I read this as "BackupPC now has its own filesystem driver"? If so, wow. Or do you mean that there are command line tools to read/copy BackupPC save points?
No, it is all application-level stuff that really only needs hardlinks to work correctly and atomically on the underlying filesystem. It does, however, have its own rsync implementation in Perl that knows how to work with compressed files on the server side while chatting with a stock rsync at the other end. There is a web interface to browse/restore (or download single files or tar/zip images) and command-line tools to extract single files or tar images. I think someone did do a read-only FUSE filesystem on top of it, though, so you could do things like grep/diff/rsync directly - but it is not part of the standard system.
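For example, a single-directory restore from the shell looks roughly like this (run as the backuppc user; install paths and the host/file names here are my invention):

    # Stream the latest (-n -1) backup of share /home for host 'myhost'
    # as a tar archive and unpack it somewhere safe:
    BackupPC_tarCreate -h myhost -n -1 -s /home /chuck | tar -xf - -C /tmp/restore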