We are presently looking into alternative backup strategies for our networked servers and are considering Bacula. Does anyone have any opinions on this application, good and bad, to share? Further, is there a CentOS4 specific rpm build available for this in a yum repository (I note that CentOS4 tags have been added to the Bacula source tree)?
Regards, Jim
--
*** e-mail is not a secure channel ***
mailto:byrnejb.<token>@harte-lyne.ca
James B. Byrne, Harte & Lyne Limited
vox: +1 905 561 1241, fax: +1 905 561 0757
9 Brockley Drive, Hamilton, Ontario, Canada L8E 3C3
<token> = hal
On Tue, 2 Aug 2005 at 10:25am, James B. Byrne wrote
We are presently looking into alternative backup strategies for our networked servers and are considering Bacula. Does anyone have any opinions on this application, good and bad, to share? Further, is there a CentOS4 specific rpm build available for this in a yum repository (I note that CentOS4 tags have been added to the Bacula source tree)?
I'm a long time amanda user, so I may be a bit biased. I looked into bacula a month or so ago for 2 reasons -- 1) tape spanning support (which amanda has only in experimental patches), and 2) native ACL support (amanda uses native tools like tar or dump to actually get the bits off the disk, so ACL support is up to them). I decided against bacula pretty quickly, though, because the scheduling facilities of it are, well, non-existent. You have to make all the scheduling decisions yourself. If you're backing up a small-moderate amount of data, that's OK. I back up 4.5TB of formatted space, which just expanded to 10TB. I don't want to decide, for each backup list item, when to do a full and when to do an incremental. Amanda does all that for me, and does a very good job of it.
My $0.02 worth, YMMV, etc., etc..
On Tue, Aug 02, 2005 at 10:38:09AM -0400, Joshua Baker-LePain wrote:
On Tue, 2 Aug 2005 at 10:25am, James B. Byrne wrote
We are presently looking into alternative backup strategies for our networked servers and are considering Bacula. Does anyone have any opinions on this application, good and bad, to share? Further, is there a CentOS4 specific rpm build available for this in a yum repository (I note that CentOS4 tags have been added to the Bacula source tree)?
I'm a long time amanda user, so I may be a bit biased. I looked into bacula a month or so ago for 2 reasons -- 1) tape spanning support (which amanda has only in experimental patches), and 2) native ACL support (amanda uses native tools like tar or dump to actually get the bits off the disk, so ACL support is up to them). I decided against bacula pretty quickly, though, because the scheduling facilities of it are, well, non-existent. You have to make all the scheduling decisions yourself.
I've also used amanda in the past, and looked into bacula for the tape spanning support as well. However, I was not turned off by having to set up the schedules manually, and have been using bacula for several months to back up ~15TB. Aside from the tape spanning support (which I think is maturing in amanda), I've found having the catalog in a true database to be a great feature, particularly when a user inevitably requests that files (the names of which they only vaguely remember) be restored.
Cheers, Bryan Cardillo Penn Bioinformatics Core University of Pennsylvania
On Tue, 2 Aug 2005 at 11:50am, Bryan Cardillo wrote
On Tue, Aug 02, 2005 at 10:38:09AM -0400, Joshua Baker-LePain wrote:
I'm a long time amanda user, so I may be a bit biased. I looked into bacula a month or so ago for 2 reasons -- 1) tape spanning support (which amanda has only in experimental patches), and 2) native ACL support (amanda uses native tools like tar or dump to actually get the bits off the disk, so ACL support is up to them). I decided against bacula pretty quickly, though, because the scheduling facilities of it are, well, non-existent. You have to make all the scheduling decisions yourself.
I've also used amanda in the past, and looked into bacula for the tape spanning support as well. However, I was not turned off by having to set up the schedules manually, and have been using bacula for several months to back up ~15TB. Aside from the tape spanning support (which I think is maturing in amanda), I've found having the catalog in a true database to be a great feature, particularly when a user inevitably requests that files (the names of which they only vaguely remember) be restored.
So, how do you set the schedule? And when do your backups run (what's the window)?
On 8/2/05 8:54 AM, Joshua Baker-LePain wrote:
So, how do you set the schedule? And when do your backups run (what's the window)?
Define a schedule, e.g.,
Schedule {
  Name = GFS
  Run = Level=Full Pool=monthly-tape 1st saturday at 21:30
  Run = Level=Full Pool=weekly-tape 2nd-5th saturday at 21:30
  Run = Level=Incremental Pool=daily-tape sun-fri at 23:30
}
Then tell your job to use it:
Job {
  ...
  Schedule = GFS
  ...
}
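To make the effect of that schedule concrete, here is a rough Python sketch (not Bacula code) that reports which level, pool, and start time the three Run lines above would select for a given calendar day. The function name `gfs_run` is made up for illustration, and it assumes "1st saturday" means the Saturday falling on days 1-7 of the month, "2nd" the one on days 8-14, and so on.

```python
from datetime import date


def gfs_run(day: date) -> tuple:
    """Return (level, pool, start_time) for the GFS schedule shown above.

    Hypothetical helper: it mirrors the three Run lines, nothing more.
    """
    if day.weekday() == 5:  # Saturday (Monday is 0)
        # Assumed meaning of 1st/2nd/...: which 7-day block of the month the day is in.
        nth_saturday = (day.day - 1) // 7 + 1
        pool = "monthly-tape" if nth_saturday == 1 else "weekly-tape"
        return ("Full", pool, "21:30")
    # Sunday through Friday get the nightly incremental.
    return ("Incremental", "daily-tape", "23:30")


if __name__ == "__main__":
    for d in (date(2005, 8, 2), date(2005, 8, 6), date(2005, 8, 13)):
        print(d.isoformat(), gfs_run(d))
```

Running it for the first two weeks of a month is a quick way to sanity-check that every night has exactly one job and that the first Saturday lands on the monthly pool.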
On Tue, 2 Aug 2005 at 8:59am, Paul Heinlein wrote
On 8/2/05 8:54 AM, Joshua Baker-LePain wrote:
So, how do you set the schedule? And when do your backups run (what's the window)?
Define a schedule, e.g.,
Schedule {
  Name = GFS
  Run = Level=Full Pool=monthly-tape 1st saturday at 21:30
  Run = Level=Full Pool=weekly-tape 2nd-5th saturday at 21:30
  Run = Level=Incremental Pool=daily-tape sun-fri at 23:30
}
Then tell your job to use it:
Job {
  ...
  Schedule = GFS
  ...
}
So how long do your backups typically run on Sun-Fri, and on Sat? How much do you back up on those runs?
On Tue, Aug 02, 2005 at 11:54:04AM -0400, Joshua Baker-LePain wrote:
On Tue, 2 Aug 2005 at 11:50am, Bryan Cardillo wrote
I've also used amanda in the past, and looked into bacula for the tape spanning support as well. However, I was not turned off by having to set up the schedules manually, and have been using bacula for several months to back up ~15TB.
So, how do you set the schedule? And when do your backups run (what's the window)?
Monthly fulls run on the first weekend of the month; they start early Saturday am and sometimes run until Monday am. Weekly diffs and daily incrementals run in an 8 hour nightly window. I should also mention, we're using disk spooling, an LTO-3 drive (soon to be two), hardware compression, and a dedicated gig network.
--Bryan
On 8/2/05 7:25 AM, James B. Byrne wrote:
We are presently looking into alternative backup strategies for our networked servers and are considering Bacula. Does anyone have any opinions on this application, good and bad, to share? Further, is there a CentOS4 specific rpm build available for this in a yum repository (I note that CentOS4 tags have been added to the Bacula source tree)?
Our main backup server runs bacula on CentOS 3. Storage is handled by a Dell PowerVault 122T with a single LTO-1 drive. Obviously, we don't handle a huge amount of data. :-)
In general, I like bacula. It works and plays well with our library, automatically swapping tapes as necessary. We currently back up a mix of Linux and Mac OS X hosts with no trouble. Unlike other backup programs I've used, you can specify file inclusions and exclusions with regular expressions, which can be helpful.
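As a generic illustration of regex-driven selection (this is plain Python, not Bacula's FileSet syntax or its matching code), the sketch below keeps a path when it matches any include pattern and no exclude pattern. The function name `select`, the sample patterns, and the rule that excludes win over includes are all assumptions made for the example.

```python
import re


def select(paths, include, exclude):
    """Keep paths that match at least one include regex and no exclude regex."""
    inc = [re.compile(p) for p in include]
    exc = [re.compile(p) for p in exclude]
    return [
        p for p in paths
        if any(r.search(p) for r in inc) and not any(r.search(p) for r in exc)
    ]


if __name__ == "__main__":
    candidates = [
        "/home/a/notes.txt",
        "/home/a/.cache/x",
        "/var/tmp/y.tmp",
        "/etc/foobar.conf",
    ]
    # Back up /home and /etc, but skip cache directories and *.tmp files.
    print(select(candidates, [r"^/home/", r"^/etc/"], [r"/\.cache/", r"\.tmp$"]))
```

Patterns like these tend to be much shorter than enumerating directories by hand, which is presumably why the feature is handy.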
At the same time, there's no doubt it feels sort of rough hewn. I don't know if it was just our situation or not, but I ended up writing my configuration file from scratch, discarding nearly all of the sample configurations. Once in place, however, the config files are easy to manage with tools like cfengine or rsync.
As far as downsides go, there's no native wire-level security. Backups will cross your network in plain text unless you hack in some stunnel support. Make sure your backup server has plenty of RAM; having a swath of local disk for spooling won't hurt either.
I should note that, to date, we've relied completely on the command-line tools; I don't have any experience at all with the GUI frontends.
Quoting Paul Heinlein heinlein@madboa.com:
Our main backup server runs bacula on CentOS 3. Storage is handled by a Dell PowerVault 122T with a single LTO-1 drive. Obviously, we don't handle a huge amount of data. :-)
I'm also looking (well, planning to look) into Bacula as an alternative to Amanda. Couple of questions.
Can I have my backups go to the disk, instead of using tapes? Can I have some on disk, and some on tapes? On-site disk, off-site tapes combination?
Another question is about restores. How easy/automated are they? Amanda has a database of what file system is on what tape (and where on the tape). And that's about it. It has no idea of what files are actually contained in there, what versions of the file, or where inside the dump/tar the files are. This makes restores a tedious and slow job, since 90% of the time only a single file (or set of files) is needed from backup, not the entire file system.
I'm looking for something more like the Veritas and Legato solutions, where the backup server keeps tabs on the actual files (what version, on which tape, and location on the tape). Is it easy to browse for a particular file that I need to restore, and then simply have Bacula do the rest (load the correct tape, wind it to where the file is stored and extract it)?
The ability to tell backup system "I want server.mycorp.com's /etc/foobar.conf from two months ago restored" and then simply sit back and relax while backup server is doing all the job automatically in the background is priceless. Is Bacula able to do that?
---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program.
On Tue, Aug 02, 2005 at 10:56:54AM -0500, Aleksandar Milivojevic enlightened us:
Another question is about restores. How easy/automated are they? Amanda has a database of what file system is on what tape (and where on the tape). And that's about it. It has no idea of what files are actually contained in there, what versions of the file, or where inside the dump/tar the files are. This makes restores a tedious and slow job, since 90% of the time only a single file (or set of files) is needed from backup, not the entire file system.
Apparently you've never looked at the amrecover program...
I'm looking for something more like the Veritas and Legato solutions, where the backup server keeps tabs on the actual files (what version, on which tape, and location on the tape). Is it easy to browse for a particular file that I need to restore, and then simply have Bacula do the rest (load the correct tape, wind it to where the file is stored and extract it)?
The ability to tell backup system "I want server.mycorp.com's /etc/foobar.conf from two months ago restored" and then simply sit back and relax while backup server is doing all the job automatically in the background is priceless. Is Bacula able to do that?
# amrecover -C Dailies
sethost server.mycorp.com
setdate 2005-06-02
setdisk /
cd etc
add foobar.conf
extract
Sit back and relax :-)
Of course, you have to have indexing enabled, etc, but amanda can most certainly do it.
Matt
Quoting Matt Hyclak hyclak@math.ohiou.edu:
Apparently you've never looked at the amrecover program...
Apparently I haven't ;-)
On 8/2/05 8:56 AM, Aleksandar Milivojevic wrote:
I'm also looking (well, planning to look) into Bacula as an alternative to Amanda. Couple of questions.
Can I have my backups go to the disk, instead of using tapes? Can I have some on disk, and some on tapes? On-site disk, off-site tapes combination?
Yes to all of the above.
Another question is about restores. How easy/automated are they?
If you've ever used the restore(8) shell, then you'll feel right at home. Essentially, you get a command-line view of the backed up file system. You can navigate with cd, ls, find, etc. You choose the files and/or directory trees to restore, specify where they should be restored, make sure the correct tapes are accessible, and voila!
If you've got multiple copies of a file, you can choose which version you want restored.
The ability to tell backup system "I want server.mycorp.com's /etc/foobar.conf from two months ago restored" and then simply sit back and relax while backup server is doing all the job automatically in the background is priceless. Is Bacula able to do that?
Pretty much, though the trick is in the telling. I haven't investigated whether it's possible to script that sort of thing; I've only used the restore shell.
On Tue, 2005-08-02 at 10:56, Aleksandar Milivojevic wrote:
I'm also looking (well, planning to look) into Bacula as an alternative to Amanda. Couple of questions.
Can I have my backups go to the disk, instead of using tapes? Can I have some on disk, and some on tapes? On-site disk, off-site tapes combination?
If you are only using disk - or you can leave amanda running the tapes but plan to only restore from them in a disaster situation, consider backuppc (http://backuppc.sourceforge.net/). It is basically 'full auto' on the backup side - even more than amanda since you don't even have to swap tapes. It uses an interesting scheme of compression and hardlinks to greatly reduce the storage space you need, and it gives you a web-browsable interface to select files to restore or download through the browser.
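The hardlink idea is easy to show in miniature. The Python sketch below is only an illustration of content pooling with hard links and is not how BackupPC itself is implemented: it skips compression entirely, keys the pool on a SHA-256 digest chosen for the example, and the name `store` and the directory layout are invented. Identical files backed up from different hosts end up as extra names for one pooled copy, so they cost no additional data blocks.

```python
import hashlib
import os
import shutil


def store(pool_dir, backup_dir, src_path):
    """Place src_path in backup_dir as a hard link to a single pooled copy.

    Hypothetical helper: pool_dir and backup_dir must be on the same
    filesystem, because hard links cannot cross filesystems.
    """
    with open(src_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()

    os.makedirs(pool_dir, exist_ok=True)
    pooled = os.path.join(pool_dir, digest)
    if not os.path.exists(pooled):
        # First time this content is seen: keep one real copy in the pool.
        shutil.copyfile(src_path, pooled)

    os.makedirs(backup_dir, exist_ok=True)
    dest = os.path.join(backup_dir, os.path.basename(src_path))
    os.link(pooled, dest)  # a new name for the pooled data, no extra copy
    return dest
```

Calling `store` twice with the same file and two different backup directories leaves one file in the pool and two links to it, which is the effect that lets unchanged files across hosts and across backup runs share space.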
The ability to tell backup system "I want server.mycorp.com's /etc/foobar.conf from two months ago restored" and then simply sit back and relax while backup server is doing all the job automatically in the background is priceless. Is Bacula able to do that?
Within the range of what you can keep online (which turns out to be about 8x more than you could without the compression/links), backuppc will do that.
Aleksandar Milivojevic alex@milivojevic.org wrote:
Can I have my backups go to the disk, instead of using tapes? Can I have some on disk, and some on tapes? On-site disk, off-site tapes combination?
Paul Heinlein heinlein@madboa.com wrote:
Yes to all of the above.
Bryan Cardillo dillo+centos@seas.upenn.edu wrote:
Monthly fulls run on the first weekend of the month; they start early Saturday am and sometimes run until Monday am. Weekly diffs and daily incrementals run in an 8 hour nightly window. I should also mention, we're using disk spooling, an LTO-3 drive (soon to be two), hardware compression, and a dedicated gig network.
It's coincidental that the "disk and tape" as well as "backup window" concepts were discussed today, because Sys Admin just came out with its new issue on storage: http://www.samag.com/
I've been spoiled by the FalconStor VTL (Virtual Tape Library) solution.
Now I don't expect Freedomware solutions to offer the same level of "transparent tape virtualization," but I would be interested in any Open Source framework that uses a combination of near-line disk and off-line tape approaches.
E.g.,
1) nodes only rsync diffs over the network to the "host" backup server
2) the "host" backup server manages those diffs/volumes as near-line
3) the "host" backup server commits to tape for off-line at the discretion of the sys/netadmin
Any Freedomware solution that makes this easy to manage is definitely of great interest to me. I regularly go to clients who believe they "have to" not only commit to tape, but do it in real-time (end-system to end-tape) and only during their backup windows (instead of sending diffs over the window, and then letting a backup server commit as scheduled).
On Tue, 2005-08-02 at 17:02, Bryan J. Smith wrote:
Now I don't expect Freedomware solutions to offer the same level of "transparent tape virtualization," but I would be interested in any Open Source framework that uses a combination of near-line disk and off-line tape approaches.
E.g.,
- nodes only rsync diffs over network to the "host" backup server
Backuppc is probably the only thing that can do this even though the backup-server side only stores compressed files. It has a custom implementation of rsync to manage this while talking to standard versions on the target machines. It can also use tar or smbtar for the transport.
- the "host" backup server manages those diffs/volumes as near-line
It provides web access to browse/restore/download files.
- the "host" backup server commits to tape for off-line at the discretion of the sys/netadmin
Backuppc allows you to write an 'archive' of a host manually, either to tape or as a compressed tar split into files sized to fit CDs or DVDs. This is sort of an afterthought though. I just periodically mirror the disk to an external drive to keep copies of the whole thing offsite. Since it keeps about 8x the data on disk with its compression and linking scheme, it is really easier to deal with it on disks.