Les Mikesell wrote:
On Tue, Nov 5, 2013 at 2:41 PM, m.roth@5-cent.us wrote:
<snip>
We have a *bunch* of databases. Oracle. MySQL. PostgreSQL. All with about a week's worth of nightly dumps, which then get backed up to the b/u servers. I can't imagine how they'd be much of a win - I don't remember off the top of my head whether they're compressed or not.
If the dumps aren't pre-compressed, they would be compressed on the backuppc side. And if there are unchanged copies on the target hosts
Right, but
(i.e. more than the current night's dumps), those would still be recognized by the rsync run as unchanged, even though backuppc is looking at the compressed copy. If you already compress on the target host, there's not much more you can do.
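To make the mechanism concrete, here is a minimal sketch of the idea - not BackupPC's actual code or pool layout, just an illustration under assumed names (POOL_DIR, store_or_skip): the pool keeps a compressed copy, but deduplication keys on a digest of the *uncompressed* content, so an unchanged dump on the target matches the pool entry and is not stored again.

    import gzip
    import hashlib
    import os

    POOL_DIR = "pool"  # hypothetical pool location, not BackupPC's real layout


    def uncompressed_digest(path):
        """Digest of the file's plain (uncompressed) contents."""
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()


    def store_or_skip(path):
        """Store a gzip'd copy in the pool unless identical content is already pooled."""
        digest = uncompressed_digest(path)
        pooled = os.path.join(POOL_DIR, digest)
        if os.path.exists(pooled):
            return "skipped (unchanged, already pooled)"
        os.makedirs(POOL_DIR, exist_ok=True)
        with open(path, "rb") as src, gzip.open(pooled, "wb") as dst:
            dst.write(src.read())
        return "stored compressed copy"

The point is only that the comparison happens on the uncompressed content, so the fact that the stored copy is compressed doesn't defeat the "unchanged" check.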
A *lot* of our data isn't huge text files - it's lots and lots of pure data files: output from things like Matlab, R, and some local programs, like the one for modeling protein folding.
Anything that isn't already compressed, encrypted, or strictly intentionally random is likely to compress 2 to 10x. Just poking through the 'compression summary' on my backuppc servers, I don't see anything less than 55%, and most of the bigger targets are closer to 80% compression. One that has 50GB of logfiles is around 90%.
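If you want a quick feel for whether a given set of data files would show numbers like that, one rough check is to run a sample of a file through zlib and report the savings. This is only an illustrative probe, not how BackupPC computes its compression summary; the level-6 setting, the sample size, and the /var/log/messages path are assumptions.

    import zlib


    def compression_savings(path, sample_bytes=16 << 20):
        """Rough savings estimate: zlib level 6 over the first sample_bytes of the file."""
        with open(path, "rb") as f:
            data = f.read(sample_bytes)
        if not data:
            return 0.0
        compressed = zlib.compress(data, 6)
        return 100.0 * (1 - len(compressed) / len(data))


    # A text log file typically reports something like 80-90%; an already-gzip'd dump, near 0%.
    print(f"{compression_savings('/var/log/messages'):.0f}% savings")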
Oh, please - I see a filesystem fill up, and I start looking for what did it so suddenly... Just the other week, one of our interns ran Matlab and created a 35G nohup.out in his home directory... which was on the same filesystem as mine, and I was Not Amused when that blew out the filesystem.
Yeah, I know - we're trying to move stuff around; that's not infrequent, given the amount of data my folks generate.
mark