I'm trying to streamline a backup system using ZFS. In our situation, we're writing pg_dump files repeatedly, each file being highly similar to the previous one. Is there a file system (e.g., ext4? xfs?) that, when re-writing a similar file, will write only the changed blocks and not rewrite the entire file to a new set of blocks?
Assume that we're writing a 500 MB file with only 100 KB of changes. Other than using a utility like diff, is there a file system that would only write 100 KB and not 500 MB of data? In concept, this would work similarly to using the 'diff' utility...
-Ben
Lists wrote:
I'm trying to streamline a backup system using ZFS. In our situation, we're writing pg_dump files repeatedly, each file being highly similar to the previous one. Is there a file system (e.g., ext4? xfs?) that, when re-writing a similar file, will write only the changed blocks and not rewrite the entire file to a new set of blocks?
Assume that we're writing a 500 MB file with only 100 KB of changes. Other than using a utility like diff, is there a file system that would only write 100 KB and not 500 MB of data? In concept, this would work similarly to using the 'diff' utility...
I think the buzzword you want is dedup.
mark
On 07/02/2014 12:57 PM, m.roth@5-cent.us wrote:
I think the buzzword you want is dedup.
dedup works at the file level. Here we're talking about files that are highly similar but not identical. I don't want to rewrite an entire file that's 99% identical to the new version; I just want to write a small set of changes. I'd use ZFS to keep track of which blocks change over time.
I've been asking around, and it seems this capability doesn't exist *anywhere*.
-Ben
dedup works at the file level. Here we're talking about files that are highly similar but not identical. I don't want to rewrite an entire file that's 99% identical to the new version; I just want to write a small set of changes. I'd use ZFS to keep track of which blocks change over time.
I've been asking around, and it seems this capability doesn't exist *anywhere*.
Check this link.
https://blogs.oracle.com/bonwick/entry/zfs_dedup
...
*What to dedup: Files, blocks, or bytes?*
Data can be deduplicated at the level of files, blocks, or bytes. ...
Lists wrote:
dedup works at the file level. Here we're talking about files that are highly similar but not identical. I don't want to rewrite an entire file that's 99% identical to the new version; I just want to write a small set of changes. I'd use ZFS to keep track of which blocks change over time.
I've been asking around, and it seems this capability doesn't exist *anywhere*.
I was under the impression from a few years ago that at least the then-commercial versions operated at the block level, *not* at the file level. rsync works at the file level, and dedup is supposed to be fancier.
mark
On Thu, Jul 3, 2014 at 2:06 PM, m.roth@5-cent.us wrote:
I was under the impression from a few years ago that at least the then-commercial versions operated at the block level, *not* at the file level. rsync works at the file level, and dedup is supposed to be fancier.
Yes, basically it would keep a table of hashes of the content of existing blocks and do something magic to map writes of new matching blocks to the existing copy at the file system level. Whether that turns out to be faster/better than something like rdiff-backup would be up to the implementations. Oh, and I forgot to mention that there is an alpha version of backuppc4 at http://sourceforge.net/projects/backuppc/files/backuppc-beta/4.0.0alpha3/ that is supposed to do deltas between runs.
But, since this is about postgresql, the right way is probably just to set up replication and let it send the changes itself instead of doing frequent dumps.
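If I remember right, for a 9.x master/standby pair that is roughly the following (the host name and replication user below are just examples):

# on the master, postgresql.conf:
wal_level = hot_standby
max_wal_senders = 3
# plus a 'replication' entry for the standby in pg_hba.conf

# on the standby, recovery.conf:
standby_mode = 'on'
primary_conninfo = 'host=db-master user=replicator'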
On Wed, Jul 2, 2014 at 2:53 PM, Lists lists@benjamindsmith.com wrote:
I'm trying to streamline a backup system using ZFS. In our situation, we're writing pg_dump files repeatedly, each file being highly similar to the previous one. Is there a file system (e.g., ext4? xfs?) that, when re-writing a similar file, will write only the changed blocks and not rewrite the entire file to a new set of blocks?
Assume that we're writing a 500 MB file with only 100 KB of changes. Other than using a utility like diff, is there a file system that would only write 100 KB and not 500 MB of data? In concept, this would work similarly to using the 'diff' utility...
There is something called rdiff-backup (http://www.nongnu.org/rdiff-backup/ and packaged in EPEL) that does reverse diffs at the application level. If it performs well enough it might be easier to manage than a de-duping filesystem. Or backuppc - which would store a complete copy if there are any changes at all between dumps but would compress them and automatically manage the number you need to keep.
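Untested here, but usage would be along these lines (paths and host are made up):

# mirror the dump directory, keeping reverse increments
rdiff-backup /var/backups/pgdump backuphost::/srv/rdiff/pgdump
# pull back a dump as it looked 7 days ago
rdiff-backup -r 7D backuphost::/srv/rdiff/pgdump/mydb.sql /tmp/mydb-7daysago.sql
# expire increments older than 8 weeks
rdiff-backup --remove-older-than 8W backuphost::/srv/rdiff/pgdump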
On 7/2/2014 12:53 PM, Lists wrote:
I'm trying to streamline a backup system using ZFS. In our situation, we're writing pg_dump files repeatedly, each file being highly similar to the previous one. Is there a file system (e.g., ext4? xfs?) that, when re-writing a similar file, will write only the changed blocks and not rewrite the entire file to a new set of blocks?
Assume that we're writing a 500 MB file with only 100 KB of changes. Other than using a utility like diff, is there a file system that would only write 100 KB and not 500 MB of data? In concept, this would work similarly to using the 'diff' utility...
You do realize that adding, removing, or even changing the length of a single line in that pg_dump file will change every block after it, since the data will be offset?
May I suggest that, instead of pg_dump, you use pg_basebackup and WAL archiving? This is the best way to do delta backups of an SQL database server.
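Something along these lines (paths are made up, and the exact options vary with your PG version):

# postgresql.conf:
wal_level = archive
archive_mode = on
archive_command = 'cp %p /mnt/backup/wal/%f'

# plus a periodic base backup, e.g. nightly or weekly:
pg_basebackup -D /mnt/backup/base/$(date +%F) -Ft -z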
On 03.07.2014 at 21:19, John R Pierce pierce@hogranch.com wrote:
You do realize that adding, removing, or even changing the length of a single line in that pg_dump file will change every block after it, since the data will be offset?
May I suggest that, instead of pg_dump, you use pg_basebackup and WAL archiving? This is the best way to do delta backups of an SQL database server.
Additionally, I’d be extremely careful with ZFS dedup.
It uses much more memory than "normal" ZFS and tends to consume more I/O, too.
On 07/03/2014 12:19 PM, John R Pierce wrote:
You do realize that adding, removing, or even changing the length of a single line in that pg_dump file will change every block after it, since the data will be offset?
Yes. And I guess this is probably where the conversation should end. I'm used to the capabilities of Mercurial DVCS as well as ZFS snapshots, and was thinking/hoping that this type of capability might exist in a file system. Perhaps it just doesn't belong there.
On 07/03/2014 12:23 PM, Les Mikesell wrote:
But, since this is about postgresql, the right way is probably just to set up replication and let it send the changes itself instead of doing frequent dumps.
Whatever we do, we need the ability to create a point-in-time history. We commonly use our archival dumps for audit, testing, and debugging purposes. I don't think PG + WAL provides this type of capability. So at the moment we're down to:
A) run PG on a ZFS partition and snapshot ZFS.
B) Keep making dumps (as now) and use lots of disk space.
C) Cook something new and magical using diff, rdiff-backup, or related tools.
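For option A, that would be something like this (pool/dataset names are made up):

zfs snapshot tank/pgdata@$(date +%Y%m%d-%H%M)      # cheap, block-level, point in time
zfs list -t snapshot                               # see how much space each snapshot really holds
zfs clone tank/pgdata@20140703-1200 tank/pg-audit  # spin up a snapshot as a writable copy for audit/testing

As I understand it, a snapshot of a live data directory is only crash-consistent, so Postgres would have to replay its own WAL when starting on the clone; that would need testing.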
-Ben
On Thu, Jul 3, 2014 at 2:48 PM, Lists lists@benjamindsmith.com wrote:
Whatever we do, we need the ability to create a point-in-time history. We commonly use our archival dumps for audit, testing, and debugging purposes. I don't think PG + WAL provides this type of capability.
I think it does. You should be able to have a base dump plus some number of incremental logs that you can apply to get to a point in time. Might take longer than loading a single dump, though.
Depending on your data, you might be able to export it as tables in sorted order for snapshots that would diff nicely, but it is painful to develop things that break with changes in the data schema.
So at the moment we're down to:
A) run PG on a ZFS partition and snapshot ZFS.
B) Keep making dumps (as now) and use lots of disk space.
C) Cook something new and magical using diff, rdiff-backup, or related tools.
Disk space is cheap - and pg_dumps usually compress pretty well. But if you have time to experiment, I'd like to know how rdiff-backup or backuppc4 performs.
On Thu, Jul 03, 2014 at 12:48:34PM -0700, Lists wrote:
Whatever we do, we need the ability to create a point-in-time history. We commonly use our archival dumps for audit, testing, and debugging purposes. I don't think PG + WAL provides this type of capability. So at the moment we're down to:
You can recover using WAL files up to the point in time specified in the recovery file.
See, for example
http://opensourcedbms.com/dbms/how-to-do-point-in-time-recovery-with-postgre...
#recovery_target_time = '' # e.g. '2004-07-14 22:39:00 EST'
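i.e. a recovery.conf along these lines on the box you're restoring to (paths and timestamp are made up):

restore_command = 'cp /mnt/backup/wal/%f "%p"'
recovery_target_time = '2014-07-01 03:00:00 EST'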
--On Thursday, July 03, 2014 04:47:30 PM -0400 Stephen Harris lists@spuddy.org wrote:
You can recover using WAL files up to the point in time specified in the recovery file.
See, for example
http://opensourcedbms.com/dbms/how-to-do-point-in-time-recovery-with-postgresql-9-2-pitr-3/
I have to back up Stephen on this one:
1. The most efficient way to get minimal diffs is generally to get the program that understands the semantics of the data to make the diffs. In the DB world this is typically some type of baseline + log shipping. It comes in various flavours and names, but the concept is the same across the various enterprise-grade databases.
Just as algorithmic changes to an application to increase performance are always going to be much better than trying to tune OS-level parameters, doing "dedup" at the application level (where the capability exists) is always going to be more efficient than trying to do it at the OS level.
2. Recreating a point-in-time image for audits, testing, etc, then becomes the process of exercising your recovery/DR procedures (which is a very good side effect). Want to do an audit? Recover the db by starting with the baseline and rolling the log forward to the desired point.
3. Although rolling the log forward can take time, you can find a suitable tradeoff between recovery time and disk space by periodically taking a new baseline (weekly? monthly? It depends on your write load.) Then anything older than that baseline is only of interest for audit data/retention purposes, and no longer factors into the recovery/DR/test scenarios.
4. Using baseline + log shipping generally results in smaller storage requirements for offline / offsite backups. (Don't forget that you're not exercising your DR procedure unless you sometimes recover from your offsite backups, so maybe it would be good to have a policy that all audits are performed based on recovery from offsite media, only.)
5. With the above mechanisms in place, there's basically zero need for block- or file-based deduplication, so you can save yourself from that level of complexity / resource usage. You may decide that filesystem-level snapshots of the filesystem holding the log files still plays a part in your backup strategy, but that's separate from the dedup issue.
Echoing one of John's comments, I would be very surprised if doing dedup on database-type data would realize any significant benefits for common configurations/loads.
Devin
On 07/03/2014 09:48 PM, Lists wrote:
Whatever we do, we need the ability to create a point-in-time history. We commonly use our archival dumps for audit, testing, and debugging purposes. I don't think PG + WAL provides this type of capability. So at the moment we're down to:
A) run PG on a ZFS partition and snapshot ZFS.
B) Keep making dumps (as now) and use lots of disk space.
C) Cook something new and magical using diff, rdiff-backup, or related tools.
Check out 7z from the p7zip package. I use this command:
7za a -t7z $YearNum-$MonthNum.7z -i@include.lst -mx$CompressionMetod -mmt$ThreadNumber -mtc=on
for compressing a lot of similar files from sysinfo and kernel.log, files backed up every hour that do not change much. I found out that it reuses already existing blocks/hashes/whatever and, I guess, just references them with a pointer instead of storing them again.
So, 742 files that take up 179 MB uncompressed occupy only 452 KB compressed, which is only 0.2% of the original size, 442 TIMES smaller:
Listing archive: 2014-03.7z
--
Path = 2014-03.7z
Type = 7z
Method = LZMA
Solid = +
Blocks = 1
Physical Size = 426210
Headers Size = 9231

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2014-03-01 00:02:07 ....A       259517       416979  Silos-Srbobran-l5-kernlog-2014-03-01-00-02-07.txt
2014-03-01 01:01:52 ....A       259529               Silos-Srbobran-l5-kernlog-2014-03-01-01-01-52.txt
...............................................
2014-03-31 22:01:33 ....A       259502               Silos-Srbobran-l5-kernlog-2014-03-31-22-01-33.txt
2014-03-31 23:01:33 ....A       259485               Silos-Srbobran-l5-kernlog-2014-03-31-23-01-32.txt
------------------- ----- ------------ ------------  ------------------------
                               184553028      416979  742 files, 0 folders
Maybe, if you compress them into the same archive (afterwards?), you can get a similar result.
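For the pg_dump case the equivalent would be something like this (paths are made up); the -ms=on (solid archive) option is what lets near-identical files share compressed data:

7za a -t7z -mx=9 -ms=on pgdumps-$(date +%Y-%m).7z /var/backups/pgdump/*.sql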
Ljubomir Ljubojevic centos@plnet.rs writes:
7za a -t7z $YearNum-$MonthNum.7z -i@include.lst -mx$CompressionMetod -mmt$ThreadNumber -mtc=on
So, 742 files that take up 179 MB uncompressed occupy only 452 KB compressed, which is only 0.2% of the original size, 442 TIMES smaller:
Perhaps there is a file system that supports compression and would do a good job with the snapshots transparently. Maybe even ZFS or btrfs do?
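With ZFS it would be something like this (dataset names are made up):

zfs create tank/pgdumps
zfs set compression=lz4 tank/pgdumps     # or gzip-9 for a better ratio at more CPU cost
zfs get compressratio tank/pgdumps       # shows what you are actually saving

Note that this compresses each block on its own, so unlike a solid 7z archive it won't collapse the redundancy between two nearly identical dumps.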
On Thu, Jul 3, 2014 at 4:50 PM, Ljubomir Ljubojevic centos@plnet.rs wrote:
Check out 7z from the p7zip package. I use this command:
7za a -t7z $YearNum-$MonthNum.7z -i@include.lst -mx$CompressionMetod -mmt$ThreadNumber -mtc=on
It seems that 7zip uses LZMA for 7z archives (though it supports other compression types). Confirmed below.
for compressing a lot of similar files from sysinfo and kernel.log, files backed up every hour that do not change much. I found out that it reuses already existing blocks/hashes/whatever and, I guess, just references them with a pointer instead of storing them again.
So, 742 files that take up 179 MB uncompressed occupy only 452 KB compressed, which is only 0.2% of the original size, 442 TIMES smaller:
Listing archive: 2014-03.7z
--
Path = 2014-03.7z
Type = 7z
Method = LZMA
Exactly ... LZMA ... Grab the "xz" package for CentOS 6
There's also an --lzma option for tar.
I am inclined to use xz utils as opposed to 7zip since 7zip comes from a 3rd party repo.
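For example (paths are made up):

xz -9 -k mydb-2014-07-07.sql                                   # compress one dump, keep the original
tar -cJf pgdumps-2014-07.tar.xz /var/backups/pgdump/2014-07-*  # or a month of dumps in one go

Since tar -J compresses the whole stream, nearly identical dumps end up sharing the compression dictionary much like in a solid 7z archive, as long as they fit in xz's dictionary window.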
On 07/07/2014 02:35 PM, SilverTip257 wrote:
Exactly ... LZMA ... Grab the "xz" package for CentOS 6
There's also an --lzma option for tar.
I am inclined to use xz utils as opposed to 7zip since 7zip comes from a 3rd party repo.
xz needs to be checked for support of block optimization/reuse; I do not think that is part of the LZMA protocol (I might be wrong, of course). Also, a check needs to be made whether xz supports multithreading like p7zip does.
Btw., the p7zip package is in EPEL, so it should be safe enough, but I understand your concerns.
On 07.Jul.2014, at 14:53, Ljubomir Ljubojevic centos@plnet.rs wrote:
Also, a check needs to be made whether xz supports multithreading like p7zip does.
No, I think it does not. There is a threads option, but the manpage states:
... Multithreaded compression and decompression are not implemented yet, so this option has no effect for now. ...
On 07/07/2014 10:54 PM, Markus Falb wrote:
No, I think it does not. There is a threads option, but the manpage states:
... Multithreaded compression and decompression are not implemented yet, so this option has no effect for now. ...
Oh, that's a bummer. Thanks, now I do not have to waste time on experimenting.
On Mon, Jul 07, 2014 at 11:56:08PM +0200, Ljubomir Ljubojevic wrote:
Oh, that's a bummer. Thanks, now I do not have to waste time on experimenting.
Have you seen pigz? Similar to gzip, but multithreaded.
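For example, piping the dump straight through it (database name and path are made up):

pg_dump mydb | pigz -p 8 > /var/backups/pgdump/mydb-$(date +%F).sql.gz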
On 07/08/2014 12:48 AM, Fred Smith wrote:
Have you seen pigz? Similar to gzip, but multithreaded.
"will compress files in place, adding the suffix '.gz'." OP and I want to save a lot of space while compressing large number of almost similar files. p7zip does it for me:
On 07/03/2014 10:50 PM, Ljubomir Ljubojevic wrote:
So, 742 files that take up 179 MB uncompressed occupy only 452 KB compressed, which is only 0.2% of the original size, 442 TIMES smaller