[CentOS] Deduplication data for CentOS?

Tue Aug 28 08:14:16 UTC 2012

On 27.08.2012 22:55, Adam Tauno Williams wrote:
> On Mon, 2012-08-27 at 14:32 -0400, Brian Mathis wrote:
>> On Mon, Aug 27, 2012 at 7:55 AM, Rainer Traut <tr.ml at gmx.de> wrote:
>>> We have looked into lessfs, sdfs and ddar.
>>> Are these filesystems ready to use (on CentOS)?
>>> ddar is something different, I know.
>> This is something I have been thinking about peripherally for a while
>> now.  What are your impressions of SDFS (OpenDedupe)?  I had been
>> hoping it would be pretty good.  Any issues with it on CentOS?
>
> I've used it for backups; it works reliably.  It is memory hungry
> however [sort of the nature of block-level deduplication].
> <http://www.wmmi.net/documents/OpenDedup.pdf>

I have read the PDF and one thing strikes me:
--io-chunk-size <SIZE in kB; use 4 for VMDKs, defaults to 128>

and later:
Memory
  2GB allocation OK for:
    - 200GB at 4KB chunks
    - 6TB at 128KB chunks
...
32TB of data at 128KB requires 8GB of RAM. 1TB @ 4KB equals the same 8GB.

We are using ESXi 5 in a SAN environment, right now with a 2TB backup volume.
You are right, 16GB of RAM for that volume at 4KB chunks is still a lot...
And why a 4KB chunk size for VMDKs?
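
For reference, the 16GB figure can be reproduced from the slide numbers with a rough sketch. It assumes index memory grows linearly with the number of chunks, and it takes the per-chunk cost from the "1TB @ 4KB equals 8GB" line (about 32 bytes per chunk). The function name and constants are mine, not from SDFS. The 2GB figures on the same slide imply a somewhat higher per-chunk cost, so this is only an estimate.

```python
# Rough RAM estimate for a dedup chunk index, extrapolated from the slide figures.
# Assumption: memory scales linearly with the number of chunks.

KB = 1024
GB = 1024 ** 3
TB = 1024 ** 4

# Derived from the slide: 1TB at 4KB chunks needs 8GB of RAM.
BYTES_PER_CHUNK = (8 * GB) / ((1 * TB) / (4 * KB))  # 32 bytes per chunk


def estimate_ram_gb(volume_bytes, chunk_size_bytes):
    """Return the estimated index RAM in GB for a volume and chunk size."""
    chunks = volume_bytes / chunk_size_bytes
    return chunks * BYTES_PER_CHUNK / GB


print(estimate_ram_gb(2 * TB, 4 * KB))    # 2TB volume, 4KB chunks
print(estimate_ram_gb(2 * TB, 128 * KB))  # 2TB volume, 128KB chunks
```

Under these assumptions the same 2TB volume needs 32 times less index memory at the default 128KB chunk size than at 4KB, which is why the chunk size choice matters so much here.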