[CentOS] Understanding VDO vs ZFS

Sun May 3 05:33:33 UTC 2020
Erick Perez - Quadrian Enterprises <eperez at quadrianweb.com>

Sorry, corrections:
For this test I created a 40GB LVM volume group from /dev/sdb and /dev/sdc,
then a 40GB LV,
then a 60GB VDO volume (for testing purposes).

vdostats --verbose /dev/mapper/vdoas | grep -B6 'saving percent'
Output from the just-created vdoas:

[root at localhost ~]# vdostats --verbose /dev/mapper/vdoas | grep -B6 'saving percent'
  physical blocks                     : 10483712
  logical blocks                      : 15728640
  1K-blocks                           : 41934848
  1K-blocks used                      : 4212024
  1K-blocks available                 : 37722824
  used percent                        : 10
  saving percent                      : 99
[root at localhost ~]#

FIRST copy: CentOS-7-x86_64-Minimal-2003.iso (1.1G) to vdoas, from a source
outside the VDO volume
[root at localhost ~]# vdostats --verbose /dev/mapper/vdoas | grep -B6 'saving percent'
  1K-blocks used                      : 4721348
  1K-blocks available                 : 37213500
  used percent                        : 11
  saving percent                      : 9

SECOND copy: CentOS-7-x86_64-Minimal-2003.iso (1.1G) to vdoas, from a source
outside the VDO volume
#cp /root/CentOS-7-x86_64-Minimal-2003.iso
  1K-blocks used                      : 5239012
  1K-blocks available                 : 36695836
  used percent                        : 12
  saving percent                      : 52

THIRD copy: CentOS-7-x86_64-Minimal-2003.iso (1.1G), from inside the VDO
volume to inside the VDO volume
  1K-blocks used                      : 5248060
  1K-blocks available                 : 36686788
  used percent                        : 12
  saving percent                      : 67
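
Taking the difference between consecutive "1K-blocks used" readings shows how
much physical space each copy added. A rough sketch using only the figures
above (the variable names are mine, for illustration):

```shell
# "1K-blocks used" after each step, in order: empty volume, first, second
# and third copy (values from the vdostats output in this post).
readings="4212024 4721348 5239012 5248060"

prev=""
deltas=""
for used in $readings; do
    if [ -n "$prev" ]; then
        # Physical growth caused by this step, in 1K blocks.
        deltas="$deltas $((used - prev))"
    fi
    prev=$used
done
deltas=${deltas# }   # drop the leading space
echo "growth per copy (1K blocks): $deltas"
```

The first two copies together add 1026988 1K blocks (about 1003MB), close to
one ISO, and the third adds almost nothing. Why the first copy alone shows
only about half an ISO is not explained here; one possibility (an assumption,
not something I verified) is that the stats were read before all written
data had been flushed to the device.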

Then I did this a total of 9 more times to have 10 ISOs copied. Total data
copied 10.6GB.

Do note this:
When using df, it will show the VDO logical size, in my case 60G.
When using vdostats, it will show the size of the LV, in my case 40G.
Remember that dedupe AND compression are enabled.
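
The two sizes can be cross-checked from the block counts in the first
vdostats output. A minimal sketch, assuming a 4 KiB VDO block size (not
stated in this post, but the reported figures are consistent with it):

```shell
# Block counts reported by vdostats --verbose on the fresh volume.
physical_blocks=10483712
logical_blocks=15728640
block_kib=4   # assumption: VDO block size of 4 KiB

physical_kib=$((physical_blocks * block_kib))
logical_kib=$((logical_blocks * block_kib))

# Convert a KiB count to GiB with one decimal place.
to_gib() { awk -v k="$1" 'BEGIN { printf "%.1f", k / 1024 / 1024 }'; }

echo "physical: $physical_kib KiB = $(to_gib "$physical_kib") GiB"
echo "logical:  $logical_kib KiB = $(to_gib "$logical_kib") GiB"
```

The physical result equals the "1K-blocks" line (41934848), which is the 40G
LV that vdostats reports; the logical result is the 60G that df sees.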

The df -hT output shows the logical space occupied by these ISO files as
seen by the filesystem on the VDO volume.
Since VDO manages a logical-to-physical block map, df sees the logical space
consumed according to the filesystem that resides on top of the VDO volume,
while vdostats --hu views the physical block device as managed by VDO.
Physically a single ISO image resides on the disk, but logically the
filesystem thinks there are 10 copies, occupying 10.6GB.
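
To compare the two views, the physical figure has to come from vdostats and
the logical one from df. A small sketch of pulling out the physical side; the
function name vdo_used_kib is hypothetical, and the sample text is the final
stats output from this post rather than a live device:

```shell
# Print the "1K-blocks used" value from vdostats --verbose text on stdin.
vdo_used_kib() {
    awk -F: '/1K-blocks used/ { gsub(/ /, "", $2); print $2; exit }'
}

# Sample taken from the final stats in this post. On a live system, pipe in
#   vdostats --verbose /dev/mapper/vdoas
# instead, and read the logical side with something like
#   df -k --output=used /mnt/vdomounts | tail -n 1
sample='  1K-blocks used                      : 5248212
  1K-blocks available                 : 36686636
  used percent                        : 12
  saving percent                      : 89'

physical_used=$(printf '%s\n' "$sample" | vdo_used_kib)
echo "physical 1K blocks used: $physical_used"
```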

So at the end I have 10 ISOs of 1086 1MB blocks each (10860 1MB blocks in
total) that yield these results:
  1K-blocks used                      : 5248212
  1K-blocks available                 : 36686636
  used percent                        : 12
  saving percent                      : 89

So at the end it is using 5248212 1K blocks. Subtracting the 4212024 1K
blocks used initially gives (5248212 - 4212024) = 1036188 1K blocks; divided
by 1024, that is about 1012MB of physical space for the 10 ISOs.
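
The same subtraction as a shell sketch, together with an effective ratio. The
logical figure is an assumption: it is estimated from the 10 x 1086 MB
listing above, not read from df, so filesystem overhead is ignored.

```shell
initial_used=4212024   # 1K-blocks used on the empty volume
final_used=5248212     # 1K-blocks used after 10 ISO copies
# Assumption: logical data is about 10 ISOs x 1086 MB (10^6 bytes), in KiB.
logical_kib=$((10 * 1086 * 1000000 / 1024))

delta_kib=$((final_used - initial_used))
delta_mib=$(awk -v d="$delta_kib" 'BEGIN { printf "%.1f", d / 1024 }')

# Ratio against the raw "used" figure (includes the blocks already in use on
# the empty volume) versus ratio against the growth caused by the copies.
raw_ratio=$(awk -v l="$logical_kib" -v p="$final_used" 'BEGIN { printf "%.1f", l / p }')
net_ratio=$(awk -v l="$logical_kib" -v p="$delta_kib" 'BEGIN { printf "%.1f", l / p }')

echo "physical growth: $delta_kib 1K blocks (about $delta_mib MB)"
echo "ratio vs raw used: $raw_ratio, ratio vs growth: $net_ratio"
```

Dividing by the raw "1K-blocks used" understates the savings, because the
4212024 blocks that were already in use on the empty volume get counted as if
they were file data.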

Hope this helps in understanding where the space goes.

BTW: The test system is stock CentOS Linux release 7.8.2003, with only "yum
install vdo kmod-kvdo" added.

History of commands:
[root at localhost vdomounts]# history
    2  pvcreate /dev/sdb
    3  pvcreate /dev/sdc
    8  vgcreate -v -A y vgvol01 /dev/sdb /dev/sdc
    9  vgdisplay
   13  lvcreate -l 100%FREE -n lvvdo01 vgvol01
   14   yum install vdo kmod-kvdo
   18  vdo create --name=vdoas --device=/dev/vgvol01/lvvdo01 --vdoLogicalSize=60G --writePolicy=async
   19  mkfs.xfs -K /dev/mapper/vdoas
   20  ls /mnt
   21  mkdir /mnt/vdomounts
   22  mount /dev/mapper/vdoas /mnt//vdomounts/
   26  vdostats --verbose /dev/mapper/vdoas | grep -B6 'saving percent'
   28  cp /root/CentOS-7-x86_64-Minimal-2003.iso /mnt/vdomounts/ -vvv
   29  vdostats --verbose /dev/mapper/vdoas | grep -B6 'saving percent'
   30  cp /root/CentOS-7-x86_64-Minimal-2003.iso
   31  vdostats --verbose /dev/mapper/vdoas | grep -B6 'saving percent'
   33  cd /mnt/vdomounts/
   35  cp CentOS-7-x86_64-Minimal-2003-version2.iso
   36  vdostats --verbose /dev/mapper/vdoas | grep -B6 'saving percent'
   37  df
   39  vdostats --hu
   40  ls -l --block-size=1MB /root/CentOS-7-x86_64-Minimal-2003.iso
   41  df -hT
   42  vdo status | grep Dedupl
   43  vdostats --hu
   44  vdostats
   48  cp CentOS-7-x86_64-Minimal-2003-version2.iso
   49  cp CentOS-7-x86_64-Minimal-2003-version2.iso
   50  cp CentOS-7-x86_64-Minimal-2003-version2.iso
   51  cp CentOS-7-x86_64-Minimal-2003-version2.iso
   52  cp CentOS-7-x86_64-Minimal-2003-version2.iso
   53  cp CentOS-7-x86_64-Minimal-2003-version2.iso
   54  df -hT
   55  ls -l --block-size=1MB
   56  vdostats --hu
   57  df -hT
   58  df
   59  vdostats --hu
   60  vdostats
   61  vdostats --verbose /dev/mapper/vdoas | grep -B6 'saving percent'
   62  cat /etc/centos-release
   63  history
[root at localhost vdomounts]#
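
For anyone who wants to repeat the test, the setup part of the history can be
condensed into one script. This is only a sketch: the run wrapper and the
vdo_setup function are hypothetical helpers, mkdir -p replaces the plain
mkdir, and the device names are the ones from my test machine. It only prints
the commands unless DRY_RUN=0 is set.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 only on a disposable test machine; these steps wipe the disks.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

vdo_setup() {
    # LVM: two PVs, one VG, one LV taking all free space.
    run pvcreate /dev/sdb
    run pvcreate /dev/sdc
    run vgcreate -v -A y vgvol01 /dev/sdb /dev/sdc
    run lvcreate -l 100%FREE -n lvvdo01 vgvol01
    run yum install vdo kmod-kvdo
    # VDO volume on top of the LV, with a 60G logical size.
    run vdo create --name=vdoas --device=/dev/vgvol01/lvvdo01 \
        --vdoLogicalSize=60G --writePolicy=async
    # -K skips the discard pass at mkfs time.
    run mkfs.xfs -K /dev/mapper/vdoas
    run mkdir -p /mnt/vdomounts
    run mount /dev/mapper/vdoas /mnt/vdomounts/
}

vdo_setup
```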

> My two cents:
> 1- Do you have an encrypted filesystem on top of VDO? If yes, you will see
> no benefit from dedupe.
> 2- Can you post the output of vdostats --verbose /dev/mapper/xxxxx (replace
> with your device)?
> You can do something like: vdostats --verbose /dev/mapper/xxxxxxxx | grep
> -B6 'saving percent'
> On Sat, May 2, 2020 at 9:54 PM david <david at daku.org> wrote:
>> Folks
>> I'm looking for a solution for backups because ZFS has failed on me
>> too many times.  In my environment, I have a large amount of data
>> (around 2 TB) that I periodically back up.  I keep the last 5
>> "snapshots".  I use rsync so that when I overwrite the oldest backup,
>> most of the data is already there and the backup completes quickly,
>> because only a small number of files have actually changed.
>> Because of this low change rate, I have used ZFS with its
>> deduplication feature to store the data.  I started with a CentOS 6
>> installation, and upgraded years ago to CentOS 7.  CentOS 8 is on my
>> agenda.  However, I've had several data-loss events with ZFS where,
>> because of a combination of errors and/or mistakes, the entire store
>> was lost.  I've also noticed that ZFS is maintained separately from
>> CentOS.  At this moment, the CentOS 8 update causes ZFS to
>> fail.  Looking for an alternative, I'm trying VDO.
>> In the VDO installation, I created a logical volume containing two
>> hard-drives, and defined VDO on top of that logical volume.  It
>> appears to be running, yet I find the deduplication numbers don't
>> pass the smell test.  I would expect that if the logical volume
>> contains three copies of essentially identical data, I should see
>> deduplication numbers close to 3.00, but instead I'm seeing numbers
>> like 1.15.  I compute the compression number as follows:
>>   Use df and extract the value for "1k blocks used" from the third column
>>   use vdostats --verbose and extract the number titled "1K-blocks used"
>> Divide the first by the second.
>> Can you provide any advice on my use of ZFS or VDO without telling me
>> that I should be doing backups differently?
>> Thanks
>> David
