[CentOS] VDO killed my server

Mon Sep 3 18:40:52 UTC 2018
david <david at daku.org>


I was impressed with the description of VDO (Virtual Data Optimizer) 
in the Red Hat documentation, so much so that I tried to use it.  The 
tutorials led me to a few commands.  I built a VDO device on top of 
two USB disks which I had combined into a Logical Volume, and I was ready to go.

In my test case, I had a file set of about 600 GB.  There was 5 TB of 
space across the two disks in the Logical Volume.  So, I thought, let's 
see if I can activate deduplication and compression, and see if VDO 
can hold two, or three, or four identical copies of that file set, at 
different points in the file system tree.

At first, all worked well with the first set.  It took 24 
hours to copy.  The second set took another 24 hours, and all seemed 
well.  As I was copying the third set, I started to observe some 
problems.  The computer was serving other functions (internal DHCPD, 
DNS, internal HTTPD), and these started to fail.  There were no 
obvious alerts or warnings from VDO, but the other functions of the 
system started to die.  The diagnostics from journalctl were vague 
(failure to create a file...), but when I went looking with 'df', all 
the file systems seemed to have enough room for everything.  Even the 
'top' program showed available space in the pools it reported.
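
As a rough cross-check of the figures above, here is a small Python sketch of the arithmetic.  It only uses the numbers quoted in this message (about 600 GB per set, 5 TB of space, 24 hours per copy); the decimal units and the helper names are assumptions made for illustration, and nothing in it measures or queries VDO itself.

```python
# Back-of-the-envelope figures from the numbers quoted in the post.
# Assumption: decimal units (1 GB = 10**9 bytes); the post does not say.
GB = 10**9
TB = 10**12

file_set_bytes = 600 * GB      # "about 600 GB"
pool_bytes = 5 * TB            # "5 TB of space"
copy_seconds = 24 * 60 * 60    # "24 hours" per copy


def throughput_mb_per_s(size_bytes, seconds):
    """Average copy rate in decimal megabytes per second."""
    return size_bytes / seconds / 10**6


def logical_bytes(copies):
    """Space the copies would need if nothing were deduplicated."""
    return copies * file_set_bytes


rate = throughput_mb_per_s(file_set_bytes, copy_seconds)
print(f"average copy rate: {rate:.1f} MB/s")

for n in (2, 3, 4):
    need = logical_bytes(n)
    print(f"{n} copies: {need / TB:.1f} TB without deduplication, "
          f"{need / pool_bytes:.0%} of the pool")
```

Under these assumptions, even four copies with no deduplication at all would take less than half of the 5 TB, which is consistent with 'df' showing room and suggests that raw disk space was probably not the resource that ran out.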

After hours of my internal clients complaining, I finally removed the 
'mount' in /etc/fstab that loaded the VDO system, killed the file 
copies, and rebooted.  The system then resumed normal healthy 
functions, but without the VDO files.

In my mind, there are a few points:

- If VDO is competing for a finite resource (Memory?), it probably 
should start posting warnings, and eventually rejecting new files 
when the pool is nearly full.  Or maybe, use a pool other than what 
the other services use so as to minimize the impact on them.
- The documentation talks about 'tuning', but if this resource is one 
of concern, please don't bury it in the footnotes to the appendix.
- Using VDO on top of LVM seems to be the logical way to use 
deduplication for a large file set, yet the use cases don't seem to 
cover this (unless I misread them).
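
The first point above (warn as a finite pool fills, then refuse new work near the limit) can be sketched as a simple threshold check.  This is only an illustration of the behaviour being asked for: the function name `pool_status` and the 80% / 95% thresholds are hypothetical, and it does not describe how VDO actually accounts for memory or disk.

```python
def pool_status(used_bytes, total_bytes, warn_at=0.80, reject_at=0.95):
    """Classify a finite pool as 'ok', 'warn', or 'reject'.

    Hypothetical thresholds: start warning at 80% full and
    refuse new writes at 95% full.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    fraction = used_bytes / total_bytes
    if fraction >= reject_at:
        return "reject"   # nearly full: refuse new files
    if fraction >= warn_at:
        return "warn"     # getting full: post a warning
    return "ok"


# Example: a pool of 1000 units at three fill levels.
for used in (500, 850, 990):
    print(used, pool_status(used, 1000))
```

A service that checked something like this before each write could log a warning well before other daemons on the same machine started to fail.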

I have reluctantly reverted to using ZFS for this function.

Have others had issues with VDO?

David Kurn
Linux amateur