[CentOS-devel] The status of our Vagrant images for Hyper-V

Thu Apr 6 13:56:45 UTC 2017
Laurentiu Pancescu <lpancescu at gmail.com>

Hi Niels,

On 06/04/17 14:57, Niels de Vos wrote:
> Maybe you can post-process the large .vhd file with "virt-sparsify"? I
> do not know if it would recognize the format, but it is worth to try it.
> Otherwise it should be possible to "punch holes" in the large .vhd file
> with a little program that calls fallocate(2) on zero-filled areas.

The .vhd files that qemu-img creates are already sparse: their real size
is 1.1GB (as seen by du; ls reports 40GB).  The problem occurs when they
are added to a .tar.gz file, because Python's tarfile module treats them
as regular files while reading them, so the holes end up in the archive
as literal zeros.  The resulting archive is still just 480MB, since gzip
compresses long runs of zeros quite well.  This archive has to be
extracted by Vagrant on Windows, so even if the file were still sparse
inside the archive, we would depend on Ruby for Windows to handle it
properly, and on Hyper-V as well.  I'll have to wait until at least
Monday, when Michael can hopefully test whether a locally generated
sparse file works properly on Windows.
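
To make the du/ls difference and the tarfile behaviour easier to
reproduce, here is a minimal sketch.  It is not our build script: the
file names and the helper functions are made up for illustration, the
"image" is just an empty file made of one big hole, and it assumes a
Unix-like system where os.stat() reports st_blocks.

```python
import os
import tarfile
import tempfile


def sizes(path):
    """Return (apparent, allocated) sizes in bytes, roughly ls vs. du."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on Linux (not on Windows).
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, apparent_size):
    """Create a file that is one big hole, a stand-in for a sparse .vhd."""
    with open(path, "wb") as f:
        f.truncate(apparent_size)


def pack(src, archive):
    """Add src to a .tar.gz the plain tarfile way (no sparse handling)."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname=os.path.basename(src))


def demo(apparent_size=16 * 1024 * 1024):
    """Pack a sparse file and report what ends up in the archive."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "disk.vhd")        # hypothetical name
        archive = os.path.join(tmp, "box.tar.gz")  # hypothetical name
        make_sparse_file(src, apparent_size)
        apparent, allocated = sizes(src)
        pack(src, archive)
        with tarfile.open(archive) as tar:
            member_size = tar.getmember("disk.vhd").size
        return {
            "apparent": apparent,        # what ls shows
            "allocated": allocated,      # what du shows (filesystem dependent)
            "member_size": member_size,  # size recorded in the tar header
            "archive_size": os.path.getsize(archive),
        }


if __name__ == "__main__":
    for key, value in demo().items():
        print(key, value)
```

The member is recorded with its full apparent size, so whatever extracts
the archive has to write all of those zeros back out unless it detects
them itself; the archive stays small only because gzip compresses them.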

Perhaps there are no issues with sparse files and Hyper-V, but we're the 
only ones generating such files.  The .vhdx images exported by Hyper-V 
are regular files (everyone else uses Packer's Hyper-V plugin), as are 
the .vhd files produced by VirtualBox: both the theoretical and the
real size are 1.1GB.  Only qemu-img produces huge sparse .vhd files,
although it produces regular 1.1GB .vmdk files (used by the VirtualBox 
variant of our Vagrant boxes).  Even worse, the .vhdx files produced by 
qemu-img from EL7 are huge non-sparse 41GB files.  This bug was
allegedly fixed upstream around January 2015, but maybe the fix hasn't
been backported yet.

Using VirtualBox for the .vhd conversion would be the least likely to 
generate surprises, but maybe using GNU tar to create the .ova archive 
for Vagrant (instead of Python's tarfile) will be enough.
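
For the GNU tar variant, a rough sketch of what the call could look like
from Python.  The --sparse option is GNU tar's own; the function names
and file names are hypothetical, and I'm assuming that the "tar" found
in PATH is GNU tar.  Whether Vagrant on Windows restores such members
correctly is exactly what still needs testing.

```python
import subprocess


def gnu_tar_sparse_command(archive, directory, member):
    """Build a GNU tar command line that stores holes as holes."""
    return [
        "tar",
        "--sparse",       # GNU tar: detect holes and store only the data
        "-czf", archive,  # create a gzip-compressed archive
        "-C", directory,  # directory that contains the image
        member,           # e.g. the .vhd file, relative to that directory
    ]


def pack_sparse(archive, directory, member):
    """Run the command; raises CalledProcessError if tar fails."""
    subprocess.run(gnu_tar_sparse_command(archive, directory, member),
                   check=True)


if __name__ == "__main__":
    # Hypothetical paths, only printed here to show the resulting command.
    print(" ".join(gnu_tar_sparse_command("/tmp/box.tar.gz", "/tmp/build",
                                          "disk.vhd")))
```

Passing an absolute path for the archive avoids having to think about
how it interacts with -C.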

Best regards,
Laurențiu