Chris Adams linux at cmadams.net Wed Jun 24 19:06:19 UTC 2015
Btrfs may eventually obsolete a lot of uses of LVM, but that's down the road.
LVM is the emacs of storage. It'll be here forever.
Btrfs doesn't export (virtual) block devices like LVM can, so it can't be a backing store for, say, iSCSI. And at the moment it's also rather catatonic when it comes to VM images. This is mitigated if you set the +C (nocow) attribute at image create time (the file must be zero length for +C to take). But if you cp --reflink the image or snapshot the containing subvolume, COW starts happening again for writes to either copy: the first overwrite of a shared block gets COWed to a new location, and only subsequent overwrites of that newly written block are nocow. So you can quickly get into complicated states with VM images on Btrfs. I'm not sure what the long term plan is.
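If you're not creating the image fresh with qemu-img, you can do the same thing by hand; a rough sketch, with the file names just examples: create an empty file, set +C on it while it's still zero length, then fill it (or set +C on a directory so new files created in it inherit the attribute).

# touch vm.qcow2
# chattr +C vm.qcow2
# cat existing.qcow2 > vm.qcow2

The > redirect keeps the same inode, so the nocow attribute sticks while the data gets written in.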
This is how to get +C set at qcow2 create time (filename and size are just examples); it's only applicable when the qcow2 is on Btrfs.
# qemu-img create -f qcow2 -o nocow=on vm.qcow2 20G
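You can confirm the attribute actually took with lsattr; you should see a C in the flags.

# lsattr vm.qcow2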
But really, piles more testing is needed to better understand how Btrfs and VMs interact; what's going on across all these layers is quite complicated. Even though my VM images get monstrous numbers of fragments if I don't use +C, I haven't yet seen a big performance penalty as a result when both the host and guest are using Btrfs and the cache is set to unsafe. Now, you might say, that's crazy! It's called unsafe for a reason! Yes, but I've also viciously killed the VM while writes were happening, and at most I lose a bit of data that was in flight; the guest fs is not corrupt at all, not even any complaints on the remount. I've done limited testing killing the host while writes are happening, and there is more data loss there, probably due to delayed allocation, but again both the host and guest Btrfs are fine - no mount complaints at all. And you kind of hope the host isn't dying very often anyway...
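If anyone wants to poke at the same things, none of it is fancy; roughly (domain and file names are just examples):

# filefrag /var/lib/libvirt/images/vm.qcow2
# virsh destroy testguest
# virsh start testguest

filefrag reports the extent count, which is the fragmentation referred to above, and virsh destroy is a hard poweroff of the guest, the moral equivalent of pulling the plug mid-write.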
NTFS in qcow2 on Btrfs without +C, however? Anecdotes from the Btrfs list suggest this combination produces hundreds of thousands of fragments in short order, with serious performance penalties. But I haven't tested it. My guess is that something about NTFS journalling and flushing, combined with a suboptimal libvirt cache setting, is causing overly aggressive flushes to disk, with each flush ending up as a separate extent. Just a guess.
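If someone does want to chase that down, a first step is just checking what cache mode libvirt is actually handing qemu for that disk (domain name is an example); if nothing comes back, the default is in effect.

# virsh dumpxml ntfs-guest | grep cache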