On 06/23/2015 08:10 PM, Marko Vojinovic wrote:
> Ok, you made me curious. Just how dramatic can it be? From where I'm sitting, a read/write to a disk takes the amount of time it takes, the hardware has a certain physical speed, regardless of the presence of LVM. What am I missing?
Well, there are best-case and worst-case scenarios. The best case for file-backed VMs is pre-allocated files. They take up more space and take a while to set up initially, but they skip block allocation and probably some fragmentation performance hits later.
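To make the pre-allocated vs. sparse distinction concrete, here is a minimal sketch, assuming a Linux host and Python 3. The file names and the 4 MiB size are made up for illustration; real disk images are far larger. It creates one file of each kind and compares how many bytes the filesystem has actually allocated for them.

```python
import os
import tempfile

SIZE = 4 * 1024 * 1024  # hypothetical 4 MiB stand-in for a guest disk image


def make_sparse(path, size):
    """Set the apparent size without writing any data blocks."""
    with open(path, "wb") as f:
        f.truncate(size)


def make_preallocated(path, size):
    """Reserve all blocks up front, as a pre-allocated image would."""
    with open(path, "wb") as f:
        try:
            os.posix_fallocate(f.fileno(), 0, size)
        except (AttributeError, OSError):
            # Fallback if fallocate is unavailable here: write zeros explicitly.
            f.write(b"\0" * size)
            f.flush()
            os.fsync(f.fileno())


def allocated_bytes(path):
    """Bytes allocated on disk; st_blocks counts 512-byte units."""
    return os.stat(path).st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    sparse_path = os.path.join(tmp, "sparse.img")
    prealloc_path = os.path.join(tmp, "prealloc.img")
    make_sparse(sparse_path, SIZE)
    make_preallocated(prealloc_path, SIZE)

    sparse_size = os.path.getsize(sparse_path)
    prealloc_size = os.path.getsize(prealloc_path)
    sparse_alloc = allocated_bytes(sparse_path)
    prealloc_alloc = allocated_bytes(prealloc_path)

print(f"sparse:       apparent={sparse_size} allocated={sparse_alloc}")
print(f"preallocated: apparent={prealloc_size} allocated={prealloc_alloc}")
```

Both files report the same apparent size, but the sparse one typically has few or no blocks allocated, so the host still has to allocate them later, when the guest first writes to them.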
The worst case, though, is sparse files. In such a setup, when you write a new file in a guest, the guest kernel writes the metadata to the journal, then writes the file's data block, then flushes the journal to the filesystem. Every one of those writes goes through the host filesystem layer, often allocating new blocks, which in turn goes through the host's filesystem journal. If each of those three writes hits a block not previously used, then the host may do three writes for each of them. In that case, one write() in an application in a VM becomes nine disk writes on the VM host.
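The arithmetic behind that worst case can be sketched as below. This is only a counting model, not a measurement. The host-side step names are an assumption that mirrors the guest's three steps, and the "one host write per guest write" figure for already-allocated blocks is a simplification.

```python
# Guest-side writes for one new file, as described above.
GUEST_WRITES = ["journal metadata", "data block", "journal flush"]

# Assumed host-side writes when a guest write lands on a block the host
# has not allocated yet (hypothetical names mirroring the guest's steps).
HOST_WRITES_NEW_BLOCK = ["host journal metadata", "host data block", "host journal flush"]


def host_disk_writes(guest_writes, blocks_preallocated):
    """Count host disk writes caused by one guest-side file write."""
    if blocks_preallocated:
        # Simplification: no host allocation, so one host write per guest write.
        return len(guest_writes)
    # Worst case: every guest write triggers a full host allocation sequence.
    return len(guest_writes) * len(HOST_WRITES_NEW_BLOCK)


worst_case = host_disk_writes(GUEST_WRITES, blocks_preallocated=False)
best_case = host_disk_writes(GUEST_WRITES, blocks_preallocated=True)
print(f"sparse file, all new blocks: {worst_case} host writes")
print(f"pre-allocated blocks:        {best_case} host writes")
```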
The first time I benchmarked a sparse-file-backed guest vs an LVM-backed guest, bonnie++ measured block write bandwidth at about 12.5% (1/8) of native disk write performance.
Yesterday I moved a bunch of VMs from a file-backed virt server (set up by someone else) to one that used logical volumes. Block write speed on the old server, measured with bonnie++, was about 21.6MB/s in the guest and about 39MB/s on the host. So, less bad than a few years prior, but still bad. (And yes, all of those numbers are bad. It's a 3ware controller, what do you expect?)
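To put "less bad" in numbers, here is a quick calculation using only the figures quoted above. The variable names are mine.

```python
old_guest_mb_s = 21.6    # bonnie++ block write in the file-backed guest
old_host_mb_s = 39.0     # bonnie++ block write on that guest's host
earlier_ratio = 1 / 8    # the first benchmark, a few years prior

ratio = old_guest_mb_s / old_host_mb_s
print(f"old server: guest reaches {ratio:.1%} of host block write speed")
print(f"first benchmark: guest reached {earlier_ratio:.1%} of native speed")
```

So the file-backed guest got a bit over half of its host's write bandwidth, against one eighth in the earlier test.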
LVM-backed guests measure very nearly the same as bare metal. After migration, bonnie++ reports about 180MB/s block write speed.
> For concreteness, let's say I have a guest machine, with a dedicated physical partition for it, on a single drive. Or, I have the same thing, only the dedicated partition is inside LVM. Why is there a performance difference, and how dramatic is it?
Well, I said that there's a big performance hit to file-backed guests, not partition-backed guests. You should see exactly the same disk performance on partition-backed guests as on LV-backed guests.
However, partitions have other penalties relative to LVM.
1) If you have a system with a single disk, you have to reboot to add partitions for new guests. Linux won't refresh the partition table on the disk it boots from.
2) If you have two disks, you can allocate new partitions on the second disk without a reboot. However, your partition has to be contiguous, which may be a problem, especially over time if you allocate VMs of different sizes.
3) If you want redundancy, partitions on top of RAID are more complex than LVM on top of RAID. As far as I know, partitions on top of RAID are subject to the same limitation as in #1.
4) As far as I know, Anaconda can't set up a logical volume that's a redundant type, so LVM on top of RAID is the only practical way to support redundant storage of your host filesystems.
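The contiguity problem in point 2 can be illustrated with a toy model. It assumes a made-up 100-unit disk and hypothetical guest names and sizes, and it is not how any real partitioning tool or LVM is implemented. It only shows why a request can fail when it needs one contiguous run, yet succeed when space can be assembled from scattered extents, which LVM is able to do.

```python
def largest_free_run(disk):
    """Length of the longest run of consecutive free units."""
    best = run = 0
    for owner in disk:
        run = run + 1 if owner is None else 0
        best = max(best, run)
    return best


def fits_as_partition(disk, size):
    """A partition needs one contiguous run of free units."""
    return largest_free_run(disk) >= size


def fits_as_extents(disk, size):
    """Extent-based allocation only needs enough free units in total."""
    return disk.count(None) >= size


# Hypothetical 100-unit disk filled by four guests of different sizes.
disk = ["vm_a"] * 20 + ["vm_b"] * 30 + ["vm_c"] * 20 + ["vm_d"] * 30

# Later, two of the guests are deleted, leaving two separate 20-unit holes.
disk = [None if owner in ("vm_a", "vm_c") else owner for owner in disk]

new_guest_size = 30
free_total = disk.count(None)
partition_ok = fits_as_partition(disk, new_guest_size)
extents_ok = fits_as_extents(disk, new_guest_size)

print(f"free units: {free_total}, largest contiguous run: {largest_free_run(disk)}")
print(f"30-unit guest as a partition: {partition_ok}")
print(f"30-unit guest from extents:   {extents_ok}")
```

There are 40 free units, but no single hole is large enough for a 30-unit partition.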
If you use LVM, you don't have to remember any oddball rules. You don't have to reboot to set up new VMs when you have one disk. You don't have to manage partition fragmentation. Every system, whether it's one disk or a RAID set, behaves the same way.