[CentOS] LVM hatred, was Re: /boot on a separate partition?

Wed Jun 24 18:22:22 UTC 2015
Gordon Messmer <gordon.messmer at gmail.com>

On 06/23/2015 09:00 PM, Marko Vojinovic wrote:
> On Tue, 23 Jun 2015 19:08:24 -0700
> Gordon Messmer <gordon.messmer at gmail.com> wrote:
>> 1) LVM makes MBR and GPT systems more consistent with each other,
>> reducing the probability of a bug that affects only one.
>> 2) LVM also makes RAID and non-RAID systems more consistent with each
>> other, reducing the probability of a bug that affects only one.
> OTOH, it increases the probability of a bug that affects LVM itself.

No, it doesn't.  As Anaconda supports more types of disk and filesystem 
configuration, its complexity increases, which increases the probability 
that there are bugs.  The number of users is not affected by complexity 
growth, but the number of possible configurations grows.  
Therefore, the number of users of any given configuration is smaller, which 
means that there are fewer people testing the edge cases, and bugs that 
affect those edge cases are likely to last longer.

Consistency reduces the probability of bugs.
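
To put rough numbers on that: the sketch below just counts combinations.
The option lists and the user count are made up for illustration; they
are not anaconda's real support matrix.

```python
from itertools import product

# Hypothetical option lists, for illustration only.
PARTITION_TABLES = ["MBR", "GPT"]
RAID_LEVELS = ["none", "RAID1", "RAID10"]
VOLUME_LAYERS = ["plain partitions", "LVM"]


def count_configurations(*option_lists):
    """Return how many distinct combinations the option lists allow."""
    return len(list(product(*option_lists)))


def average_users_per_configuration(total_users, *option_lists):
    """Spread a fixed user base evenly across every combination."""
    return total_users / count_configurations(*option_lists)


if __name__ == "__main__":
    total_users = 12000  # made-up figure
    axes = [PARTITION_TABLES, RAID_LEVELS, VOLUME_LAYERS]
    # Add one axis of choice at a time and watch the testers thin out.
    for used in range(1, len(axes) + 1):
        configs = count_configurations(*axes[:used])
        testers = average_users_per_configuration(total_users, *axes[:used])
        print(configs, "configurations,", testers, "users each on average")
```

Every independent choice multiplies the number of configurations, while
the user base stays the same size.  Standardizing on one value for any
axis divides the count again.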

> But really, these arguments sound like a strawman. It reduces the
> probability of a bug that affects one of the setups --- I have a hard
> time imagining a real-world usecase where something like that can be
> even observable, let alone relevant.

Follow anaconda development if you need further proof.

>> 3) MBR has silly limits on the number of partitions, that don't
>> affect LVM.  Sure, GPT is better, but so long as both are supported,
>> the best solution is the one that works in both cases.
> That only makes sense if I need a lot of partitions on a system that
> doesn't support GPT.

You are looking at this from your own perspective, as one user.  I am 
looking at this from the perspective of the developers who manage 
anaconda, and ultimately have to support all of the users.

That is, you are considering an anecdote, and missing the bigger picture.

LVM is an inexpensive abstraction from the specifics of disk 
partitions.  It is more flexible than working without it.  It is 
consistent across MBR, GPT, and RAID volumes underlying the volume 
group, which typically means fewer bugs.

>> 4) There are lots of situations where you might want to expand a
>> disk/filesystem on a server or virtual machine.  Desktops might do so
>> less often, but there's no specific reason to put more engineering
>> effort into making the two different.  The best solution is the one
>> that works in both cases.
> What do you mean by engineering effort? When I'm setting up a data
> storage farm, I'll use LVM. When I'm setting up my laptop, I won't.
> What effort is there?

The effort on the part of the anaconda and dracut developers who have to 
test and support various disk configurations.  The more consistent 
systems are, the fewer bugs we hit.

> I just see it as an annoyance of having to
> customize my partition layout on the laptop, during the OS installation
> (customizing a storage farm setup is pretty mandatory either way, so
> it doesn't make a big difference).

In my case, I set up all of my systems with kickstart and they all have 
the same disk configuration except for RAID.  Every disk in every system 
has a 200MB partition, a 1G partition, and then a partition that fills 
the rest of the disk.  On laptops, that's the EFI partition, /boot, and 
a PV for LVM.  On a BIOS system, it's a bios_grub partition, /boot, and 
a PV for LVM.  On a server, the second and third partitions are RAID1 or 
RAID10 members of the sets that hold /boot and a PV for LVM. Because they all have 
exactly the same partition set, when I replace a disk in a server, a 
script sets up the partitions and adds them to the RAID sets.  With less 
opportunity for human error, my system is more reliable, it can be 
managed by less experienced members of my team, and management takes 
less time.
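
A minimal sketch of what such a replacement script might do, not my
actual script.  It assumes GPT on every disk, a healthy disk to copy
the layout from, and sdX-style device names where the partition number
is simply appended.  The function name and the array mapping are
hypothetical.  It only builds the command list, so it can be reviewed
before anything runs as root.

```python
def replacement_commands(healthy_disk, new_disk, raid_members):
    """Build the commands that clone a partition table and re-add RAID members.

    raid_members maps an md array to the partition number that belongs
    in it, e.g. {"/dev/md0": 2, "/dev/md1": 3}.
    """
    commands = [
        # Copy the partition table from the healthy disk onto the new disk.
        ["sgdisk", f"--replicate={new_disk}", healthy_disk],
        # Give the copy its own disk and partition GUIDs.
        ["sgdisk", "--randomize-guids", new_disk],
    ]
    for array, partition_number in sorted(raid_members.items()):
        # Assumes sdX naming; NVMe devices would need a "p" separator.
        member = f"{new_disk}{partition_number}"
        commands.append(["mdadm", "--manage", array, "--add", member])
    return commands


if __name__ == "__main__":
    plan = replacement_commands(
        "/dev/sda", "/dev/sdb", {"/dev/md0": 2, "/dev/md1": 3}
    )
    for command in plan:
        print(" ".join(command))
```

Because every disk has the same three partitions, the mapping from
array to partition number is the same on every server, which is what
makes a script like this safe to hand to someone else.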

When you manage hundreds of systems, you start to see the value of 
consistency.  And you can't get to the point of managing thousands 
without it.

>> 5) Snapshots are the only practical way to get consistent backups,
>> and you should be using them.
> That depends on what kind of data you're backing up. If you're backing
> up the whole filesystem, then I agree. But if you are backing up only
> certain critical data, I'd say that a targeted rsync can be waaay more
> efficient.

You can use a targeted rsync from data that's been snapshotted, so 
that's not a valid criticism.  And either way, if you aren't taking 
snapshots, you aren't guaranteed consistent data.  If you rsync a file 
that's actively being written, the destination file may be corrupt.  The 
only guarantee of consistent backups is to quiesce writes, take a 
snapshot, and back up from the snapshot volume.
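
As a rough sketch of that sequence, with a targeted rsync run against
the snapshot rather than the live volume.  The volume group, LV,
paths, snapshot size and mount point are all placeholders, and
quiescing writes is application-specific, so it is left to the caller.
The function only builds the command list.

```python
def snapshot_backup_commands(vg, lv, source_subdir, destination,
                             snapshot_size="1G",
                             mount_point="/mnt/backup-snap",
                             mount_options="ro"):
    """Build the commands for a snapshot-based backup of one directory.

    The caller is expected to quiesce application writes before running
    the first command, and to make sure mount_point exists.
    """
    snapshot = f"{lv}-backup-snap"
    origin_device = f"/dev/{vg}/{lv}"
    snapshot_device = f"/dev/{vg}/{snapshot}"
    return [
        # Take a snapshot of the origin volume.
        ["lvcreate", "--snapshot", "--size", snapshot_size,
         "--name", snapshot, origin_device],
        # Mount it read-only (an XFS snapshot would also need "nouuid").
        ["mount", "-o", mount_options, snapshot_device, mount_point],
        # Copy only the directory we care about, from the frozen view.
        ["rsync", "-a", f"{mount_point}/{source_subdir}/", destination],
        # Clean up so the snapshot does not fill and go invalid.
        ["umount", mount_point],
        ["lvremove", "-f", snapshot_device],
    ]


if __name__ == "__main__":
    for command in snapshot_backup_commands(
        "vg0", "home", "projects", "backup:/srv/backups/projects/"
    ):
        print(" ".join(command))
```

A real script would also run the last two commands when rsync fails,
so that a stale snapshot is never left behind.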

>> LVM has virtually zero cost, so there's no practical benefit to not
>> using it.
> If you need it. If you don't need it, there is no practical benefit of
> having it, either. It's just another potential point of failure, waiting
> to happen.

The *cost* is the same whether you need it or not.  The value changes, but 
the cost is the same.  Cost and value are different things.  LVM has 
virtually zero cost, so even if you think you don't need it, you don't 
lose anything by having it.

Hypothetically, as it is a software component, there could be a bug that 
affects it.  But that ignores the context in which it exists. LVM is the 
standard, default storage layer for Red Hat and derived systems.  It is 
the most tested configuration.  If you want something that's less likely 
to fail, it's the obvious choice.

>>> (2) It is fragile. If you have data on top of LVM spread over an
>>> array of disks, and one disk dies, the data on the whole array goes
>>> away.
>> That's true of every filesystem that doesn't use RAID or something
>> like it.  It's hardly a valid criticism of LVM.
> If you have a sequence of plain ext4 harddrives with several symlinks,
> and one drive dies, you can still read the data sitting on the other
> drives. With LVM, you cannot. It's as simple as that.
>
> In some cases it makes sense to maintain access to reduced amount of
> data, despite the fact that a chunk went missing. A webserver, for
> example, can keep serving the data that's still there on the healthy
> drives, and survive the failure of the faulty drive without downtime.
> OTOH, with LVM, once a single drive fails, the server loses access to
> all data, which then necessitates some downtime while switching to the
> backup, etc. LVM isn't always an optimal solution.

Unless it's the disk with your root filesystem that fails.

Your argument is bizarre.  If you are concerned with reliability, use 
RAID until you decide that btrfs or zfs is ready.