[CentOS] XFS and LVM2 (possibly in the scenario of snapshots)

Thu Dec 10 06:54:13 UTC 2009

thus Ross Walker spake:
> On Dec 9, 2009, at 10:39 AM, Timo Schoeler  
> <timo.schoeler at riscworks.net> wrote:
> 
>> thus Ross Walker spake:
>>> On Dec 9, 2009, at 8:05 AM, Timo Schoeler
>>> <timo.schoeler at riscworks.net> wrote:
>>>
>>>> Hi list,
>>>>
>>>> during the last days there was a discussion going on about the
>>>> stability of XFS; though I myself used XFS heavily and didn't run
>>>> into issues yet, I'd like to ask something *before* we create our
>>>> next generation data storage backend...
>>>>
>>>> Les Mikesell wrote in [0] about issues in the combination of XFS and
>>>> LVM -- however, it was being discussed in context of using 32bit
>>>> kernels.
>>>>
>>>> What I specifically need is to run XFS (or something similar, I am
>>>> *not* forced to use XFS, but it has been my preference for some
>>>> years now, and I haven't had any issues with it yet) on top of LVM
>>>> to be able to create snapshots. We're talking about several file
>>>> systems of about 4 TiB each.
>>>>
>>>> Elsewhere [1] I read that there were issues with that.
>>>>
>>>> Can anyone shed some light on this? It would be much appreciated.
>>> There is no problem if it is done on x86_64 with its 8k kernel
>>> stacks, but on i386 with its 4k kernel stacks you could run into a
>>> stack overflow when doing it on top of stackable block devices (md
>>> raid, lvm, drbd, etc).
>>>
>>> Also since the current LVM on CentOS doesn't support barriers (next
>>> release I believe) journalling isn't safe on LVM unless you are using
>>> a storage controller with BBU write-back cache.
>>>
>>> I have heard anyway that the current implementation of barriers
>>> isn't very performant and doesn't take into consideration controllers
>>> with BBU cache, so most people will end up mounting with nobarrier,
>>> which just means they are in the same boat as they are now. Better
>>> make sure your machine is bulletproof, as a power outage or a kernel
>>> panic can spell disaster for XFS (or any other file system really).
>>>
>>> It is better to invest in a good hardware RAID controller until the
>>> whole barriers stuff is ironed out. It should really perform better
>>> than it does.
>> Thanks for your detailed explanation, that really clears things up;
>> however, I was intending to build a software RAID10, as we have had
>> rather bad experiences with hw RAID controllers in the past (all
>> kinds of phenomena).
>>
>> Would barriers still be a problem here then?
> 
> So long as LVM isn't involved it will use barriers, but I can tell you
> you will be less than impressed by the performance.
> 
> Go for a good hardware RAID solution with BBU write cache; look to
> spend $350-$700 and get one that supports SAS and SATA. I like the
> LSI MegaRAID cards with 512MB of battery-backed cache.
> 
> Some cards allow you to run in JBOD mode with battery-backed write-back
> cache enabled, so if you really want software RAID you can run it and
> still have fast, safe performance (though you spread the cache a
> little thin across that many logical units).

Thanks for your email, Ross. So, reading all the stuff here, I'm really
concerned about moving all our data to such a system. The reason we're
moving is mainly, but not only, the longish fsck that UFS (FreeBSD) needs
after a crash. XFS seemed to fit perfectly, as I never had issues with
fsck on it. However, this discussion is changing my mind. So, what would
be an alternative (if possible without hardware RAID controllers, as
already mentioned)? ext3 is not one, as we have long fsck runs there,
too. Even ext4 doesn't seem too good in this area...

> -Ross

Timo
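
For anyone wanting to check where a box stands on the barrier question
discussed above, here is a minimal sketch in Python. It assumes the
usual /proc/mounts layout (device, mount point, fs type, options) and
only looks for the explicit "nobarrier" and "barrier=0" mount options.
The names find_unbarriered and report are made up for this example.
Note that it cannot tell whether barriers are silently dropped by a
layer below the file system (as described for LVM above), so an empty
result is not proof that journalling is safe.

```python
def find_unbarriered(mounts_text, fs_types=("xfs", "ext3", "ext4")):
    """Return (device, mountpoint, fstype) tuples for file systems whose
    mount options explicitly disable write barriers.

    mounts_text is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    disabled = {"nobarrier", "barrier=0"}
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        # Options are a comma-separated list, e.g. "rw,noatime,nobarrier".
        if fstype in fs_types and disabled & set(options.split(",")):
            hits.append((device, mountpoint, fstype))
    return hits


def report(path="/proc/mounts"):
    """Print every file system found with barriers explicitly disabled."""
    with open(path) as f:
        for device, mountpoint, fstype in find_unbarriered(f.read()):
            print(f"{device} on {mountpoint} ({fstype}): barriers disabled")
```

Calling report() on a Linux machine prints one line per affected mount;
find_unbarriered() can also be fed the contents of a saved mounts file.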