[CentOS] 40TB File System Recommendations

Thu Apr 14 14:37:15 UTC 2011
Christopher Chan <christopher.chan at bradbury.edu.hk>

On Thursday, April 14, 2011 08:55 PM, Simon Matter wrote:
>> On Thursday, April 14, 2011 09:04 AM, Ross Walker wrote:
>>> On Apr 13, 2011, at 7:26 PM, John Jasen<jjasen at realityfailure.org>
>>> wrote:
>>>
>>>> On 04/12/2011 08:19 PM, Christopher Chan wrote:
>>>>> On Tuesday, April 12, 2011 10:36 PM, John Jasen wrote:
>>>>>> On 04/12/2011 10:21 AM, Boris Epstein wrote:
>>>>>>> On Tue, Apr 12, 2011 at 3:36 AM, Alain Péan
>>>>>>> <alain.pean at lpp.polytechnique.fr
>>>>>>> <mailto:alain.pean at lpp.polytechnique.fr>>    wrote:
>>>>>>
>>>>>> <snipped: two recommendations for XFS>
>>>>>>
>>>>>> I would chime in with a dis-commendation for XFS. At my previous
>>>>>> employer, two cases involving XFS resulted in irrecoverable data
>>>>>> corruption. These were on RAID systems running from 4 to 20 TB.
>>>>>>
>>>>>>
>>>>>
>>>>> What were those circumstances? Crash? Power outage? What are the
>>>>> components of the RAID systems?
>>>>
>>>> One was a hardware raid over fibre channel, which silently corrupted
>>>> itself. System checked out fine, raid array checked out fine, xfs was
>>>> replaced with ext3, and the system ran without issue.
>>>>
>>>> Second was multiple hardware arrays over linux md raid0, also over
>>>> fibre channel. This was not-so-silent corruption: xfs would detect
>>>> it and lock the filesystem read-only before it, pardon the pun,
>>>> truly fscked itself. This happened two or three times before we
>>>> gave up, split up the raid, and went to ext3. Again, no issues.
>>>
>>> Every now and then I hear these XFS horror stories. They seem
>>> almost impossible to believe.
>>>
>>> Nothing breaks for absolutely no reason, and failure to pin down
>>> where the breakage was suggests there weren't adequately skilled
>>> technicians for the technology deployed.
>>>
>>> XFS, if run in a properly configured environment, will run flawlessly.
>>>
>>
>> HAHAHAHHHHHHHHAAAAAAAAAAAAAAAHAAAAAAAAAAAAAAAAAAAAA
>>
>> The XFS codebase is the biggest pile of mess in the Linux kernel, and
>> you expect it not to run into mysterious problems? Remember, XFS was
>> PORTED over to Linux. It is not 'native' to Linux.
>
> You're confusing me, I always thought Linux has been ported to XFS :)
>
> There were some issues with XFS and maybe there still are. But you
> cannot say there are no environments where it works very stably. I
> started using XFS back in the RH7.2 days and I can also tell some
> stories, but not all of them were XFS's fault. The only real problem
> was that Red Hat didn't choose XFS as their FS of choice, which meant
> that few resources were put into the XFS code and few people actually
> used it. That's the only thing where ext2/3/4 were better, IMHO.
>

Where did I say that there are no environments where it works very 
stably? I used XFS extensively for the mail queue filesystem when I was 
running mail server farms, and I only remember one or two incidents when 
the filesystem was marked read-only for no apparent reason - I never had 
the time to find out why - but a reboot fixed those. XFS performed 
better at the time, but was less reliable (yoohoo, hi Linux fake 
fsync/fdatasync) than ext3. So I personally have not had MAJOR problems 
with XFS, but you can bet I don't think it's 100% safe even in a 
properly configured environment. That does not mean I am saying one 
must always encounter issues with it, though.

Red Hat not choosing XFS is because the thing's code base is a quagmire 
and they had no developer familiar with it. Only SuSE supported it, 
because they could: they had XFS developers on their payroll, and those 
developers were kept busy, if you ask me.