[CentOS] Ext4 on CentOS 5.5 64bit

Thu Dec 23 21:30:48 UTC 2010
Ross Walker <rswwalker at gmail.com>

On Dec 23, 2010, at 4:25 PM, Ross Walker <rswwalker at gmail.com> wrote:

> On Dec 23, 2010, at 11:21 AM, Christopher Chan <christopher.chan at bradbury.edu.hk> wrote:
> 
>> On Thursday, December 23, 2010 11:08 PM, Ross Walker wrote:
>>> On Dec 23, 2010, at 2:12 AM, cpolish at surewest.net wrote:
>>> 
>>>> Matt wrote:
>>>>> Is ext4 stable on CentOS 5.5 64bit?  I have an email server with a
>>>>> great deal of disk i/o and was wondering if ext4 would be better than
>>>>> ext3 for it?
>>>> 
>>>> Before committing to ext4 on a production server, it
>>>> would be good to consider the comments made in
>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
>>>> which presumably still apply to current CentOS 5.5 64-bit kernels.
>>>> As I read it, Ts'o argues that the apparent loss of stability
>>>> compared to ext3 is a design issue in the realm of applications
>>>> that run atop it. I hope this is not a misreading.
>>> 
>>> Waiting for applications to be properly written, i.e. to use fsync(), is no way to pick a file system. You'd have the same problems on XFS or any other file system that does delayed writes.
>> 
>> Whoa, whoa. 1) Theodore was not pushing fsync, he was pushing fdatasync
>> and switching from storing configuration in thousands of small files to
>> everything in one sqlite database or similar single-file solution; 2) you
>> bet applications that are data sensitive had better be properly written
>> and making proper and efficient use of system calls such as fsync,
>> fdatasync and whatever else there is; and 3) write barriers were
>> introduced to ensure that fsync/fdatasync do not lie, unlike the previous
>> behaviour where they return before data is safely written to media. In
>> the case of email, you bet the entire toolchain had better do fsync.
>> postmark from NetApp as a benchmark for mail delivery was completely
>> laughable because it does not use a single fsync call, whereas all
>> credible MTA software (sendmail, postfix, qmail, exim) uses
>> fsync/fdatasync where needed. Unless you want thousands of zero'd files
>> in, say, the mail queue, you had better make sure that both the app and
>> the filesystem do what they are supposed to do. That is: the app uses
>> fsync/fdatasync, and the filesystem supports write barriers if disk write
>> caches are to be left on; otherwise, disable disk write caches and take a
>> big performance hit.
>> 
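
For anyone following along, the "app does its part" half of this can be
sketched roughly as below. This is only an illustration of the usual
write-temp / fsync / rename / fsync-the-directory pattern, not code from any
of the MTAs named above. The function name durable_write is made up for the
example, and it assumes a POSIX system where a directory can be opened and
fsync'd. os.fdatasync() could be swapped in for os.fsync() on the data file
where metadata such as timestamps does not matter.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to `path` so a crash leaves either the old or the new file.

    Hypothetical helper for illustration; assumes POSIX semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temp file in the same directory so the rename stays on
    # one filesystem and is therefore atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to push it to the device
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # fsync the directory so the rename itself is on stable storage.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Whether the data really is on the platter after os.fsync() returns still
depends on the rest of the chain discussed here (barriers, or write caches
that are disabled or battery backed).
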
>> If a filesystem does not support write barriers (like JFS), you bet it is
>> a concern to take note of with regard to your hardware (e.g. do you have
>> hardware RAID with sufficient BBU cache?). Then there is the case of
>> running on top of LVM, which I suspect does not have write barrier
>> support backported to RHEL/CentOS 5.5.
>> 
>> 
>>> 
>>> It was only a side-effect of ext3's data=ordered that caused it to flush dirty pages every 5 seconds. If that's what you want then you can use sysctl to tune vm to flush every 5 seconds and that will cover all delayed write file systems.
>>> 
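
A rough sketch of what that sysctl tuning could look like, assuming the two
standard Linux knobs vm.dirty_expire_centisecs (how old dirty data must be
before it is eligible for writeback) and vm.dirty_writeback_centisecs (how
often the flusher threads wake up). The helper name writeback_sysctl_lines is
invented for the example; it only builds the lines one might put in
/etc/sysctl.conf and does not touch the running system.

```python
def writeback_sysctl_lines(seconds):
    """Return sysctl.conf lines asking the VM to write back dirty pages
    roughly every `seconds` seconds.

    Hypothetical helper for illustration; both knobs are in centiseconds.
    """
    centisecs = int(round(seconds * 100))
    if centisecs <= 0:
        raise ValueError("interval must be positive")
    return [
        # Dirty data older than this becomes eligible for writeback.
        "vm.dirty_expire_centisecs = %d" % centisecs,
        # The flusher threads wake up this often to look for such data.
        "vm.dirty_writeback_centisecs = %d" % centisecs,
    ]
```

Calling writeback_sysctl_lines(5) gives the two lines with a value of 500.
This only shortens the window in which data sits in the page cache; it is
not a substitute for fsync where an application needs a durability guarantee.
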
>> 
>> More precisely, the journal is committed every 5 seconds no matter what 
>> the mode.
>> 
>> I'd stick with ext3 + data=journal with the journal either on some uber 
>> fast and large external BBU nvram block device (you can get up to 1TB 
>> with speeds of 750MiB/sec+ if you have a fat enough bus) or on hardware 
>> raid with sufficient BBU cache for an email server. Or anything with 
>> barrier support through the entire chain (read: no LVM).
> 
> I believe barrier support is being deprecated in current kernels. I don't remember what it is being replaced with; straight FUA, maybe. Barriers never performed well enough.
> 
> If you have BBU write cache there is no need to worry about barriers.

Barrier deprecation link:

http://www.linux-archive.org/device-mapper-development/412783-block-deprecate-barrier-replace-blk_queue_ordered-blk_queue_flush.html

-Ross