[CentOS] Ext4 on CentOS 5.5 64bit

Thu Dec 23 21:25:16 UTC 2010
Ross Walker <rswwalker at gmail.com>

On Dec 23, 2010, at 11:21 AM, Christopher Chan <christopher.chan at bradbury.edu.hk> wrote:

> On Thursday, December 23, 2010 11:08 PM, Ross Walker wrote:
>> On Dec 23, 2010, at 2:12 AM, cpolish at surewest.net wrote:
>> 
>>> Matt wrote:
>>>> Is ext4 stable on CentOS 5.5 64bit?  I have an email server with a
>>>> great deal of disk I/O and was wondering if ext4 would be better than
>>>> ext3 for it?
>>> 
>>> Before committing to ext4 on a production server, it
>>> would be good to consider the comments made in
>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
>>> which presumably still apply to current CentOS 5.5 64-bit kernels.
>>> As I read it, Ts'o argues that the apparent loss of stability
>>> compared to ext3 is a design issue in the realm of applications
>>> that run atop it. I hope this is not a misreading.
>> 
>> Waiting for applications to be properly written, i.e. to use fsync(), is no way to pick a file system. You'd have the same problems on XFS or any other file system that does delayed writes.
> 
> Whoa, whoa. 1) Theodore was not pushing fsync; he was pushing fdatasync
> and switching from storing configuration in thousands of small files to
> keeping everything in one SQLite database or a similar single-file
> solution. 2) You bet applications that are data sensitive had better be
> properly written and make proper and efficient use of system calls such
> as fsync, fdatasync and whatever else there is. 3) Write barriers were
> introduced to ensure that fsync/fdatasync do not lie, unlike the
> previous behaviour where they could return before data was safely
> written to media. In the case of email, you bet the entire toolchain
> had better do fsync. PostMark from NetApp as a benchmark for mail
> delivery was completely laughable because it does not make a single
> fsync call, whereas all credible MTA software (sendmail, postfix,
> qmail, exim) uses fsync/fdatasync where needed. Unless you want
> thousands of zeroed files in, say, the mail queue, you had better make
> sure that both the app and the filesystem do what they are supposed to
> do: the app must use fsync/fdatasync, and the filesystem must support
> write barriers if disk write caches are left on. Otherwise, disable
> the disk write caches and take a big performance hit.
> 
> If a filesystem does not support write barriers (like JFS) you bet it is 
> a concern to take note of with regard to your hardware (eg: do you have 
> hardware raid with sufficient BBU cache?). Then there is the case of 
> running on top of LVM which I suspect does not have write barrier 
> support backported to RHEL/CentOS 5.5.
> 
> 
>> 
>> It was only a side effect of ext3's data=ordered mode that caused it to flush dirty pages every 5 seconds. If that's what you want, you can use sysctl to tune the VM to flush every 5 seconds, and that will cover all delayed-write file systems.
>> 
> 
> More precisely, the journal is committed every 5 seconds no matter what 
> the mode.
> 
> I'd stick with ext3 + data=journal with the journal either on some uber 
> fast and large external BBU nvram block device (you can get up to 1TB 
> with speeds of 750MiB/sec+ if you have a fat enough bus) or on hardware 
> raid with sufficient BBU cache for an email server. Or anything with 
> barrier support through the entire chain (read: no LVM).

I believe barrier support is being deprecated in current kernels. I don't remember what it is being replaced with; straight FUA, maybe. Barriers never performed well enough.

If you have BBU write cache there is no need to worry about barriers.

-Ross
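
For reference, the application-side discipline argued over in the quoted text (fsync the data before making the file visible) can be sketched roughly as below. This is only an illustration in Python, not code from any of the MTAs mentioned; the function name durable_write and the ".tmp-" prefix are made up for the example. It assumes a Linux/POSIX system where a directory can be opened and fsync'd, and it still depends on the storage stack honouring the flush (barriers, or a battery-backed write cache).

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to `path` so a crash leaves the old or the new file,
    not a zero-length one. Hypothetical helper for illustration only."""
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temporary file in the same directory so the final
    # rename stays on one filesystem and is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to push it to the device
        # Only after the data is flushed do we expose it under its real name.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # fsync the directory so the rename itself is recorded on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Something like durable_write("/var/spool/example/msg", b"...") would be the call site; the path is a placeholder. Where only the file contents matter and not metadata such as timestamps, os.fdatasync can stand in for os.fsync on Linux.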