On Thursday, December 23, 2010 11:08 PM, Ross Walker wrote:
> On Dec 23, 2010, at 2:12 AM, cpolish at surewest.net wrote:
>
>> Matt wrote:
>>> Is ext4 stable on CentOS 5.5 64bit? I have an email server with a
>>> great deal of disk i/o and was wondering if ext4 would be better than
>>> ext3 for it?
>>
>> Before committing to ext4 on a production server, it
>> would be good to consider the comments made in
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
>> which presumably still apply to current CentOS 5.5 64-bit kernels.
>> As I read it, Ts'o argues that the apparent loss of stability
>> compared to ext3 is a design issue in the realm of applications
>> that run atop it. I hope this is not a misreading.
>
> Waiting for applications to be properly written, ie use fsync(), is no way to pick a file system. You'd have the same problems on xfs or any other file system that does delayed writes.

Whoa, whoa.

1) Theodore was not pushing fsync(); he was pushing fdatasync() and switching from storing configuration in thousands of small files to keeping everything in one SQLite database or a similar single-file solution.

2) You bet that applications which are data-sensitive had better be properly written, making correct and efficient use of system calls such as fsync(), fdatasync() and whatever else is available.

3) Write barriers were introduced to ensure that fsync()/fdatasync() do not lie, unlike the previous behaviour where they returned before data was safely on the media.

In the case of email, you bet the entire toolchain had better use fsync(). postmark from NetApp was completely laughable as a mail-delivery benchmark because it does not make a single fsync() call, whereas all credible MTA software (sendmail, postfix, qmail, exim) uses fsync()/fdatasync() where needed. Unless you want thousands of zeroed files in, say, the mail queue, you have to make sure that both the application and the filesystem do what they are supposed to do: the application must use fsync()/fdatasync(), and the filesystem must support write barriers if the disk write caches are to be left on; otherwise disable the disk write caches and take a big performance hit. If a filesystem does not support write barriers (like JFS), that is a concern to note with regard to your hardware (e.g. do you have hardware RAID with sufficient BBU cache?). Then there is the case of running on top of LVM, which I suspect does not have write barrier support backported to RHEL/CentOS 5.5.

> It was only a side-effect of ext3's journal=ordered that caused it to flush dirty pages every 5 seconds. If that's what you want then you can use sysctl to tune vm to flush every 5 seconds and that will cover all delayed write file systems.

More precisely, the journal is committed every 5 seconds no matter what the mode.

For an email server I'd stick with ext3 + data=journal, with the journal either on some uber-fast and large external BBU NVRAM block device (you can get up to 1 TB with speeds of 750 MiB/s+ if you have a fat enough bus) or on hardware RAID with sufficient BBU cache. Or anything with barrier support through the entire chain (read: no LVM).
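
Since "properly written" keeps coming up, here is a minimal sketch of the kind of write / sync / rename / sync-the-directory sequence a queue-file delivery needs to survive a crash on a delayed-write filesystem. The deliver() helper, paths and error handling are made up for illustration; this is not code lifted from any particular MTA, just the general pattern:

  /* Sketch of a crash-safe queue-file write. Without the fdatasync()
   * and the directory fsync(), a crash can leave a zero-length or
   * missing file in the queue on ext4/xfs. Illustrative only. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <unistd.h>

  int deliver(const char *dir, const char *tmpname, const char *finalname,
              const char *msg, size_t len)
  {
      char tmppath[4096], finalpath[4096];
      snprintf(tmppath, sizeof(tmppath), "%s/%s", dir, tmpname);
      snprintf(finalpath, sizeof(finalpath), "%s/%s", dir, finalname);

      /* 1. Write the message into a temporary file in the queue dir. */
      int fd = open(tmppath, O_WRONLY | O_CREAT | O_EXCL, 0600);
      if (fd < 0) return -1;
      if (write(fd, msg, len) != (ssize_t)len) { close(fd); return -1; }

      /* 2. Push the file data to stable storage before publishing it.
       *    With working write barriers this does not return until the
       *    disk cache has actually been flushed. */
      if (fdatasync(fd) < 0) { close(fd); return -1; }
      close(fd);

      /* 3. Atomically move the file to its final queue name. */
      if (rename(tmppath, finalpath) < 0) return -1;

      /* 4. fsync() the directory so the rename itself is durable. */
      int dfd = open(dir, O_RDONLY);
      if (dfd < 0) return -1;
      if (fsync(dfd) < 0) { close(dfd); return -1; }
      close(dfd);
      return 0;
  }

If either sync call is skipped, or the filesystem/LVM/controller stack silently drops the barrier underneath it, the call returns success while the data is still only in a volatile cache, which is exactly the "thousands of zeroed files in the mail queue" failure mode.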