On Dec 23, 2010, at 4:25 PM, Ross Walker <rswwalker at gmail.com> wrote:

> On Dec 23, 2010, at 11:21 AM, Christopher Chan <christopher.chan at bradbury.edu.hk> wrote:
>
>> On Thursday, December 23, 2010 11:08 PM, Ross Walker wrote:
>>> On Dec 23, 2010, at 2:12 AM, cpolish at surewest.net wrote:
>>>
>>>> Matt wrote:
>>>>> Is ext4 stable on CentOS 5.5 64bit? I have an email server with a great deal of disk i/o and was wondering if ext4 would be better than ext3 for it?
>>>>
>>>> Before committing to ext4 on a production server, it would be good to consider the comments made in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45 which presumably still apply to current CentOS 5.5 64-bit kernels. As I read it, Ts'o argues that the apparent loss of stability compared to ext3 is a design issue in the realm of applications that run atop it. I hope this is not a misreading.
>>>
>>> Waiting for applications to be properly written, i.e. to use fsync(), is no way to pick a file system. You'd have the same problems on xfs or any other file system that does delayed writes.
>>
>> Whoa, whoa. 1) Theodore was not pushing fsync; he was pushing fdatasync, plus switching from storing configuration in thousands of small files to keeping everything in one sqlite database or a similar single-file solution. 2) You bet applications that are data sensitive had better be properly written and make proper, efficient use of system calls such as fsync, fdatasync and whatever else there is. And 3) write barriers were introduced to ensure that fsync/fdatasync do not lie, unlike the previous behaviour where they returned before data was safely on media. In the case of email, you bet the entire toolchain had better do fsync. postmark from NetApp was completely laughable as a benchmark for mail delivery because it does not issue a single fsync call, whereas all credible MTA software (sendmail, postfix, qmail, exim) uses fsync/fdatasync where needed. Unless you want thousands of zeroed files in, say, the mail queue, you had better make sure that both the app and the filesystem do what they are supposed to do: the app uses fsync/fdatasync, and the filesystem supports write barriers if disk write caches are to be left on; otherwise disable the disk write caches and take a big performance hit.
>>
>> If a filesystem does not support write barriers (like JFS), you bet it is a concern to take note of with regard to your hardware (e.g. do you have hardware RAID with sufficient BBU cache?). Then there is the case of running on top of LVM, which I suspect does not have write barrier support backported to RHEL/CentOS 5.5.
>>
>>> It was only a side-effect of ext3's data=ordered mode that caused it to flush dirty pages every 5 seconds. If that's what you want, you can use sysctl to tune the vm to flush every 5 seconds, and that will cover all delayed-write file systems.
>>
>> More precisely, the journal is committed every 5 seconds no matter what the mode.
>>
>> For an email server I'd stick with ext3 + data=journal, with the journal either on some uber-fast, large external BBU NVRAM block device (you can get up to 1TB at 750MiB/s+ if you have a fat enough bus) or on hardware RAID with sufficient BBU cache. Or anything with barrier support through the entire chain (read: no LVM).
>
> I believe barrier support is being deprecated in current kernels. Don't remember what they are replacing them with; straight FUA maybe.
> Barriers never performed well enough.
>
> If you have BBU write cache there is no need to worry about barriers.

Barrier deprecation link:

http://www.linux-archive.org/device-mapper-development/412783-block-deprecate-barrier-replace-blk_queue_ordered-blk_queue_flush.html

-Ross
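
To make the fsync point above concrete, here is a minimal sketch in C of the write / fdatasync / rename / fsync-the-directory pattern an MTA typically follows when committing a message to its queue. The queue_message() helper and the file names are made up for illustration; this is not taken from any of the MTAs named above.

/* Minimal sketch of the queue-commit pattern discussed above.
 * queue_message() and the file names are hypothetical. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int queue_message(const char *dir, const char *name, const char *data, size_t len)
{
    char tmp[4096], dst[4096];
    snprintf(tmp, sizeof(tmp), "%s/tmp.%s", dir, name);
    snprintf(dst, sizeof(dst), "%s/%s", dir, name);

    int fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len || /* message body */
        fdatasync(fd) != 0) {                   /* force the data to media */
        close(fd);
        unlink(tmp);
        return -1;
    }
    close(fd);

    if (rename(tmp, dst) != 0) {                /* atomically publish the file */
        unlink(tmp);
        return -1;
    }

    int dfd = open(dir, O_RDONLY | O_DIRECTORY); /* make the rename durable too */
    if (dfd < 0)
        return -1;
    int rc = fsync(dfd);
    close(dfd);
    return rc;
}

ext3's data=ordered mode happened to make a missing fdatasync mostly harmless; on a delayed-allocation filesystem like ext4 or XFS it is the application's job again, which is the point of the Launchpad comment linked above.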
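On the "use sysctl to tune vm to flush every 5 seconds" point, the relevant knobs would be something like the following in /etc/sysctl.conf (values are in centiseconds, so 500 = 5 seconds; the expire value defaults to a much larger number):

vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 500

That only narrows the window for unsynced data, though; it is not a substitute for fsync/fdatasync in the application.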
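And for the ext3 + data=journal + external journal suggestion, the setup is roughly the following. Device names and mount point are placeholders, and whether barrier=1 buys you anything depends on the write-cache/BBU situation discussed above:

mke2fs -O journal_dev /dev/nvram0            # format the BBU NVRAM device as an external journal
mke2fs -j -J device=/dev/nvram0 /dev/sdb1    # create ext3 with its journal on that device (same block size)
mount -o data=journal,barrier=1 /dev/sdb1 /var/spool/mail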