On Friday, December 24, 2010 01:03 AM, Les Mikesell wrote:
On 12/23/2010 10:28 AM, Christopher Chan wrote:
Matt wrote:
Is ext4 stable on CentOS 5.5 64-bit? I have an email server with a great deal of disk I/O and was wondering if ext4 would be better than ext3 for it?
Before committing to ext4 on a production server, it would be good to consider the comments made in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45 which presumably still apply to current CentOS 5.5 64-bit kernels. As I read it, Ts'o argues that the apparent loss of stability compared to ext3 is a design issue in the realm of applications that run atop it. I hope this is not a misreading.
Waiting for applications to be properly written, i.e. to use fsync(), is no way to pick a file system. You'd have the same problems on xfs or any other file system that does delayed writes.
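For anyone wondering what "properly written" looks like in practice, here is a minimal sketch of the usual write-to-temp, fsync(), rename pattern, in Python for brevity. The function name durable_replace is made up for illustration, and the directory fsync at the end assumes a POSIX system such as Linux; treat it as a sketch, not as what any particular mail server actually does.

```python
import os
import tempfile


def durable_replace(path, data):
    """Replace `path` with `data` so that a crash leaves either the old
    or the new contents, never a zero-length file (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays
    # within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # userspace buffer -> kernel
            os.fsync(tmp.fileno())  # kernel -> stable storage
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Assumption: POSIX. fsync the directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the fsync() before the rename, a file system that delays allocation may commit the rename before the data blocks, which is the kind of data loss discussed in the Launchpad bug.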
But note that the reason applications don't use fsync() when they should is probably that Linux historically did not implement it in a reasonable way (i.e. it would flush the entire filesystem buffer and wait for completion instead of just the requested file's outstanding blocks). Not sure when/if that was fixed, but it is also probably behind the old impression that MySQL is faster than PostgreSQL.
Can we drop the fsync nonsense?
No, if you don't remember history you are doomed to repeat it.
Well, come to think of it, I guess most open source apps are developed on Linux, so its implementation does colour how devs think about fsync... never mind that it is done properly on the BSDs and UNIXes.
Apps that are data sensitive should be using fsync/fdatasync (fsync is a POSIX specification, so the history of how Linux implemented fsync has nothing to do with whether applications used it or not); otherwise they should not even be considered for the task. The lying fsync/fdatasync was fixed when write barrier support was introduced and filesystems were updated to use write barriers. As for flushing the entire buffer... IIRC, that is specific to ext3, and even that should now be gone with the update to write barrier support.
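To make the fsync/fdatasync distinction concrete, here is a rough sketch of appending one record to a log and syncing only that file's data. The helper name append_record is invented for this example, and the fallback to os.fsync() is an assumption for platforms where Python does not expose os.fdatasync().

```python
import os


def append_record(path, record):
    """Append `record` (bytes) to `path` and return only after the data
    has been handed to stable storage (hypothetical helper)."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        view = memoryview(record)
        # os.write() may write fewer bytes than asked, so loop until done.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # fdatasync() may skip metadata that is not needed to read the data
        # back (e.g. mtime); fall back to fsync() where it is unavailable.
        sync = getattr(os, "fdatasync", os.fsync)
        sync(fd)
    finally:
        os.close(fd)
```

Whether the data really is on the platters when the call returns still depends on the file system and the drive's write cache honouring the flush, which is the point about write barriers above.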
It's one of those 'have you stopped beating your wife' things. Apps that correctly used fsync were slow because of the OS implementation, so people used other apps. So now you have popular apps that do things wrong.
Yeah, funny how ext3 managed to become the dominant Linux filesystem when it was the one with the flush-everything quirk, and at a time when fsync did not really honour the 'yes, it is safely on the platters' maxim. Let's thank Red Hat for this mess.