> I can accept faster in certain cases but if you say HUGELY faster, I would like to see some numbers.

Ok, first a specific case that actually came up at my work in the last week.....

We've got a middleware messaging application we developed that we'll call a 'republisher'. It receives a stream of 'event' messages from an upstream server and forwards them to a number of downstream servers that have registered ('subscribed') with it. It keeps a queue for each downstream subscriber, so if one is down, it will hold event messages for it. Not all servers want all 'topics', so we only send each server the specific event types it's interested in. The republisher writes the incoming event stream to a number of subscriber queue files, as well as a series of journal files to track all this state. There is a queue for each incoming 'topic', and its entries aren't cleared until the last subscriber on that topic has confirmed delivery. {Before someone screams my-favorite-MQ-server, let me just say: we HAD a commercial messaging system doing this, and our production operations staff is fed up with realtime problems that involve multinational vendor finger-pointing, so we have developed our own to replace it.}

On a typical dual-Xeon Linux server running CentOS 4.4, with a simple direct-connect disk, this republisher can easily handle 1000 messages/second using simple write(). However, when this process was busy humming away under the production workload of 60-80 messages/sec and the power was pulled or the server crashed (this has happened exactly once so far, at a Thailand manufacturing facility, due to operator error), it lost 2000+ manufacturing events that the downstream servers couldn't easily recover. That was data sitting in Linux's disk cache that hadn't yet been committed to disk.

So, the obvious solution is to call fsync() on the various files after each 'event' has been processed, to ensure each event is durably on disk. However, if this republisher does an fsync() after each event, it slows to like 50/second on a direct-connect disk. If it's run on a similar server with a RAID controller that has battery-protected writeback cache enabled, it can easily do 700-800/second. We need 100+/second, and prefer 200/second, to have margin for catch-up after data interruptions.

Now, everything I've described above is a rather unusual application... so let me present a far more common scenario: relational database management servers, like Oracle or PostgreSQL. When the RDBMS does a 'commit' at transaction END;, the server HAS to fsync its buffers to disk to maintain data integrity. With a writeback-cache disk controller, the controller can acknowledge the writes as soon as the data is in its cache, then write that data to disk at its leisure. With software RAID, the server has to wait until ALL drives of the RAID slice have seeked and completed the physical writes to the disk. In a write-intensive database, where most of the read data is already cached in memory, this is a HUGE performance hit.
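For concreteness, here's a minimal sketch (in C, assuming Linux/POSIX) of the write()-then-fsync() pattern I'm describing. The file name, function name, and payload are made up for illustration; this is not the actual republisher code:

```c
/*
 * Minimal sketch of the durable-write pattern: append one event record,
 * then fsync() so the data survives a power loss. Hypothetical names
 * throughout -- not from the real republisher.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Append one event record and force it to stable storage.
 * Returns 0 on success, -1 on error. */
static int write_event_durable(int event_fd, const char *record, size_t len)
{
    while (len > 0) {
        ssize_t n = write(event_fd, record, len);  /* handle short writes */
        if (n < 0) {
            if (errno == EINTR)
                continue;                          /* retry on signal */
            return -1;
        }
        record += n;
        len -= (size_t)n;
    }
    /* This is the call that costs the throughput: it blocks until the
     * drive (or the controller's battery-backed cache) has the data. */
    if (fsync(event_fd) < 0)
        return -1;
    return 0;
}

int main(void)
{
    int fd = open("events.journal", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    const char msg[] = "event: example payload\n";
    if (write_event_durable(fd, msg, sizeof msg - 1) < 0) {
        perror("write_event_durable");
        return 1;
    }
    close(fd);
    return 0;
}
```

On Linux, fdatasync() is sometimes a cheaper alternative for an append-only journal like this, since it skips flushing metadata (e.g. timestamps) that isn't needed to retrieve the data. Either way, the per-event flush is exactly what a battery-protected writeback cache makes cheap: the controller acknowledges the flush once the data is in its protected cache, instead of waiting for the platters.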