On Mon, Dec 28, 2009 at 12:46:24PM -0600, Tom Bishop wrote:
> Thanks for the explanation, looks like I need to go read some more about barriers to truly understand what is going on...
(Please don't top post on these lists; thanks!)
As I understand it (but I could be wrong), the problem is with "out of order writes".
Typically with a journaled filesystem (like ext3) the system writes out a data block, then updates the metadata (allocation tables, etc.) to reflect this. The order is important: the data must get to disk before the metadata. Smart hardware, however, can reorder writes as an optimisation, so the metadata may reach the disk before the actual data blocks. If the system dies at that point, the result is potential data corruption (e.g. blocks allocated with garbage in them) rather than the potential data loss (e.g. blocks not allocated) you would get if it died with unwritten data still in the buffer.
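To make the difference between "loss" and "corruption" concrete, here is a toy sketch in Python. It models nothing about real hardware or about ext3; the function name and the "data"/"metadata" labels are made up purely for illustration. It just enumerates what is on disk if the system dies after 0, 1 or 2 of the two writes have completed, for each possible ordering.

```python
def outcome_after_crash(order, completed):
    """Classify the on-disk state if the system dies after `completed`
    of the writes in `order` have reached the disk.

    `order` is a tuple such as ("data", "metadata"); both names are
    hypothetical labels, not real kernel structures.
    """
    on_disk = set(order[:completed])
    if on_disk == {"metadata"}:
        # Metadata says the block is allocated, but the data never arrived.
        return "corruption: block allocated but contains garbage"
    if on_disk == {"data"}:
        # Data is on disk, but nothing points at it.
        return "loss: data written but block never allocated"
    if not on_disk:
        return "loss: nothing written"
    return "ok: data and metadata both on disk"


def demo():
    # Walk through both orderings and every possible crash point.
    for order in (("data", "metadata"), ("metadata", "data")):
        for completed in range(len(order) + 1):
            print(order, completed, outcome_after_crash(order, completed))


if __name__ == "__main__":
    demo()
```

Only the reordered case (metadata first) can produce the "corruption" outcome; with the intended order the worst case is loss.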
The workaround for this is "barriers": the system attempts to flush the buffer to disk to ensure the data block is written before the metadata. Now blocks are written in the right order, but performance is lower because of all the extra flushes.
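A rough user-space analogy of "flush the data before writing the metadata", sketched in Python. This is not how the kernel implements barriers; it only shows the same ordering idea using os.fsync(), which asks the OS to push a file's buffered data out to the device. The function name and the file names are hypothetical.

```python
import os
import tempfile


def ordered_write(directory, data, metadata):
    """Write `data`, force it out, and only then write `metadata`.

    The file names below are made up for this example.
    """
    data_path = os.path.join(directory, "data.bin")
    meta_path = os.path.join(directory, "metadata.txt")

    with open(data_path, "wb") as f:
        f.write(data)
        f.flush()              # empty Python's own buffer
        os.fsync(f.fileno())   # ask the OS to flush to the device

    # Only reached once the data flush above has returned.
    with open(meta_path, "wb") as f:
        f.write(metadata)
        f.flush()
        os.fsync(f.fileno())

    return data_path, meta_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(ordered_write(d, b"payload", b"length=7"))
```

Each fsync() makes the caller wait for the flush, which is the same trade-off as above: correct ordering at the cost of throughput.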
"Barriers" are not currently implemented in the RHEL kernel for many types of block device (including LVM devices).