I could find this out, but at the moment I don't have a copy of 5.4 running. Can anyone tell me whether ext3 barriers are on by default for a default CentOS 5.4 install? Thanks in advance...
On Mon, Dec 28, 2009 at 10:25:01AM -0600, Tom Bishop wrote:
I could find this out, but at the moment I don't have a copy of 5.4 running. Can anyone tell me whether ext3 barriers are on by default for a default CentOS 5.4 install? Thanks in advance...
Pretty sure they are off by default (and they were off on the box I just checked).
Ray
Thanks much. I've been reading about ext4 and performance issues; I've never had any issues with ext3 on my CentOS boxes... Has anyone else had any corruption issues when running ext3, e.g. when power is cut?
On Mon, Dec 28, 2009 at 10:48 AM, Ray Van Dolson rayvd@bludgeon.org wrote:
On Mon, Dec 28, 2009 at 10:25:01AM -0600, Tom Bishop wrote:
I could find this out, but at the moment I don't have a copy of 5.4 running. Can anyone tell me whether ext3 barriers are on by default for a default CentOS 5.4 install? Thanks in advance...
Pretty sure they are off by default (and they were off on the box I just checked).
Ray
On Mon, Dec 28, 2009 at 10:57:10AM -0600, Tom Bishop wrote:
Thanks much. I've been reading about ext4 and performance issues; I've never had any issues with ext3 on my CentOS boxes... Has anyone else had any corruption issues when running ext3, e.g. when power is cut?
Never had anything unrecoverable happen -- and I've run servers in some fairly harsh environments without decent UPSes, with a fair bit of IO load on them as well. :)
Not to say you shouldn't go for as much power redundancy as possible!
Ray
I'm using ext3 on my CentOS box, and so far so good; I haven't had any problems. Sometimes my server shuts down when power is cut, but CentOS still runs well, and no files are corrupted or anything after it starts again.
Ds.
-----Original Message-----
From: Tom Bishop <bishoptf@gmail.com>
Date: Mon, 28 Dec 2009 10:57:10
To: CentOS mailing list <centos@centos.org>
Subject: Re: [CentOS] CentOS 5.4 ext3 question...
Thanks, guys, for the responses. Can anyone explain what the hoopla is then about ext4, performance issues, and barriers being enabled? There was also some talk about that being a potential issue with ext3. I've tried to Google around but have not found a good explanation of what the issue is....
On Mon, Dec 28, 2009 at 11:03 AM, david@pnyet.web.id wrote:
I'm using ext3 on my CentOS box, and so far so good; I haven't had any problems. Sometimes my server shuts down when power is cut, but CentOS still runs well, and no files are corrupted or anything after it starts again.
Ds.
On Dec 28, 2009, at 12:07 PM, Tom Bishop bishoptf@gmail.com wrote:
On Mon, Dec 28, 2009 at 11:03 AM, david@pnyet.web.id wrote:

I'm using ext3 on my CentOS box, and so far so good; I haven't had any problems. Sometimes my server shuts down when power is cut, but CentOS still runs well, and no files are corrupted or anything after it starts again.

Thanks, guys, for the responses. Can anyone explain what the hoopla is then about ext4, performance issues, and barriers being enabled? There was also some talk about that being a potential issue with ext3. I've tried to Google around but have not found a good explanation of what the issue is....
Barriers expose the poor performance of cheap hard drives. They provide assurance that all the data leading up to the barrier, and the barrier IO itself, are committed to media. This means the barrier does a disk cache flush first and then, if the drive supports FUA (forced unit access, i.e. bypass the cache), issues the IO request with FUA; if the drive doesn't support FUA, it issues another cache flush. It's the double flush that causes the most impact on performance.
The typical fsync() call only assures data is flushed from memory, but makes no assurance that the drive itself has flushed it from its write cache to the platters, which is where the concern lies.
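To make that concrete, a typical user-space write-and-sync sequence looks roughly like this (a minimal sketch with made-up paths; even when fsync() returns success, the blocks may still be sitting in the drive's volatile write cache unless barriers/FUA come into play):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char buf[] = "important record\n";

    int fd = open("/var/tmp/data.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) { perror("open"); return 1; }

    if (write(fd, buf, strlen(buf)) < 0) { perror("write"); return 1; }

    /* Flush this file's dirty pages (and metadata) out of the page cache.
     * Without barriers, the drive may still be holding the blocks in its
     * own volatile write cache when this returns. */
    if (fsync(fd) < 0) { perror("fsync"); return 1; }
    close(fd);

    /* If the file was newly created, the directory entry also needs to be
     * made durable by fsync()ing the containing directory. */
    int dirfd = open("/var/tmp", O_RDONLY);
    if (dirfd >= 0) {
        fsync(dirfd);
        close(dirfd);
    }
    return 0;
}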
Currently in RHEL/CentOS, the LVM (device mapper) layer doesn't know how to propagate barriers to the underlying devices, so it filters them out; barriers are therefore only supported on whole drives or raw partitions. This is fixed in current upstream kernels but has yet to be backported to the RHEL kernels.
There are a couple of ways to avoid the barrier penalty. One is to have an NVRAM-backed write cache, either on the controller or as a separate pass-through device. The other is to use a separate log device on an SSD that has an NVRAM cache (newer ones have capacitor-backed cache), or on a standalone NVRAM drive.
-Ross
Ross Walker wrote:
The typical fsync() call only assures data is flushed from memory, but makes no assurance that the drive itself has flushed it from its write cache to the platters, which is where the concern lies.
Did Linux ever get a working fsync(), or does it still flush the entire filesystem buffer?
On Dec 28, 2009, at 1:41 PM, Les Mikesell lesmikesell@gmail.com wrote:
Did Linux ever get a working fsync(), or does it still flush the entire filesystem buffer?
Working, meaning reliable, or the ability to sync a memory range instead of the whole file system? There is sync_page_range() to sync only a range, but I think the biggest issue is whether it actually assures the data makes it to disk or just out of memory.
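For reference, sync_page_range() is the in-kernel helper; the closest thing from user space is the sync_file_range() syscall (newer kernels/glibc only). It just pushes the page-cache pages for a byte range and promises nothing about file metadata or the drive's own write cache. A rough sketch, assuming the glibc wrapper is available and using a made-up path:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/var/tmp/rangetest.dat", O_WRONLY | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    char buf[4096];
    memset(buf, 'x', sizeof(buf));
    if (pwrite(fd, buf, sizeof(buf), 0) < 0) { perror("pwrite"); return 1; }

    /* Write back just this 4 KiB range of dirty pages and wait for the
     * writeback to finish.  Unlike fsync(), this syncs no file metadata
     * and gives no guarantee the drive has flushed its own cache. */
    if (sync_file_range(fd, 0, sizeof(buf),
                        SYNC_FILE_RANGE_WAIT_BEFORE |
                        SYNC_FILE_RANGE_WRITE |
                        SYNC_FILE_RANGE_WAIT_AFTER) < 0)
        perror("sync_file_range");

    close(fd);
    return 0;
}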
-Ross
Ross Walker wrote:
Working, meaning reliable, or the ability to sync a memory range instead of the whole file system? There is sync_page_range() to sync only a range, but I think the biggest issue is whether it actually assures the data makes it to disk or just out of memory.
I mean the ability to sync only the buffer associated with the single file specified by the file descriptor in the argument - without waiting for a bunch of other unrelated and irrelevant data to sync along with it.
On Dec 28, 2009, at 11:34 PM, Les Mikesell lesmikesell@gmail.com wrote:
I mean the ability to sync only the buffer associated with the single file specified by the file descriptor in the argument - without waiting for a bunch of other unrelated and irrelevant data to sync along with it.
I'm pretty sure it just flushes the data associated with the file descriptor.
-Ross
Ross Walker wrote:
I'm pretty sure it just flushes the data associated with the file descriptor.
Maybe I'm thinking of ext2 where it didn't keep track of the directory associated with the file or walk the tree back flushing them. But it still seems to not do it right: http://lwn.net/Articles/270891/
Thanks for the explanation; looks like I need to go read some more about barriers to truly understand what is going on...
On Mon, Dec 28, 2009 at 12:22 PM, Ross Walker rswwalker@gmail.com wrote:
Barriers expose the poor performance of cheap hard drives. They provide assurance that all the data leading up to the barrier, and the barrier IO itself, are committed to media. This means the barrier does a disk cache flush first and then, if the drive supports FUA (forced unit access, i.e. bypass the cache), issues the IO request with FUA; if the drive doesn't support FUA, it issues another cache flush. It's the double flush that causes the most impact on performance.
On Mon, Dec 28, 2009 at 12:46:24PM -0600, Tom Bishop wrote:
Thanks for the explanation; looks like I need to go read some more about barriers to truly understand what is going on...
(Please don't top post on these lists; thanks!)
As I understand it (but I could be wrong)... the problem is with "out-of-order writes".
Typically with a journaled filesystem (like ext3), the system will write out a data block, then update the metadata (allocation tables, etc.) to reflect this. The order is important; the data must get to disk before the metadata. Smart hardware, however, can reorder the writes, so it's possible for the metadata to reach the disk before the actual data blocks. If the system dies with unwritten data in the drive's cache, the result is potential data corruption (e.g. blocks allocated but containing garbage) rather than mere data loss (e.g. blocks never allocated).
The workaround for this is "barriers": the system flushes the drive's cache to ensure the data block is written before the metadata. Now blocks hit the disk in the right order, but performance is lower (flush, flush).
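The same ordering requirement can be sketched at the application level. Here is a hypothetical C example (made-up filenames) that uses fsync() as a stand-in for the kernel's barrier: the data must be on disk before the "commit" record that refers to it.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical two-file example: data.bin plays the role of the data
 * blocks, commit.log plays the role of the metadata/journal commit. */
int main(void)
{
    const char data[]   = "new file contents";
    const char commit[] = "commit: data.bin now 18 bytes\n";

    int dfd = open("/var/tmp/data.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    int cfd = open("/var/tmp/commit.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (dfd < 0 || cfd < 0) { perror("open"); return 1; }

    /* Step 1: write the data and force it to the device *before* the
     * commit record.  This fsync() plays the role of the barrier. */
    if (write(dfd, data, strlen(data)) < 0) { perror("write data"); return 1; }
    if (fsync(dfd) < 0) { perror("fsync data"); return 1; }

    /* Step 2: only now write and sync the commit record.  If the machine
     * dies before this point we merely lose the update; if the writes
     * were reordered and the commit hit the disk first, a crash would
     * leave a commit record pointing at garbage data (corruption). */
    if (write(cfd, commit, strlen(commit)) < 0) { perror("write commit"); return 1; }
    if (fsync(cfd) < 0) { perror("fsync commit"); return 1; }

    close(dfd);
    close(cfd);
    return 0;
}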
"Barriers" are not currently implemented in the RHEL kernel for many types of block device (including LVM devices).