[CentOS-devel] More XFS Filesystem Tips

Tue May 1 15:02:17 UTC 2007
Joshua Kramer <josh at globalherald.net>

Hello All,

This was posted to the Postgres Admin list - it has some valuable tips wrt 
XFS usage.  I haven't had time yet to see if the C5 kernel conforms to 
what Hannes is talking about, but it looks like XFS is *very* picky about 
having hardware working correctly (and RAID arrays set up correctly).

Cheers,
-J

---------- Forwarded message ----------
Date: Tue, 01 May 2007 16:35:28 +0200
From: Hannes Dorbath <light at theendofthetunnel.de>
To: Adam Witney <awitney at sgul.ac.uk>, pgsql-admin at postgresql.org
Subject: Re: [ADMIN] File systems linux !!!

Adam Witney wrote:
> Could you give a couple of examples of things that could be done wrong? I
> have XFS running for my data partition, but I didn't really do much when I
> set it up...
>
> Thanks for any advice

1.) Don't run XFS on any hardware that's not proven to be 100% fsync/FUA
safe. It's extremely unforgiving in that regard. Double-check your RAID
controller settings and then test with something like
http://www.faemalia.net/mysqlUtils/diskTest.pl
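
As a rough first check for point 1.), before a real pull-the-plug test,
one can time how many fsync() calls per second a device completes. This
is only a sketch and not what diskTest.pl does; the function names, the
iteration count and the 7200 rpm figure are assumptions. The usual rule
of thumb is that a single rotating disk without a battery-backed cache
cannot commit much more than one rewrite of the same block per rotation,
so a far higher rate suggests that some cache is acknowledging writes
early.

```python
import os
import tempfile
import time


def fsyncs_per_second(directory, iterations=200):
    """Rewrite one 512-byte block and fsync it `iterations` times; return the rate."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        block = b"x" * 512
        start = time.perf_counter()
        for _ in range(iterations):
            os.pwrite(fd, block, 0)  # overwrite the same block each time
            os.fsync(fd)             # ask the kernel to push it to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return iterations / elapsed if elapsed > 0 else float("inf")


def max_plausible_rate(rpm=7200):
    """Rule-of-thumb ceiling for a bare disk: one commit per platter rotation."""
    return rpm / 60.0
```

Calling fsyncs_per_second() on a directory of the file system in
question and comparing the result with max_plausible_rate() gives a
hint, nothing more. A plausible rate does not prove the stack is safe.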

2.) Don't run it with a 4K stacks kernel. Though most issues with 4K
stacks were fixed long ago, there is yet another 4K stack fix in the
2.6.21 release notes. I just wouldn't trust it 100% yet. Especially
avoid running with 4K stacks in production if you tend to stack block
devices on top of each other (LVM, EVMS, DRBD, GNBD etc).
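
To see whether a kernel was built with 4K stacks for point 2.), one
option is to look for CONFIG_4KSTACKS in the kernel config. A minimal
sketch, assuming the distribution installs the config as
/boot/config-<release>; the function names are made up for this example.

```python
import os


def uses_4k_stacks(config_text):
    """Return True if the kernel config text enables CONFIG_4KSTACKS."""
    for line in config_text.splitlines():
        if line.strip() == "CONFIG_4KSTACKS=y":
            return True
    return False


def running_kernel_uses_4k_stacks():
    """Check the running kernel; the /boot/config-<release> path is an assumption."""
    path = "/boot/config-" + os.uname().release
    with open(path) as f:
        return uses_4k_stacks(f.read())
```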

3.) Use the deadline I/O scheduler; anticipatory and XFS don't like each
other. This is true for almost any FS != ext3. This makes a difference
especially for OLTP.
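
For point 3.), the active scheduler of a block device can be read from
/sys/block/<device>/queue/scheduler, where the kernel lists the
available schedulers and puts the active one in square brackets. A small
sketch; the function names and the device name "sda" are only examples.

```python
def active_scheduler(sysfs_text):
    """Return the scheduler shown in [brackets], or None if none is marked."""
    for word in sysfs_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_scheduler(device):
    """Read the active scheduler for a block device name such as 'sda'."""
    with open("/sys/block/%s/queue/scheduler" % device) as f:
        return active_scheduler(f.read())
```

If read_scheduler("sda") does not return "deadline", writing the name
into the same sysfs file as root switches it at runtime, and
elevator=deadline on the kernel command line sets it at boot.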

4.) Don't use stripe alignment unless you are 100% sure how to
calculate those numbers for your RAID setup. No stripe alignment is
always better than a wrong alignment. Some controllers don't like it at
all and degrade in performance.
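
The arithmetic behind point 4.) can be sketched as follows. The stripe
unit is the RAID chunk size and the stripe width is the stripe unit
times the number of data-bearing disks. The mapping from RAID level to
data disks below is an assumption about a plain setup (RAID 10 with two
copies, no hot spares counted); verify it against your controller before
using the result. The function names are made up for this example.

```python
def data_disks(level, total_disks):
    """Number of disks that carry data (not parity or mirror copies)."""
    if level == 0:
        return total_disks
    if level == 5:
        return total_disks - 1
    if level == 6:
        return total_disks - 2
    if level == 10:
        return total_disks // 2  # assumes two copies of every block
    raise ValueError("unsupported RAID level: %r" % level)


def xfs_stripe_options(level, total_disks, chunk_kib):
    """Build the su=/sw= value for the -d option of mkfs.xfs."""
    return "su=%dk,sw=%d" % (chunk_kib, data_disks(level, total_disks))


def sunit_swidth_sectors(level, total_disks, chunk_kib):
    """The same geometry as sunit/swidth, in 512-byte sectors."""
    sunit = chunk_kib * 1024 // 512
    return sunit, sunit * data_disks(level, total_disks)
```

For a 4-disk RAID 5 with 64 KiB chunks this gives "su=64k,sw=3", or
sunit=128 and swidth=384 when expressed in sectors.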

5.) Make sure to use write barriers unless you run on a hardware
controller with a BBU. Actually this is the XFS default these days, but
it gets disabled if you have any block device in your stack that doesn't
support it. An example is DRBD (though write barriers are on the road map).
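
For point 5.), a quick way to spot XFS file systems mounted with
barriers turned off is to look for the nobarrier option in /proc/mounts.
This is a sketch with a made-up function name. I would not rely on it to
show an automatic fallback by the kernel, so check the kernel log for
barrier messages as well.

```python
def xfs_mounts_without_barriers(mounts_text):
    """Return mount points of XFS file systems whose options include 'nobarrier'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete /proc/mounts entry
        mount_point, fstype, options = fields[1], fields[2], fields[3]
        if fstype == "xfs" and "nobarrier" in options.split(","):
            result.append(mount_point)
    return result
```

Typical use would be
xfs_mounts_without_barriers(open("/proc/mounts").read()).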

6.) Flushing data is the sole responsibility of the application. XFS
does nothing to help broken applications, like ext3 can do with
data=ordered or data=journal.

XFS uses writeback exclusively. Don't run anything that does not conform
to ACID on XFS. This is fine for PostgreSQL, but might not be fine for
all your applications.
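
What "flushing is the application's job" in point 6.) means in practice
can be sketched with the common write, fsync, rename pattern. This is a
generic illustration, not something taken from PostgreSQL; the function
name and the ".tmp" suffix are made up for this example.

```python
import os


def durable_write(path, data):
    """Replace `path` with `data` (bytes) so that a crash leaves old or new content."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp = path + ".tmp"           # hypothetical temp-file naming
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()                 # Python buffer -> kernel page cache
        os.fsync(f.fileno())      # page cache -> stable storage
    os.replace(tmp, path)         # atomic rename within one file system
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)          # make the rename itself durable
    finally:
        os.close(dir_fd)
```

An application that only calls write() and hopes for the best skips the
two fsync() calls, and that is exactly the case XFS will not cover for.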

7.) Check dmesg for XFS messages and be able to interpret them,
especially anything about "CORRUPTED_GOTO". If you see such a line,
chances are high that 1.) does not hold. This is a cry from XFS to run
xfs_repair ASAP; the file system was only mounted to keep your box
online and will shut down immediately if any suspicious location is
accessed. Take XFS messages seriously and Google for them if you are not
sure what they mean.
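
The check in point 7.) is easy to automate, for example from cron. A
minimal sketch: matching on the substring "xfs" is an assumption and may
miss messages that do not name the file system type, and the function
names are made up for this example.

```python
import subprocess


def xfs_messages(log_text):
    """Return (XFS-related lines, the subset mentioning CORRUPTED_GOTO)."""
    xfs_lines = [line for line in log_text.splitlines()
                 if "xfs" in line.lower() or "CORRUPTED_GOTO" in line]
    corrupted = [line for line in xfs_lines if "CORRUPTED_GOTO" in line]
    return xfs_lines, corrupted


def read_dmesg():
    """Return the kernel ring buffer as text by running dmesg."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout
```

A non-empty second list from xfs_messages(read_dmesg()) is the signal to
schedule xfs_repair.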

8.) Grab all those nice PDFs at
http://oss.sgi.com/projects/xfs/training/index.html
These are essential readings for any XFS admin.


IMHO XFS is a mature and rock-stable file system; however, you really
need to obey the points above. It's just not the
general-purpose-mkfs-and-forget FS that ext3 is claimed to be.

What I recommend for a PostgreSQL production box is to use ext3
data=journal for / and XFS for $PGDATA. This should give you a system
which really behaves nicely on power failures. ext3 data=journal for all
non-ACID applications, XFS for ACID applications.

Personally I use XFS on / as well, but I have taken some steps to make
it behave like I wish.


-- 
Best regards,
Hannes Dorbath
