[CentOS] extremely slow NFS performance under 6.x [SOLVED]

Mon Nov 4 18:23:40 UTC 2013
Les Mikesell <lesmikesell at gmail.com>

On Mon, Nov 4, 2013 at 12:06 PM,  <m.roth at 5-cent.us> wrote:
> I've posted here about this a number of times. The other admin I work with
> had been digging into it recently, prompted by some real problems we'd been
> having, and this time, with a year or so's more material to google and
> newer documentation, he found the problem.
> What we'd been seeing: cd to an NFS-mounted directory and, from that
> directory, tar -xzvf a 25M or so tar.gz, which unpacks to about 105M.
> Under CentOS 5, doing that on a local drive takes seconds; over NFS,
> about 35 seconds. Mount options included sync. Under 6.x, from the
> beginning, it took 6.5 to 7 *minutes*.
> The result was that we'd been keeping our home directory servers on 5.
> What he found was the mount option barrier. According to one or two hits I
> found while googling, it's not clear that 5.x even recognizes this option.
> Per the upstream docs,
> <https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/writebarrieronoff.html>,
> it's enabled by default and affects *all* journalled filesystems.
> After remounting the exported drive with -o nobarrier, I NFS-mounted the
> directory and reran the test... and it took 20 seconds.
> Since most of our systems are all on UPSes, we're not worried about sudden
> power loss... and my manager did a jig, and we're starting to talk about
> migrating the rest of our home directory servers....
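[For reference, the remount the poster describes would look something like
the sketch below; the device, mount point, and filesystem type are
assumptions, since the thread doesn't name them.]

```shell
# Remount an already-mounted ext4 filesystem without write barriers.
# /dev/sdb1 and /export/home are placeholders for the actual export.
mount -o remount,nobarrier /dev/sdb1 /export/home

# To make it survive a reboot, the fstab entry would carry the option:
# /dev/sdb1  /export/home  ext4  defaults,nobarrier  0 2
```

[As the poster notes, this trades crash safety for speed, which is why it
is usually paired with a UPS or a battery-backed RAID cache.]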

I'm trying to make sense of that timing.  Does that mean that
pre-6.x, fsync() didn't really wait for the data to be written to
disk, or does it somehow take 7 minutes to get 100M onto your disk in
the right order?  Or is this an artifact of a specific RAID
controller and what you have to do to flush its cache?
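[One rough way to see what fsync() actually costs on a given box is to
time a write that is forced to stable storage before the command exits.
The path and size below are illustrative, not from the thread; on a
barrier-enabled journalled filesystem behind a write-through cache this
can be dramatically slower than the same write without conv=fsync.]

```shell
# Write 100M of zeros and fsync the file before dd exits; the elapsed
# wall-clock time reflects how long the data takes to reach the disk,
# not just the page cache.  /tmp/fsync-test is an arbitrary local path.
time dd if=/dev/zero of=/tmp/fsync-test bs=1M count=100 conv=fsync
```

[Comparing that number with and without nobarrier on the same mount
would show whether the barrier overhead, rather than NFS itself, is
where the minutes are going.]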

   Les Mikesell
     lesmikesell at gmail.com