[CentOS] Centos-4.3: Filelocking problems under high [network related] load with kernel 2.6.9-42.0.3.ELsmp

Mon Nov 27 23:42:47 UTC 2006
Martin Knoblauch <spamtrap at knobisoft.de>

--- Kevan Benson <kbenson at a-1networks.com> wrote:

> On Monday 27 November 2006 10:54, Martin Knoblauch wrote:
> >  first of all, please CC me on any reply, as I am only subscribed to
> > the digest.
> >
> > OK. Here is the problem. Said kernel (from 4.4) seems to have problems
> > with file-locking when the system is under high, likely network
> > related, load. The symptoms are that things using file locking (rpm,
> > the user-space automounter amd) fail to obtain locks, usually stating
> > timeout problems.
> >
> > The system in question is a HP/DL380G4 with dual-single-core EM64T CPUs
> > and 8GB of Memory. The network interfaces are "tg3". It happens with
> > both CentOS and RHEL4.
> >
> > The high load can be triggered by copying three 3 GB files in parallel
> > from an NFS server (Solaris10, NFS, TCP, 1GBit) to another NFS server
> > (RHEL4, NFS, TCP, 100 MBit). The measured network performance is OK.
> > During this operation the system goes to loads around/above 10.
> > Overall responsiveness feels good, but software doing file-locking or
> > stuff like opening a new ssh connection takes extremely long.
> >
> > So, if anyone has an idea or hint, it will be highly appreciated.
> 
Hi Kevan,

> NFS has known problems with flock.  man flock(2) specifically notes
> this. Which file locking mechanism (flock or fcntl) does your
> system use predominantly (that is, how do the applications that
> use NFS lock their files)?
> 

 Now, amd uses fcntl to do the locking of /etc/mtab. But I do not
believe that the locking problem is restricted to NFS. To me it looks
like all kinds of files are affected.
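
 For illustration only, here is a minimal sketch of an fcntl-style lock
attempt that gives up after a timeout, which is the kind of failure
described above. It is not taken from amd or rpm; the function name
lock_with_timeout and its timeout/poll parameters are made up for this
example, and Python is used purely for brevity.

```python
import errno
import fcntl
import os
import time


def lock_with_timeout(fd, timeout=5.0, poll=0.1):
    """Try to take an exclusive fcntl (POSIX) lock on fd.

    Hypothetical helper: returns True if the lock was obtained within
    `timeout` seconds, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # LOCK_NB makes the call fail immediately instead of blocking.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError as exc:
            # EACCES/EAGAIN mean "held by someone else"; anything else is a real error.
            if exc.errno not in (errno.EACCES, errno.EAGAIN):
                raise
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


if __name__ == "__main__":
    # Assumption: a scratch file is used here rather than /etc/mtab itself.
    fd = os.open("/tmp/locktest", os.O_RDWR | os.O_CREAT, 0o600)
    try:
        print("lock obtained:", lock_with_timeout(fd, timeout=2.0))
    finally:
        os.close(fd)  # closing the descriptor also drops the lock
```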

> NFS v4 has made some major strides towards better locking, but it's been
> long enough since I dealt with this that I'm not sure if it actually
> solves anything (although it looks like it does).
> You might want to try NFS v4 if possible.
> 

 Due to the environment (a very high number of potential NFS servers,
all running NFSv3) V4 is not an option. And, as I said, the locking
problems also occur on local files. NFS is only related to the high-load
condition that seems to accompany the locking problems.
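
 One way to check the "local files too" observation would be to time
fcntl lock acquisition on a file that is known to live on a local
filesystem while the NFS copy is running. The following is a rough
sketch under that assumption; the function name time_fcntl_lock and the
use of /tmp as a local directory are placeholders, not something taken
from the affected system.

```python
import fcntl
import os
import tempfile
import time


def time_fcntl_lock(path, attempts=5):
    """Return how long each exclusive lock/unlock cycle took, in seconds."""
    timings = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for _ in range(attempts):
            start = time.monotonic()
            fcntl.lockf(fd, fcntl.LOCK_EX)  # blocking POSIX (fcntl) lock
            fcntl.lockf(fd, fcntl.LOCK_UN)  # release again
            timings.append(time.monotonic() - start)
    finally:
        os.close(fd)
    return timings


if __name__ == "__main__":
    # Assumption: /tmp is on a local disk, not on NFS.
    with tempfile.NamedTemporaryFile(dir="/tmp") as tmp:
        for i, seconds in enumerate(time_fcntl_lock(tmp.name)):
            print(f"attempt {i}: {seconds * 1000:.3f} ms")
```

 Running this once on an idle system and once during the parallel copy
would show whether uncontended local locks really slow down under load.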

Cheers
Martin

------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de