Reading the "waiting IOs" thread reminded me that I have a similar problem
that has been around for months, and I have no solution yet.

A single CentOS 5.2 x86_64 machine here is overloading our NetApp filer with
excessive NFS getattr, lookup and access operations. The weird thing is that
the number of these operations increases over time. I have an mrtg graph
(which I didn't want to attach here), measured with filer-mrtg, showing e.g.
200 NFS ops on Monday, climbing in a straight line to e.g. 1200 within days.
nfsstat -l on the filer leaves no doubt that the load is caused by this
particular machine, and dstat shows me which NFS operations are causing it:

date/time     | null gatr satr look aces ...
10-09 12:22:52|    0    0    0    0    0
10-09 12:22:53|    0  525    0  602  602
10-09 12:22:54|    0 1275    0 1464 1438
10-09 12:22:55|    0    0    0    0    0
10-09 12:22:56|    0    0    0    0    0
10-09 12:22:57|    0    0    0    0    0
10-09 12:22:58|    0  238    0  270  270
10-09 12:22:59|    0 1461    0 1663 1660
10-09 12:23:00|    0  205    0  133  114
10-09 12:23:01|    0    0    0    0    0
10-09 12:23:02|    0    1    0    0    0
10-09 12:23:03|    0    0    0    0    0
10-09 12:23:04|    0 1411    0 1574 1574
10-09 12:23:05|    0  498    0  465  466
10-09 12:23:06|    0    0    0    0    0
10-09 12:23:07|    0    0    0    0    0
10-09 12:23:08|    0    0    0    0    0
10-09 12:23:09|    0 1082    0 1178 1192
10-09 12:23:10|    0  790    0  885  865
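
The dstat sample above can be summarised with a small Python sketch that
groups the busy seconds into bursts and measures the spacing between them.
This is only a rough illustration: the threshold of 100 getattr calls per
second and the names parse() and bursts() are my own assumptions, and only
the time of day is parsed, so it assumes the sample does not cross midnight.

```python
from datetime import datetime

# The dstat lines from above; columns after "|" are null gatr satr look aces.
SAMPLE = """\
10-09 12:22:52| 0 0 0 0 0
10-09 12:22:53| 0 525 0 602 602
10-09 12:22:54| 0 1275 0 1464 1438
10-09 12:22:55| 0 0 0 0 0
10-09 12:22:56| 0 0 0 0 0
10-09 12:22:57| 0 0 0 0 0
10-09 12:22:58| 0 238 0 270 270
10-09 12:22:59| 0 1461 0 1663 1660
10-09 12:23:00| 0 205 0 133 114
10-09 12:23:01| 0 0 0 0 0
10-09 12:23:02| 0 1 0 0 0
10-09 12:23:03| 0 0 0 0 0
10-09 12:23:04| 0 1411 0 1574 1574
10-09 12:23:05| 0 498 0 465 466
10-09 12:23:06| 0 0 0 0 0
10-09 12:23:07| 0 0 0 0 0
10-09 12:23:08| 0 0 0 0 0
10-09 12:23:09| 0 1082 0 1178 1192
10-09 12:23:10| 0 790 0 885 865
"""


def parse(sample):
    """Return a list of (time_of_day, getattr_count) tuples."""
    rows = []
    for line in sample.splitlines():
        stamp, counts = line.split("|")
        # stamp is "10-09 12:22:52"; keep only the time of day.
        when = datetime.strptime(stamp.split()[1], "%H:%M:%S")
        rows.append((when, int(counts.split()[1])))  # second column is gatr
    return rows


def bursts(rows, threshold=100):
    """Group consecutive seconds with getattr >= threshold into bursts."""
    found, current = [], None
    for when, gatr in rows:
        if gatr >= threshold:
            if current is None:
                current = {"start": when, "seconds": 0, "getattr": 0}
                found.append(current)
            current["seconds"] += 1
            current["getattr"] += gatr
        else:
            current = None  # a quiet second ends the current burst
    return found


result = bursts(parse(SAMPLE))
for b in result:
    print(b["start"].strftime("%H:%M:%S"), b["seconds"], "s", b["getattr"], "getattr")

# Seconds between the start of one burst and the start of the next.
gaps = [(b["start"] - a["start"]).total_seconds()
        for a, b in zip(result, result[1:])]
print("seconds between burst starts:", gaps)
```

On this sample it finds four bursts of roughly 1800 to 1900 getattr calls
each, starting 5 to 6 seconds apart (the last one may be cut off by the end
of the sample). That might point at something polling a fixed set of files
on a timer, but twenty seconds of data is far too little to be sure.
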

This behaviour is somehow tied to the GNOME desktop. I have other machines
running CentOS 5.2 x86_64 (at runlevel 3) which don't show this behaviour.
I also have CentOS 5.2 i386 machines which don't show it either. None of the
other machines on the LAN show it: RHEL 3 (32- and 64-bit), Solaris.

What I'd need is a monitoring tool that can tie the NFS ops to process IDs
or applications. lsof isn't nearly as helpful here as I thought. I even copied
this workstation user's files to another account, logged in and ran the same
apps, and couldn't reproduce it.

Ideas? Essentially, this makes 64-bit CentOS undeployable in our environment.