[CentOS] Correlate i/o with a process
Dag Wieers
dag at wieers.com
Sun Aug 19 14:36:55 UTC 2007
On Sat, 18 Aug 2007, Dag Wieers wrote:
> On Fri, 17 Aug 2007, Mag Gam wrote:
> > On 8/17/07, John R Pierce <pierce at hogranch.com> wrote:
> > > Mag Gam wrote:
> > >
> > > > I have a server with 2 HBAs, and the users keeps complaining about
> > > > performance problems. My question is, how can I relate the process
> > > > with high I/O wait? Also, is it possible to see how much data is being
> > > > pushed thru by my 2 HBAs?
> > >
> > > iostat (part of the sysstat package) will answer your 2nd question.
> > >
> > > I dunno how to measure io wait time per process. maybe IBM's NMON can
> > > do that, not sure, I haven't used it for a while.
> > > http://www-941.haw.ibm.com/collaboration/wiki/display/WikiPtype/nmon
> >
> > Thanks John.
> >
> > Yes, this is a tricky question, but I face this a lot....Unfortunately, I am
> > not sure how to check the adapter throughput, and what process is causing
> > the i/o wait.
>
> I believe that recent kernels have a patch applied that show io counters
> per process. I haven't looked into it yet though.
>
> This is one of the most important items on my wishlist for dstat, a topio
> plugin next to the existing topcpu and topmem plugins.
I found the following interesting information while googling. Now I need
to find a kernel that provides the counters ;-)
Based on this information I will most likely have topio, topio_real and
topio_ops
2.14 /proc/<pid>/io - Display the IO accounting fields
-------------------------------------------------------
This file contains IO statistics for each running process.
Example
-------
test:/tmp # dd if=/dev/zero of=/tmp/test.dat &
[1] 3828
test:/tmp # cat /proc/3828/io
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
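
A minimal sketch of how a tool could read this output, assuming the file keeps the "name: value" layout shown in the example above. The helper names parse_proc_io and read_proc_io are made up for illustration and are not part of dstat or the kernel.

```python
def parse_proc_io(text):
    """Parse the 'name: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        counters[key.strip()] = int(value)
    return counters


def read_proc_io(pid):
    """Read the counters of one process (hypothetical helper).

    Only works on a kernel that actually provides /proc/<pid>/io.
    """
    with open("/proc/%d/io" % pid) as f:
        return parse_proc_io(f.read())


# The sample output from the dd example above.
SAMPLE = """\
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
"""

counters = parse_proc_io(SAMPLE)
print(counters["wchar"], counters["write_bytes"])
```

Reading another user's /proc/<pid>/io generally needs sufficient privileges, so a real plugin would have to handle open() failing.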
Description
-----------
rchar
-----
I/O counter: chars read
The number of bytes which this task has caused to be read from storage. This
is simply the sum of bytes which this process passed to read() and pread().
It includes things like tty IO and it is unaffected by whether or not actual
physical disk IO was required (the read might have been satisfied from
pagecache).
wchar
-----
I/O counter: chars written
The number of bytes which this task has caused, or shall cause to be written
to disk. Similar caveats apply here as with rchar.
syscr
-----
I/O counter: read syscalls
Attempt to count the number of read I/O operations, i.e. syscalls like read()
and pread().
syscw
-----
I/O counter: write syscalls
Attempt to count the number of write I/O operations, i.e. syscalls like
write() and pwrite().
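
Combining the byte counters with the syscall counters gives a rough average request size. As a worked sketch using the numbers from the dd example above (the function name avg_bytes_per_call is only illustrative):

```python
def avg_bytes_per_call(nbytes, ncalls):
    """Average bytes moved per syscall; 0.0 if no calls were made yet."""
    return nbytes / ncalls if ncalls else 0.0


# wchar and syscw from the dd example above.
wchar = 323929600
syscw = 632675

print(avg_bytes_per_call(wchar, syscw))
```

For the example this works out to 512 bytes per write(), which matches dd's default block size of 512 bytes.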
read_bytes
----------
I/O counter: bytes read
Attempt to count the number of bytes which this process really did cause to
be fetched from the storage layer. Done at the submit_bio() level, so it is
accurate for block-backed filesystems. <please add status regarding NFS and
CIFS at a later time>
write_bytes
-----------
I/O counter: bytes written
Attempt to count the number of bytes which this process caused to be sent to
the storage layer. This is done at page-dirtying time.
cancelled_write_bytes
---------------------
The big inaccuracy here is truncate. If a process writes 1MB to a file and
then deletes the file, it will in fact perform no writeout. But it will have
been accounted as having caused 1MB of write.
In other words: The number of bytes which this process caused to not happen,
by truncating pagecache. A task can cause "negative" IO too. If this task
truncates some dirty pagecache, some IO which another task has been accounted
for (in its write_bytes) will not be happening. We _could_ just subtract that
from the truncating task's write_bytes, but there is information loss in doing
that.
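
Because the two counters are kept separately, a reader of the file can do the subtraction itself if it wants a "net" figure. A small sketch, assuming the counters have already been parsed into a dict (net_write_bytes is a hypothetical name, and the sample values in the second call are invented):

```python
def net_write_bytes(counters):
    """write_bytes minus cancelled_write_bytes for one process.

    The result may be negative: the cancelled bytes are charged to the
    truncating task, while the original write may have been accounted
    to a different task.
    """
    return counters["write_bytes"] - counters["cancelled_write_bytes"]


# Values from the dd example above: nothing was cancelled.
print(net_write_bytes({"write_bytes": 323932160, "cancelled_write_bytes": 0}))

# Invented values: a task that only truncated another task's dirty pages.
print(net_write_bytes({"write_bytes": 0, "cancelled_write_bytes": 1048576}))
```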
Note
----
At its current implementation state, this is a bit racy on 32-bit machines: if
process A reads process B's /proc/pid/io while process B is updating one of
those 64-bit counters, process A could see an intermediate result.
More information about this can be found within the taskstats documentation in
Documentation/accounting.
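
With these counters, a topio-style ranking could be sketched roughly as below. This is only an illustration under assumptions, not the dstat plugin: it takes two snapshots (pid -> counters dict) sampled some interval apart and picks the process with the largest increase. The function name top_io, the snapshot layout and the sample numbers are all made up; which counter pair topio, topio_real and topio_ops would each use is my guess (rchar/wchar, read_bytes/write_bytes and syscr/syscw respectively).

```python
def top_io(before, after, read_key="read_bytes", write_key="write_bytes"):
    """Return (pid, delta) for the process whose read+write counters grew most.

    before/after map pid -> dict of counters from /proc/<pid>/io.
    Processes missing from either snapshot are ignored.
    Returns None if no process appears in both snapshots.
    """
    best = None
    for pid, now in after.items():
        prev = before.get(pid)
        if prev is None:
            continue  # process started during the interval
        delta = (now[read_key] - prev[read_key]) + (now[write_key] - prev[write_key])
        if best is None or delta > best[1]:
            best = (pid, delta)
    return best


# Invented snapshots for two processes, one interval apart.
before = {
    3828: {"read_bytes": 0, "write_bytes": 1000, "syscr": 10, "syscw": 20},
    4000: {"read_bytes": 500, "write_bytes": 0, "syscr": 5, "syscw": 0},
}
after = {
    3828: {"read_bytes": 0, "write_bytes": 9000, "syscr": 12, "syscw": 40},
    4000: {"read_bytes": 700, "write_bytes": 0, "syscr": 50, "syscw": 0},
}

print(top_io(before, after))                    # ranked by real block I/O
print(top_io(before, after, "syscr", "syscw"))  # ranked by syscall count
```

Note that the two rankings can disagree: a process doing many small reads can lead on operations while another leads on bytes.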
-- dag wieers, dag at wieers.com, http://dag.wieers.com/ --
[Any errors in spelling, tact or fact are transmission errors]