Is there any way to tell with top or iostat which process is hogging all the disk I/O?
Matt
Matt wrote:
Is there any way to tell with top or iostat which process is hogging all the disk I/O?
iostat won't drill down to per-process statistics.
`top` or `ps` will show any processes in heavy iowait...
Or maybe lsof could give you some clues (as it lists all open files).
On Mon, 2009-01-12 at 10:43 -0800, John R Pierce wrote:
Matt wrote:
Is there any way to tell with top or iostat which process is hogging all the disk I/O?
<snip>
IMHO,
The best way is to install and use the System Activity Reporting (SAR) system. It's native, available, as detailed or general as you want, and allows a lot of ways to "slice" the results. Frequency of sampling is also under your control, to allow chasing those particularly nasty "extraneous events" that might be frequent.
HTH
William L. Maltby wrote on Mon, 12 Jan 2009 14:08:40 -0500:
The best way is to install and use the System Activity Reporting (SAR) system.
AFAIK, it won't show any file-specific stuff.
Kai
On Mon, 2009-01-12 at 23:31 +0100, Kai Schaetzl wrote:
William L. Maltby wrote on Mon, 12 Jan 2009 14:08:40 -0500:
The best way is to install and use the System Activity Reporting (SAR) system.
AFAIK, it won't show any file-specific stuff.
It's been so long, I don't remember now. But I believe you are correct.
Admitting that, it still can be useful in that you can detect times and processes and loads occurring. Then if the update/access times are maintained in the FS (not everyone enables access times) you have a good starting point to "finger" the problem processes, users and files.
Of course, I don't have to do that anymore. :-)) So I can make it sound a lot easier than it really is. But I do recall that tracking the source of these sorts of problems was a (sometimes) tedious and convoluted task. Back then we didn't have a lot of tools other than SAR.
It still can be useful if no one bumps into a good tool for the OP to try. But, depending on the complexity and volumes of users, files, ... it might be fairly quick or very tedious.
Of course, a very tight loop of "lsof" might provide what is really needed, but I have no idea of the load that might place on the system, the volume of output that may be generated, ...
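For what it's worth, here is a rough sketch of what such a loop might look like if done by hand instead of calling lsof repeatedly. It is only an illustration, not a tested tool: it assumes a Linux /proc filesystem and enough privilege (usually root) to read other users' /proc/<pid>/fd directories. The function names, the sample count and the interval are all made up for the example. Sleeping between passes limits the load, and counting instead of printing every pass keeps the output small. Note that it only shows which files are held open, not how much I/O goes to them.

```python
import os
import time
from collections import Counter


def open_files(pid, proc_root="/proc"):
    """Return the set of absolute paths a process has open (a narrow lsof)."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    paths = set()
    try:
        for fd in os.listdir(fd_dir):
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were looking
            if target.startswith("/"):  # skip sockets, pipes, anon inodes
                paths.add(target)
    except OSError:
        pass  # process has exited, or we are not allowed to look
    return paths


def sample_open_files(samples=10, interval=1.0, proc_root="/proc"):
    """Count in how many passes each (pid, path) pair was seen open."""
    seen = Counter()
    for _ in range(samples):
        for entry in os.listdir(proc_root):
            if entry.isdigit():  # numeric directories are processes
                for path in open_files(entry, proc_root):
                    seen[(int(entry), path)] += 1
        time.sleep(interval)  # throttle so the sampling itself stays cheap
    return seen


if __name__ == "__main__":
    for (pid, path), hits in sample_open_files().most_common(20):
        print(f"{pid:>7} {hits:>3} {path}")
```

Files that show up in every pass for the same PID are the ones worth comparing against the busy periods SAR reports.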
Presuming a large site, SAR _may_ provide a starting point allowing a substantial reduction in the quantity of output the OP may need to examine or may reduce potential load by allowing targeting of specific times, users, processes or whatnot.
That's the great thing about debugging with incomplete information: everything is possible and fun! :-)
Kai
Matt wrote:
Is there any way to tell with top or iostat which process is hogging all the disk I/O?
If it is hogging "all" of the disk I/O, the "state" of the process when shown in top will frequently be "D" (uninterruptible sleep, usually waiting on I/O); others will usually be "S" (sleeping) or maybe "R" (running).
Same goes for viewing the process using ps.
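Watching for "D" by eye can be hit and miss, so here is a minimal sketch of sampling the state repeatedly and counting how often each process is caught in "D". It reads the same state field that top and ps display, from /proc/<pid>/stat. The function names, the sample count and the interval are just assumptions for the example, and it assumes a Linux /proc. Bear in mind that "D" covers any uninterruptible wait (NFS, for instance), so a high count is a hint and not proof of disk I/O.

```python
import os
import time
from collections import Counter


def parse_stat(text):
    """Return (pid, command, state) from the contents of /proc/<pid>/stat."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split on the first "(" and the last ")".
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def snapshot(proc_root="/proc"):
    """Yield (pid, command, state) for every process currently visible."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                yield parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-read, or the line was malformed


def count_d_state(samples=20, interval=0.5, proc_root="/proc"):
    """Count how many samples found each process in state "D"."""
    hits = Counter()
    for _ in range(samples):
        for pid, comm, state in snapshot(proc_root):
            if state == "D":
                hits[(pid, comm)] += 1
        time.sleep(interval)
    return hits


if __name__ == "__main__":
    for (pid, comm), n in count_d_state().most_common(10):
        print(f"{pid:>7} {comm:<20} in D state in {n} samples")
```

A process that turns up in "D" in most of the samples is a reasonable first suspect.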
nate
Matt wrote:
Is there any way to tell with top or iostat which process is hogging all the disk I/O?
No, you need systemtap for this.
http://sourceware.org/systemtap/wiki/ScriptsTools has examples.
Remember, you'd also need the corresponding kernel-debuginfo package for your running kernel which you can get from http://debuginfo.centos.org/.
Cheers,
Ralph