Brian Mathis wrote:
> On Tue, Mar 8, 2011 at 12:24 PM, Michael Eager <eager at eagerm.com> wrote:
>> Hi --
>>
>> I'm running a server which is usually stable, but every
>> once in a while it hangs. The server is used as a file
>> store using NFS and to run VMware machines.
>>
>> I don't see anything in /var/log/messages or elsewhere
>> to indicate any problem or offer any clue why the system
>> was hung.
>>
>> Any suggestions where I might look for a clue?
>
> Please be more specific when you say it "hangs". Does it just pause
> for a minute and then continue working, or does it freeze completely
> until you reboot it? Does it respond to a "soft" reboot like
> Ctrl-Alt-Del, or do you need to hard power it off?

System is unresponsive. Monitor blank, no response to
keyboard, no response to remote ssh. Hit reset to reboot.

The only indication I had that there was a problem
(other than that attached systems were not accessing files)
was that the fan(s) on the server were louder than normal.

> Since this is an NFS server I'm going to guess there might be a lot of
> IO. Maybe there is some large IO load going on, like maybe all your
> VMs are running anti-virus scan at the same time, or something like
> that.

At the time, NFS load should have been very low.

> To troubleshoot, I recommend installing the 'sar' utilities (yum
> install sysstat) and then reviewing the collected data using the
> 'ksar' utility (http://sourceforge.net/projects/ksar/). sar/ksar are
> good for tracking down acute problems.

Thanks for the suggestion. I'll look into sar.

--
Michael Eager    eager at eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077