[CentOS] CentOS 5.8 crash/freeze running VMware

Tue Jul 10 15:28:26 UTC 2012
Michael Eager <eager at eagerm.com>

On 07/06/2012 11:17 AM, Johnny Hughes wrote:
> On 06/29/2012 09:52 AM, Michael Eager wrote:
>> On 06/28/2012 06:33 PM, Ted Miller wrote:
>>> On 06/28/2012 12:45 PM, Michael Eager wrote:
>>>> Hi --
>>>>
>>>> I have a server running CentOS 5.8.  It has a 6-core AMD processor,
>>>> 16GB of memory, and a RAID 5 file system.  It serves both as a file server
>>>> and as a host for several VMware virtual machines.  The guest machines run
>>>> Windows 7 and various versions of Linux.
>>>>
>>>> The system is running the latest version of VMware Workstation.
>>>> Until recently, I started VMs using the VMware Workstation GUI.
>>>> The system has been very stable and seldom crashes.
>>>>
>>>> Recently, I set up an init script to start several VMs at boot
>>>> time using the vmrun command.  This appeared to work correctly,
>>>> but the system has become unstable, freezing at various times.
>>>> When the system freezes, there is no console response and it
>>>> does not respond to a ping.  There is nothing in syslog to
>>>> indicate any error.
>>>>
>>>> The script started 8 VMs.  I've cut back to now running 4 VMs
>>>> and the system appears stable.
>>>>
>>>> Is there some relation between the number of cores and the number
>>>> of VMs one can run?
>>>>
>>>> Is there something else which might cause the system to crash
>>>> when running multiple VMs?
>>>>
>>>> Any suggestions to identify why the system crashed?
>>>>
>>> Are you staggering the startups of the VMs?  The server may be choking
>>> trying to boot 8 machines at once.  I suggest starting a VM every 30-60
>>> seconds, so that you aren't trying to boot all 8 at once.  Don't know if it
>>> will help, but it might.
>> The crashes happen long after boot time when all of the VMs are running.
>>
>> (Actually, startup goes very smoothly, with the VMs starting in parallel
>> in the background while system boot completes.)
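
For anyone who wants to rule out startup load anyway, here is a minimal
sketch of the staggered start Ted suggested.  It assumes vmrun is on the
PATH and that the VMs are Workstation VMs started with "-T ws ... nogui".
The .vmx paths and the 45-second delay are hypothetical placeholders, not
values from the original init script.

```python
import subprocess
import time

# Hypothetical .vmx paths; replace with the real VM locations.
VMX_FILES = [
    "/vms/win7/win7.vmx",
    "/vms/linux1/linux1.vmx",
]


def build_start_command(vmx_path):
    """Return the vmrun argv that starts one Workstation VM without a GUI."""
    return ["vmrun", "-T", "ws", "start", vmx_path, "nogui"]


def start_staggered(vmx_files, delay_seconds=45,
                    run=subprocess.call, sleep=time.sleep):
    """Start each VM in turn, pausing between launches.

    Returns the list of exit codes reported by `run`.  The `run` and
    `sleep` parameters exist only so the loop can be exercised without
    actually launching VMs.
    """
    exit_codes = []
    for index, vmx_path in enumerate(vmx_files):
        if index > 0:
            sleep(delay_seconds)  # pause before every VM except the first
        exit_codes.append(run(build_start_command(vmx_path)))
    return exit_codes

# Usage from an init script wrapper:
#   start_staggered(VMX_FILES, delay_seconds=45)
```
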
>
> This sounds like the issue with the machine running out of memory and
> the Out of Memory killer actually killing one of the VMware instances.
>
> My experience with this on a very good machine was that there was enough
> memory, but it was timing that was causing the issue.  The machine did
> not respond quickly enough to the memory request and the OOM Killer then
> acted.
>
> I solved my problem by reserving more memory as unused.  If you are
> having the same issue with a VMware host server running out of memory,
> maybe try setting this variable in sysctl.conf:
>
> vm.min_free_kbytes=65536
>
> (that will maintain 64MB of free RAM and should allow for enough time to
> prevent OOM kills)
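
A rough way to see how close the host runs to that reserve is to compare
MemFree with the configured value.  The sketch below is only an
approximation: the kernel applies the reserve through its own watermarks
rather than as a single MemFree threshold, and the helper names are made up
for illustration.  The two /proc paths are the standard procfs locations.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> kB."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        name, rest = line.split(":", 1)
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def free_memory_margin(meminfo_text, min_free_kbytes):
    """Return MemFree minus the reserve in kB (negative: below reserve)."""
    return parse_meminfo(meminfo_text)["MemFree"] - min_free_kbytes


def read_file(path):
    """Read a small text file such as a /proc entry."""
    handle = open(path)
    try:
        return handle.read()
    finally:
        handle.close()

# Usage on the host:
#   reserve = int(read_file("/proc/sys/vm/min_free_kbytes"))
#   print(free_memory_margin(read_file("/proc/meminfo"), reserve))
```

Running this periodically (from cron, say) and logging the result would
show whether free memory trends toward the reserve before a freeze.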

I'll give that a try.

But the problem was not that one or more VMware instances were killed while
other processes continued; the whole system hung.  Nothing was running.
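
One way to tell the two cases apart after a reboot is to search the log for
OOM-killer messages.  This is a minimal sketch, assuming kernel messages
land in /var/log/messages (the CentOS default) and that the kernel's OOM
messages contain "oom-killer" or "Out of memory"; the exact wording varies
between kernel versions, so treat the pattern as a starting point.

```python
import re

# Assumed substrings of kernel OOM messages; adjust for the running kernel.
OOM_PATTERN = re.compile(r"oom-killer|out of memory", re.IGNORECASE)


def find_oom_lines(log_lines):
    """Return the log lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in log_lines
            if OOM_PATTERN.search(line)]

# Usage on the host:
#   log = open("/var/log/messages")
#   for line in find_oom_lines(log):
#       print(line)
#   log.close()
```

An empty result is consistent with a hard hang rather than an OOM kill,
although a frozen system may also fail to flush its last messages to disk.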

-- 
Michael Eager	 eager at eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077