[CentOS-virt] vmware problem took down X on host?

Sat Jan 17 17:35:28 UTC 2009
Ted Miller <limaohio123-CompMailLists at yahoo.com>

I run VMWare server 1.07 on Centos 5.  Last night I left a Windows 2000 
virtual machine doing a ClamWin scan of drive M: when I went to bed around 
midnight.  Drive M: is actually a volume on the Centos 5 host, mounted via 
Samba.  It has about 40Gb of photos on it, plus a few other things.  I had 
the VM up visible in the VMWare Server Console running under KDE on display 
8 (X session #1), my usual configuration.

This morning when I looked in on it the screen was not responsive.  The 
keyboard seemed to be working (num lock would go on and off, though 
sometimes with some delay).  There was some disk activity showing on the 
hard drive light.  Nothing I did would cause any video to appear.  I 
finally his Ctl+Alt+1 (nothing happened) then I pressed just Ctrl, and the 
monitor light turned green.  I got a console login prompt!  When I tried to 
go back to one of the GUI consoles the computer locked up again.  I went to 
another computer on the network and logged into the VMWare server from it. 
  It showed both the Windows 2000 machine and also the IPCop VM (which had 
also been running) were both shut down.  I restarted them both 
successfully.  I again could not get any response from the host machine.

Eventually I hit a key combination that got me back to console 1 command 
line.  I went back to the other computer, shut down both of the VMs that 
were running, logged in, and did a Ctl-Alt-Del.  I breathed a sigh of 
relief as the shutdown messages started scrolling across the screen.

After the reboot, things seem pretty normal.  However, I found this in 
/var/log/messages, and wonder what they mean, especially the one at 
04:02:08 about debug info.

Jan 17 01:27:44 office2 kernel: NVRM: Xid (0005:00): 3, C 00000001 SC
  00000003 M 000002fc Data 00000003
Jan 17 01:27:44 office2 kernel: NVRM: Xid (0005:00): 3, C 00000001 SC
  00000003 M 000002fc Data 00000001
Jan 17 01:27:44 office2 last message repeated 6 times
Jan 17 03:37:35 office2 smbd[4771]: [2009/01/17 03:37:35, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:35 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone: TechLore.com_files
  is a stream name!
Jan 17 03:37:35 office2 smbd[4771]: [2009/01/17 03:37:35, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:35 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone: TechLore.com_files
  is a stream name!
Jan 17 03:37:36 office2 smbd[4771]: [2009/01/17 03:37:36, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:36 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone:
  TechLore.com_files/iframe_data is a stream name!
Jan 17 03:37:36 office2 smbd[4771]: [2009/01/17 03:37:36, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:36 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone:
  TechLore.com_files/iframe_data is a stream name!
Jan 17 03:37:36 office2 smbd[4771]: [2009/01/17 03:37:36, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:36 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone:
  TechLore.com_files/minimall_data is a stream name!
Jan 17 03:37:37 office2 smbd[4771]: [2009/01/17 03:37:37, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:37 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone:
  TechLore.com_files/minimall_data is a stream name!
Jan 17 03:37:41 office2 smbd[4771]: [2009/01/17 03:37:41, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:41 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone:
  TechLore.com_files/iframe_data_002 is a stream name!
Jan 17 03:37:41 office2 smbd[4771]: [2009/01/17 03:37:41, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:41 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone:
  TechLore.com_files/iframe_data_002 is a stream name!
Jan 17 03:37:42 office2 smbd[4771]: [2009/01/17 03:37:42, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:42 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone:
  TechLore.com_files/iframe_data_003 is a stream name!
Jan 17 03:37:42 office2 smbd[4771]: [2009/01/17 03:37:42, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:42 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone:
  TechLore.com_files/iframe_data_003 is a stream name!
Jan 17 03:37:42 office2 smbd[4771]: [2009/01/17 03:37:42, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:42 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone:
  TechLore.com_files/iframe_data_003/ads_data is a stream name!
Jan 17 03:37:42 office2 smbd[4771]: [2009/01/17 03:37:42, 0]
  smbd/open.c:open_directory(2060)
Jan 17 03:37:42 office2 smbd[4771]:   open_directory:
  TED/Alabanza/Sony_KP-43T90_TV/hv cable removal_anyone:
  TechLore.com_files/iframe_data_003/ads_data is a stream name!
Jan 17 03:56:46 office2 kernel: device eth1 left promiscuous mode
Jan 17 03:56:46 office2 kernel: audit(1232182606.512:5): dev=eth1 prom=0
  old_prom=256 auid=4294967295 ses=4294967295
Jan 17 03:56:46 office2 kernel: bridge-eth1: disabled promiscuous mode
Jan 17 03:56:46 office2 kdm[3925]: X server for display :1 terminated
  unexpectedly
Jan 17 03:56:46 office2 kernel: device eth0 left promiscuous mode
Jan 17 03:56:46 office2 kernel: audit(1232182606.897:6): dev=eth0 prom=0
  old_prom=256 auid=4294967295 ses=4294967295
Jan 17 03:56:46 office2 kernel: bridge-eth0: disabled promiscuous mode
Jan 17 03:57:05 office2 kdm_greet[14492]: Can't open default user face
Jan 17 04:02:08 office2 nmbd[3710]: [2009/01/17 04:02:08, 0]
  nmbd/nmbd.c:process(559)
Jan 17 04:02:08 office2 nmbd[3710]:   Got SIGHUP dumping debug info.
Jan 17 04:02:08 office2 nmbd[3710]: [2009/01/17 04:02:08, 0]
  nmbd/nmbd_workgroupdb.c:dump_workgroups(282)
Jan 17 04:02:08 office2 nmbd[3710]:   dump_workgroups()
Jan 17 04:02:08 office2 nmbd[3710]:    dump workgroup on subnet
       10.41.65.2: netmask=  255.255.255.0:
Jan 17 04:02:08 office2 nmbd[3710]:     MILLER(1) current master browser =
  OFFICE2
Jan 17 04:02:08 office2 nmbd[3710]:             OFFICE2 40849a03 (Office2)
Jan 17 04:15:53 office2 smbd[4771]: [2009/01/17 04:15:53, 0]
  lib/util_sock.c:read_data(534)
Jan 17 04:15:53 office2 smbd[4771]:   read_data: read failure for 4 bytes
  to client 10.41.65.251. Error = No route to host
Jan 17 04:47:01 office2 ntpd[3543]: synchronized to 69.26.112.120, stratum
  3
Jan 17 04:48:04 office2 ntpd[3543]: synchronized to LOCAL(0), stratum 10
Jan 17 10:48:08 office2 xinetd[3522]: START: vmware-authd pid=17407
  from=10.41.65.254
Jan 17 10:48:08 office2 xinetd[3522]: EXIT: vmware-authd status=0 pid=17407
  duration=0(sec)
Jan 17 10:48:44 office2 kernel: device eth1 entered promiscuous mode
Jan 17 10:48:44 office2 kernel: audit(1232207324.327:7): dev=eth1 prom=256
  old_prom=0 auid=4294967295 ses=4294967295
Jan 17 10:48:44 office2 kernel: bridge-eth1: enabled promiscuous mode
Jan 17 10:48:44 office2 kernel: device eth0 entered promiscuous mode
Jan 17 10:48:44 office2 kernel: audit(1232207324.327:8): dev=eth0 prom=256
  old_prom=0 auid=4294967295 ses=4294967295
Jan 17 10:48:44 office2 kernel: bridge-eth0: enabled promiscuous mode
Jan 17 10:48:44 office2 xinetd[3522]: START: vmware-authd pid=17425
  from=10.41.65.254

Here is the /var/log/vmware/vmware-serverd-0.log entries from that time 
period, starting back when the Win2K VM was started and running through my 
remote connection this morning (when I found all VMs powered off):

Jan 16 18:37:07: app| Adding to list of running vms:
  /vmware/Win2K-T1/Win2K-T1.vmx
Jan 16 18:37:07: app| Attempting to launch vmx :
  /vmware/Win2K-T1/Win2K-T1.vmx
Jan 16 18:37:07: app| New connection on socket server-vmxvmdb from host
  localhost (ip address: local) , user: tmiller
Jan 16 18:37:07: app| Connection from : /vmware/Win2K-T1/Win2K-T1.vmx
Jan 16 18:37:07: app| Setting up autoDetect info.
Jan 16 18:37:07: app| VMServerdConnect: connecting to
  /vmware/Win2K-T1/Win2K-T1.vmx
Jan 16 18:37:09: app| Connected to /vmware/Win2K-T1/Win2K-T1.vmx
Jan 16 18:37:09: app| Trimming event log for /vmware/Win2K-T1/Win2K-T1.vmx
  by 3 lines...
Jan 16 18:37:09: app| SP: Retrieved username: tmiller
Jan 16 18:37:09: app| VM suddenly changed state: poweredOn.
Jan 16 18:37:09: app| SP: Retrieved username: tmiller
Jan 16 18:37:09: app| SP: Retrieved username: tmiller
Jan 16 18:37:10: app| cleanup: cleaned up 2 objects
Jan 16 18:37:41: app| VmsdRegister: Config file has changed:
  /vmware/Win2K-T1/Win2K-T1.vmx
Jan 17 03:56:46: app| vmdbPipe_Streams Couldn't read: OVL_STATUS_EOF
Jan 17 03:56:46: app| VMHS: Connection to VM broken: cfg:
  /vmware/IPCop-01/IPCop-01.vmx; error: Pipe: Read failed; state: 3
Jan 17 03:56:46: app| VMServerd IPC closed the connection with thread
  /vmware/IPCop-01/IPCop-01.vmx (0x9522af0)
Jan 17 03:56:46: app| Lost connection to /vmware/IPCop-01/IPCop-01.vmx
  (/vmware/IPCop-01/IPCop-01.vmx) unexpectedly.
Jan 17 03:56:46: app| vmdbPipe_Streams Couldn't read: OVL_STATUS_EOF
Jan 17 03:56:46: app| VMHS: Connection to VM broken: cfg:
  /vmware/Win2K-T1/Win2K-T1.vmx; error: Pipe: Read failed; state: 3
Jan 17 03:56:47: app| VMServerd IPC closed the connection with thread
  /vmware/Win2K-T1/Win2K-T1.vmx (0x9522b6c)
Jan 17 03:56:47: app| Lost connection to /vmware/Win2K-T1/Win2K-T1.vmx
  (/vmware/Win2K-T1/Win2K-T1.vmx) unexpectedly.
Jan 17 03:56:47: app| OvlHostStartIo: errno 104
Jan 17 03:56:47: app| vmdbPipe_Streams: Couldn't read
Jan 17 03:56:47: app| VM suddenly changed state: poweredOff.
Jan 17 03:56:47: app| VM suddenly changed state: poweredOff.
Jan 17 03:56:47: app| VM suddenly changed state: poweredOff.
Jan 17 03:56:47: app| SP: Deleting user session: 0 username: tmiller
Jan 17 03:56:47: app| VM suddenly changed state: poweredOff.
Jan 17 03:56:47: app| VM suddenly changed state: poweredOff.
Jan 17 03:56:47: app| VM suddenly changed state: poweredOff.
Jan 17 10:48:18: app| New connection on socket server-vmdb from host
  10.41.65.254 (ip address: 10.41.65.254) , user: tmiller

I thing the "event" happened at 03:56:46, when:
the the VMWare server lost the connection to both VMs within 2 seconds
device eth1 left promiscuous mode

I do note that it showed the connection to the IPCop machine broken first, 
so that may be where the problem started.

If it helps any, the IPCop VM uses both eth0 (LAN) and eth1 (DSL modem). 
Win2K only uses eth0.

Anyone have any insight as to what might have happened?

Any recommendations besides that I should probably upgrade the server to 
the latest version?

Ted Miller
Indiana