I have a CentOS 5.4 machine that has, for the past two weeks, apparently been shut off over the weekend. It's just sitting there turned off on Monday morning and when someone hits the power switch it comes right back on and everything works again.
This happened last weekend, and again over this past weekend.
Here is /var/log/messages from shortly before it apparently shut down this weekend. Can anyone tell me what is or might be going on?
Apr 10 11:01:27 answeringmachine vgetty[6012]: message keep, length=00:00:20, name='', caller=none, dev=ttyACM0, pid=6012
Apr 10 11:19:10 answeringmachine vgetty[6095]: message keep, length=00:00:24, name='', caller=none, dev=ttyACM0, pid=6095
Apr 10 11:43:56 answeringmachine vgetty[6134]: message keep, length=00:00:11, name='', caller=none, dev=ttyACM0, pid=6134
Apr 10 12:22:34 answeringmachine mgetty[6029]: fax dev=ttyS1, pid=6029, caller='none', name='', id='', +FHNG=074, pages=0/0, time=00:00:51
Apr 10 12:43:49 answeringmachine gconfd (freeads-3524): Exiting
Apr 10 12:43:49 answeringmachine gdm[2872]: Master halting...
Apr 10 12:43:49 answeringmachine pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not Found
Apr 10 12:43:50 answeringmachine shutdown[2872]: shutting down for system halt
Apr 10 12:43:50 answeringmachine mgetty[6262]: failed dev=ttyS1, pid=6262, got signal 15, exiting
Apr 10 12:43:50 answeringmachine mgetty[2874]: failed dev=ttyS2, pid=2874, got signal 15, exiting
Apr 10 12:43:50 answeringmachine vgetty[2877]: failed dev=ttyACM1, pid=2877, got signal 15, exiting
Apr 10 12:43:50 answeringmachine vgetty[6183]: failed dev=ttyACM0, pid=6183, got signal 15, exiting
Apr 10 12:43:51 answeringmachine smartd[2854]: smartd received signal 15: Terminated
Apr 10 12:43:51 answeringmachine smartd[2854]: smartd is exiting (exit status 0)
Apr 10 12:43:52 answeringmachine avahi-daemon[2769]: Got SIGTERM, quitting.
Apr 10 12:43:52 answeringmachine avahi-daemon[2769]: Leaving mDNS multicast group on interface eth0.IPv6 with address fe80::21c:c0ff:fee3:f1b1.
Apr 10 12:43:52 answeringmachine avahi-daemon[2769]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.0.6.
Apr 10 12:43:52 answeringmachine mountd[2588]: Caught signal 15, un-registering and exiting.
Apr 10 12:43:52 answeringmachine kernel: nfsd: last server has exited
Apr 10 12:43:52 answeringmachine kernel: nfsd: unexporting all filesystems
Apr 10 12:43:57 answeringmachine ntpd[2532]: ntpd exiting on signal 15
Apr 10 12:43:57 answeringmachine nm-system-settings: disconnected from the system bus, exiting.
Apr 10 12:43:57 answeringmachine kernel: nm-system-setti[3083]: segfault at 0000000000000000 rip 0000000000000000 rsp 00007fffb3916778 error 14
Apr 10 12:43:57 answeringmachine rpc.statd[2349]: Caught signal 15, un-registering and exiting.
Apr 10 12:43:57 answeringmachine auditd[2264]: The audit daemon is exiting.
Apr 10 12:43:57 answeringmachine kernel: audit(1270925037.732:154): audit_pid=0 old=2264 by auid=4294967295
Apr 10 12:43:57 answeringmachine pcscd: pcscdaemon.c:572:signal_trap() Preparing for suicide
Apr 10 12:43:58 answeringmachine pcscd: hotplug_libusb.c:376:HPRescanUsbBus() Hotplug stopped
Apr 10 12:43:58 answeringmachine pcscd: readerfactory.c:1379:RFCleanupReaders() entering cleaning function
Apr 10 12:43:58 answeringmachine pcscd: pcscdaemon.c:532:at_exit() cleaning /var/run
Apr 10 12:43:59 answeringmachine kernel: Kernel logging (proc) stopped.
Apr 10 12:43:59 answeringmachine kernel: Kernel log daemon terminating.
Apr 10 12:44:00 answeringmachine exiting on signal 15
On Mon, Apr 12, 2010 at 1:08 PM, Frank Cox theatre@sasktel.net wrote:
I have a Centos 5.4 machine that has, for the past two weeks, apparently been shut off over the weekend. It's just sitting there turned off on Monday morning and when someone hits the power switch it comes right back on and everything works again.
This happened last weekend, and again over this past weekend.
Here is /var/log/messages from shortly before it apparently shut down this weekend. Can anyone tell me what is or might be going on?
Apr 10 12:43:59 answeringmachine kernel: Kernel logging (proc) stopped.
Apr 10 12:43:59 answeringmachine kernel: Kernel log daemon terminating.
Apr 10 12:44:00 answeringmachine exiting on signal 15
I've seen this on some HP Blades that I administer. I ended up disabling the pcscd daemon. It might have been related to a dynamically attached optical device that was moved from blade to blade, but I didn't spend much time troubleshooting. Since disabling pcscd a few months ago it has been solid.
On Mon, 2010-04-12 at 13:11 -0400, Kwan Lowe wrote:
Since disabling pcscd a few months ago it has been solid.
Interesting. Since there is no particular reason why pcscd needs to be running on this machine, I just disabled it.
I'll see what happens now.
Frank wrote:
I have a Centos 5.4 machine that has, for the past two weeks, apparently been shut off over the weekend. It's just sitting there turned off on Monday morning and when someone hits the power switch it comes right back on and everything works again.
This happened last weekend, and again over this past weekend.
Here is /var/log/messages from shortly before it apparently shut down this weekend. Can anyone tell me what is or might be going on?
Apr 10 11:01:27 answeringmachine vgetty[6012]: message keep, length=00:00:20, name='', caller=none, dev=ttyACM0, pid=6012
Apr 10 11:19:10 answeringmachine vgetty[6095]: message keep, length=00:00:24, name='', caller=none, dev=ttyACM0, pid=6095
Apr 10 11:43:56 answeringmachine vgetty[6134]: message keep, length=00:00:11, name='', caller=none, dev=ttyACM0, pid=6134
Apr 10 12:22:34 answeringmachine mgetty[6029]: fax dev=ttyS1, pid=6029, caller='none', name='', id='', +FHNG=074, pages=0/0, time=00:00:51
Apr 10 12:43:49 answeringmachine gconfd (freeads-3524): Exiting
This - I wonder if someone's somehow getting in. Anything in /var/log/secure for 12:43?
Apr 10 12:43:49 answeringmachine gdm[2872]: Master halting...
Apr 10 12:43:49 answeringmachine pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not Found
Apr 10 12:43:50 answeringmachine shutdown[2872]: shutting down for system halt
Here's a thought: does it have bluetooth enabled? Is it in range to talk to someone *else's* bluetooth keyboard? Is anyone near enough to shut down their machine, and/or accidentally yours? <snip>
status 0) Apr 10 12:43:52 answeringmachine avahi-daemon[2769]: Got SIGTERM, quitting.
Is this machine hardwired? If so, you do NOT need the avahi-daemon, on by default, which is intended for a clueless home user to set up a network. Turn it *off*, and yank the firewall rule that allows it. <snip>
mark
On Mon, 2010-04-12 at 13:19 -0400, m.roth@5-cent.us wrote:
This - I wonder if someone's somehow getting in. Anything in /var/log/secure for 12:43?
Apr 9 18:13:46 answeringmachine gdm[2965]: pam_unix(gdm:session): session opened for user freeads by (uid=0)
Apr 10 12:43:49 answeringmachine gdm[2965]: pam_unix(gdm:session): session closed for user freeads
Apr 10 12:43:53 answeringmachine sshd[2516]: Received signal 15; terminating.
Apr 10 12:43:53 answeringmachine runuser: pam_unix(runuser:session): session opened for user frankcox by (uid=0)
Apr 10 12:43:53 answeringmachine runuser: pam_unix(runuser:session): session closed for user frankcox
Apr 12 09:53:49 answeringmachine sshd[2518]: Server listening on :: port 22.
Here's a thought: does it have bluetooth enabled? Is it in range to talk to someone *else's* bluetooth keyboard? Is anyone near enough to shut down their machine, and/or accidentally yours?
There is no bluetooth or wireless anything on this, other than a wireless mouse. Everything else is hard-wired.
Is this machine hardwired? If so, you do NOT need the avahi-daemon, on by default, which is intended for a clueless home user to set up a network. Turn it *off*, and yank the firewall rule that allows it.
Interesting. I never realized (or thought about) that. I shall do a bunch of turning-off of avahi-daemon on these machines.
Frank wrote:
On Mon, 2010-04-12 at 13:19 -0400, m.roth@5-cent.us wrote:
This - I wonder if someone's somehow getting in. Anything in /var/log/secure for 12:43?
Apr 9 18:13:46 answeringmachine gdm[2965]: pam_unix(gdm:session): session opened for user freeads by (uid=0)
Apr 10 12:43:49 answeringmachine gdm[2965]: pam_unix(gdm:session): session closed for user freeads
Apr 10 12:43:53 answeringmachine sshd[2516]: Received signal 15; terminating.
Apr 10 12:43:53 answeringmachine runuser: pam_unix(runuser:session): session opened for user frankcox by (uid=0)
Apr 10 12:43:53 answeringmachine runuser: pam_unix(runuser:session): session closed for user frankcox
This, I don't understand. It looks to me like either freeads or frankcox shut the system down. Could your account have been compromised?
Apr 12 09:53:49 answeringmachine sshd[2518]: Server listening on :: port 22.
Here's a thought: does it have bluetooth enabled? Is it in range to talk to someone *else's* bluetooth keyboard? Is anyone near enough to shut down their machine, and/or accidentally yours?
There is no bluetooth or wireless anything on this, other than a wireless mouse. Everything else is hard-wired.
Is this machine hardwired? If so, you do NOT need the avahi-daemon, on by default, which is intended for a clueless home user to set up a network. Turn it *off*, and yank the firewall rule that allows it.
Interesting. I never realized (or thought about) that. I shall do a bunch of turning-off of avahi-daemon on these machines.
Yes. Turn them *all* off. Actually, in a server room, you could be having other, obscure problems, as wireless tries to connect.
mark
Is the machine connected to a UPS?
Maybe the UPS either sends a shutdown signal, or maybe it just cuts the power feed periodically because of an old battery or an overload (ours did once), and the machine cannot restart itself after that.
- Jussi Hirvi
On Mon, 2010-04-12 at 20:56 +0300, Jussi Hirvi wrote:
Is the machine connected to a UPS?
Maybe the UPS either sends a shutdown signal, or maybe it just cuts the power feed periodically because of an old battery or an overload (ours did once), and the machine cannot restart itself after that.
That was, in fact, my first thought. I changed that UPS just a couple of weeks ago; it's brand new. I also don't have that UPS hooked up to the computer with the data cable, so if it did drop the power the machine shouldn't go through the shutdown sequence but would immediately drop dead.
On Mon, 2010-04-12 at 13:43 -0400, m.roth@5-cent.us wrote:
This, I don't understand. It looks to me like either freeads or frankcox shut the system down. Could your account have been compromised?
I'm starting to wonder about freeads, actually. That account is generally left logged-in from the main office and I wonder who might have been in the office on Saturday. I'm going to do some investigation.
frankcox is my vnc session on that machine.
That machine doesn't talk to the outside world directly; you have to go through another one first. And the "other one" is acting fine.
On Mon, 2010-04-12 at 13:43 -0400, m.roth@5-cent.us wrote:
This, I don't understand. It looks to me like either freeads or frankcox shut the system down. Could your account have been compromised?
I'm starting to wonder about freeads, actually. That account is generally left logged-in from the main office and I wonder who might have been in the office on Saturday. I'm going to do some investigation.
Does the machine that that account is logged on from fire up a screen-locking screen saver after some reasonable amount of inactivity (like 20 min or so?)
frankcox is my vnc session on that machine.
Guessed that.
That machine doesn't talk to the outside world directly; you have to go through another one first. And the "other one" is acting fine.
So something is happening on this machine.
mark
On Mon, 2010-04-12 at 14:46 -0400, m.roth@5-cent.us wrote:
Does the machine that that account is logged on from fire up a screen-locking screen saver after some reasonable amount of inactivity (like 20 min or so?)
No, it just sits there doing nothing forever until someone sits down in front of it.
So something is happening on this machine.
Apparently. I have just been looking at the log from the previous weekend when it shut down before.
Here is what /var/log/messages has to say from April 2. Once again the pcscd line is right at the top of the log, immediately after the shutdown signal. I don't know if that's suggestive, or just a coincidence.
Apr 2 09:12:43 answeringmachine vgetty[22949]: message keep, length=00:01:02, name='', caller=none, dev=ttyACM0, pid=22949
Apr 2 09:23:20 answeringmachine vgetty[22963]: message keep, length=00:00:25, name='', caller=none, dev=ttyACM0, pid=22963
Apr 2 09:43:56 answeringmachine gdm[2880]: Master halting...
Apr 2 09:43:56 answeringmachine shutdown[2880]: shutting down for system halt
Apr 2 09:43:57 answeringmachine pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not Found
Apr 2 09:43:57 answeringmachine mgetty[2884]: failed dev=ttyS2, pid=2884, got signal 15, exiting
and so on through the rest of the shutdown.
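One way to weigh the "suggestive or coincidence" question is to compare the order of the first few halt-related entries from the two weekends. Below is a minimal Python sketch that uses only lines quoted in this thread; the names parse and halt_summary are made up for illustration. In both excerpts the gdm "Master halting..." entry comes first and the shutdown entry carries the same pid as gdm, while the pcscd line falls before the shutdown entry on April 10 and after it on April 2. That looks more consistent with the pcscd message being a side effect than a trigger, though it proves nothing by itself.

```python
import re

# Classic syslog layout: "Mon DD HH:MM:SS host tag[pid]: message".
# The pid part is optional (pcscd logs without one).
ENTRY = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+"
    r"(?P<tag>[^:\[]+?)(?:\[(?P<pid>\d+)\])?:\s*(?P<msg>.*)$"
)

def parse(line):
    """Split one syslog line into its fields, or return None if it does not fit."""
    match = ENTRY.match(line)
    return match.groupdict() if match else None

def halt_summary(lines):
    """Summarise the order of gdm, pcscd and shutdown entries in an excerpt."""
    entries = [e for e in map(parse, lines) if e]
    tags = [e["tag"] for e in entries]
    gdm = next((e for e in entries if e["tag"] == "gdm"), None)
    shutdown = next((e for e in entries if e["tag"] == "shutdown"), None)
    return {
        "first": tags[0] if tags else None,
        "pcscd_before_shutdown": (
            "pcscd" in tags and "shutdown" in tags
            and tags.index("pcscd") < tags.index("shutdown")
        ),
        "same_pid": bool(gdm and shutdown and gdm["pid"] == shutdown["pid"]),
    }

# Lines copied from the two /var/log/messages excerpts in this thread.
APR_10 = [
    "Apr 10 12:43:49 answeringmachine gdm[2872]: Master halting...",
    "Apr 10 12:43:49 answeringmachine pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not Found",
    "Apr 10 12:43:50 answeringmachine shutdown[2872]: shutting down for system halt",
]
APR_2 = [
    "Apr 2 09:43:56 answeringmachine gdm[2880]: Master halting...",
    "Apr 2 09:43:56 answeringmachine shutdown[2880]: shutting down for system halt",
    "Apr 2 09:43:57 answeringmachine pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not Found",
]

for label, excerpt in (("Apr 10", APR_10), ("Apr 2", APR_2)):
    print(label, halt_summary(excerpt))
```

This assumes Python 3, which a stock CentOS 5 box does not ship, so it is meant to be run on another host against copied log lines.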
On Mon, 2010-04-12 at 14:46 -0400, m.roth@5-cent.us wrote:
Does the machine that that account is logged on from fire up a screen-locking screen saver after some reasonable amount of inactivity (like 20 min or so?)
No, it just sits there doing nothing forever until someone sits down in front of it.
That shouldn't be left that way. Those who use it should have their own accounts.
Hell, I know I'm paranoid, but as a friend (also a sysadmin) likes to say, a sysadmin is paid to be professionally paranoid. I log off my own system, at home, every night, and every morning before I head off to work.
So something is happening on this machine.
Apparently. I have just been looking at the log from the previous weekend when it shut down before.
Here is what /var/log/messages has to say from April 2. Once again the pcscd line is right at the top of the log, immediately after the shutdown signal. I don't know if that's suggestive, or just a coincidence.
I think the latter, unless you've got SmartCards that you use for security. (Yes, some here have them; they haven't given me one yet, but it's coming - oh, yes, I'm a contractor with the US federal gov't.)
Apr 2 09:12:43 answeringmachine vgetty[22949]: message keep, length=00:01:02, name='', caller=none, dev=ttyACM0, pid=22949
Apr 2 09:23:20 answeringmachine vgetty[22963]: message keep, length=00:00:25, name='', caller=none, dev=ttyACM0, pid=22963
Apr 2 09:43:56 answeringmachine gdm[2880]: Master halting...
Apr 2 09:43:56 answeringmachine shutdown[2880]: shutting down for system halt
Something's happening at 09:43:56. Don't suppose there are any other logs with that timestamp? <snip>
mark
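For checking whether any other log has entries around that timestamp, something like the following could be used. It is only a sketch: find_entries is a hypothetical helper, it assumes Python 3 (not what CentOS 5 ships, so it would be run against a copy of the log directory on another host, or adapted), and it assumes each file uses the classic syslog "Mon DD HH:MM:SS" prefix. Rotated or binary files simply produce no matches.

```python
import re
from pathlib import Path

def find_entries(log_dir, month, day, time_prefix):
    """Yield (file name, line) for lines stamped with the given date and time prefix.

    The day is matched with flexible whitespace because syslog pads
    single-digit days with an extra space ("Apr  2").
    """
    stamp = re.compile(
        r"^%s\s+%d\s+%s" % (re.escape(month), int(day), re.escape(time_prefix))
    )
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            with path.open(errors="replace") as handle:
                for line in handle:
                    if stamp.match(line):
                        yield path.name, line.rstrip("\n")
        except OSError:
            # Unreadable file (for example, permissions); skip it.
            continue

# Example call, with the directory being wherever the logs were copied to:
#   for name, line in find_entries("/var/log", "Apr", 2, "09:43"):
#       print(name + ": " + line)
```

Passing a shorter time prefix such as "09:4" widens the window to the surrounding minutes.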