[CentOS] Commands failing silently?

Tue Mar 25 18:21:54 UTC 2008
Dan Bongert <dbongert at wisc.edu>

William L. Maltby wrote:
> On Mon, 2008-03-24 at 16:19 -0500, Dan Bongert wrote:
>> mouss wrote:
>>> Dan Bongert wrote:
>>>> Hello all:
>>>>
>>>> <snip>
> 
> 
>> Though 'ls' was just an example -- just about any program will fail. The 'w'
>> command will fail too:
>>
>> thoth(118) /tmp> w
>>    16:06:51 up  5:34,  1 user,  load average: 0.94, 1.46, 2.04
>> USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
>> dbongert pts/0    copland.ssc.wisc 14:16    0.00s  0.22s  0.05s w
>>
>> thoth(119) /tmp> w
>>    16:06:52 up  5:34,  1 user,  load average: 0.94, 1.46, 2.04
>> USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
>> dbongert pts/0    copland.ssc.wisc 14:16    0.00s  0.22s  0.05s w
>>
>> thoth(120) /tmp> w
>>
>> thoth(121) /tmp> w
>>
> 
> Hmmm... Sure it's failing? Maybe just the output is going somewhere
> else? After the command runs, what does "echo $?" show? Does it even
> work? Echo is a bash internal command, so I would expect it to never
> fail.

Ok, it's definitely getting an error from somewhere:

thoth(3) /tmp> ls

thoth(4) /tmp> echo $?
141

Although:

thoth(31) ~> top


thoth(32) ~> echo $?
0
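
For reference, an exit status above 128 conventionally means the process was
terminated by a signal, and the signal number is the status minus 128. On
Linux, 141 - 128 = 13, which is SIGPIPE, so the 'ls' above may have been killed
while writing its output. A minimal sketch for decoding such a status follows.
decode_status is a hypothetical helper name, and the sketch assumes a
Bourne-style shell (such as bash) whose 'kill -l' accepts a signal number.

```shell
# Decode a shell exit status into a signal name, if it encodes one.
# decode_status is a hypothetical helper, not part of any package.
decode_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # Statuses above 128 conventionally mean "killed by signal (status - 128)".
        echo "signal $(kill -l $((status - 128)))"
    else
        echo "exit code $status"
    fi
}

decode_status 141   # prints: signal PIPE
decode_status 0     # prints: exit code 0
```

This only names the signal. It does not explain why the write end saw a
closed pipe or socket in the first place.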

> What is your output device? A serial terminal? If so, could be simple
> flow control issues. In fact, any serial connection (even a PC emulating
> a terminal) could suffer from flow control problems. And they would tend
> to be erratic in nature.

I'm usually sshing into the machine, but I've also experienced the problem
on the console.

> If you are on a normal console, try running the commands similar to
> this (trying to determine if *something* else is receiving output or
> not)
> 
>     <your command> &> /dev/tty
> 
> if this works reliably, maybe that's a starting point.

Nope, that fails intermittently as well.
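
Since the failure is intermittent, one way to put a number on it is to run the
same command repeatedly and count the runs that exit non-zero or produce no
output. The following is only a sketch: count_failures is a hypothetical
helper, it assumes a Bourne-style shell, and it captures output through
command substitution, so the command writes to a pipe rather than to the
terminal, which may change how the problem shows up.

```shell
# Run a command N times and count runs that fail or print nothing.
# count_failures is a hypothetical helper name used for illustration.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        out=$("$@" 2>&1)   # capture stdout and stderr of this run
        rc=$?              # exit status of the command itself
        if [ "$rc" -ne 0 ] || [ -z "$out" ]; then
            failures=$((failures + 1))
            # Per-run details go to stderr so stdout carries only the count.
            echo "run $i: status=$rc, output length=${#out}" >&2
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example: 20 runs of a directory listing; prints the number of bad runs.
count_failures 20 ls /
```

A count that varies between invocations on an otherwise idle machine would at
least confirm that the behaviour is not tied to one particular command line.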

> There are a couple of kernel guys who frequent this list. Maybe one of them
> will have a clue as to what could go wrong. Corrupted libraries and
> whatnot.
> 
> You might try the rpm -V command from earlier against all packages (add an
> "a", IIRC). Maybe some library accessed by the coreutils, but which is
> not itself part of coreutils, is corrupt.

Hmm... when I run 'rpm -Va', I get lots of "at least one of file's
dependencies has changed since prelinking" errors. They persist even if I run
prelink manually and then run 'rpm -Va' immediately afterwards.
-- 
Dan Bongert                     dbongert at wisc.edu