[CentOS] serious problem with torque

Wed May 27 14:46:55 UTC 2015
m.roth at 5-cent.us <m.roth at 5-cent.us>

Johnny Hughes wrote:
> On 05/27/2015 09:07 AM, m.roth at 5-cent.us wrote:
>> Hi, folks,
>>
>>    The other admin updated torque without testing it on one machine, and
>> we had Issues. The first I knew was when a user reported qstat returning
>> socket_connect_unix failed: 15137
>> socket_connect_unix failed: 15137
>> socket_connect_unix failed: 15137
>> qstat: cannot connect to server (null) (errno=15137) could not connect to trqauthd
>>
>> Attempting to restart the pbs_server did the same. Working with my
>> manager, we found:
>>   a) torque had been updated from 2.x to 4.2.10, which is huge.
>>   b) Apparently, it no longer uses munged. Instead, it uses trqauthd, and
>>         that wasn't in the updated packages.
>>   c) We could not downgrade!!!
>>   d) My manager updated from testing and installed; then, after running
>>         trqauthd and restarting pbs_server, it appears to be working again.
>>
>> Should I be filing a bug report?
>
> You do not mention which version of CentOS you are using, but for
> CentOS-7 ..

Sorry, it's 6.6.
>
> The only torque I see is in epel-testing (which is their unstable
> branch) .. I would think that is the list for this discussion.  Or did
> it come from somewhere else?
>
> Not that I mind it being discussed here too .. but you might get better
> results there.

Thanks, Johnny. I *just* posted an apology, having realized it was an EPEL
issue.... Talk about an "upgrade disaster"! I think the other admin (he's
been here less than a year) is coming to understand why I'm paranoid about
some updates, and why we roll out some things stepwise, testing them
first.... I see he updated firefox & t-bird; I'm guessing that the most
current version fixes the updates that broke language, etc, a week or two ago.
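
For the stepwise testing, a rough sketch of the kind of check one could run
on a single test node after an update, before rolling it out further. It only
looks for the error text quoted above in whatever qstat printed; the function
and pattern names are made up for illustration, and how you capture qstat's
output is left to you.

```python
import re

# Pattern taken from the error message quoted in this thread.
# The name TRQAUTHD_ERROR is hypothetical, chosen for this sketch.
TRQAUTHD_ERROR = re.compile(r"could not connect to trqauthd", re.IGNORECASE)


def trqauthd_unreachable(qstat_output: str) -> bool:
    """Return True if the text contains the trqauthd connection failure."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    flattened = " ".join(qstat_output.split())
    return bool(TRQAUTHD_ERROR.search(flattened))
```

If it returns True on the test node, hold the update back from the rest.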

       mark