[CentOS] serious problem with torque

Wed May 27 18:01:57 UTC 2015
m.roth at 5-cent.us <m.roth at 5-cent.us>

Valeri Galtsev wrote:
>
> On Wed, May 27, 2015 10:55 am, Zachary Giles wrote:
>> Mark, You might really want to compile torque from source (into an RPM
>> if you'd like) and redistribute that. Every version is a little wonky
>> and those of us that use(d) it often will poke around until we find a
>> version / patch-set that makes us happy and stick with that for a bit.
>> It's not an exact science and newer / higher versions are not always
>> better.
>
> My experience exactly. We used version 2 for quite a while. Never managed
> to upgrade to version 3 (tried a few times, but didn't invest much
> effort). Then we went directly to version 4. Starting trqauthd was the
> most notable difference. We never use rpms, we just compile torque on
> master and compute nodes. Compilation is always so straightforward, and
> never failed, so we didn't bother to package it...

No. Not going to compile unless there's *no* other way. We've got...five?
six? clusters or systems using torque. We've also got over 170
workstations and servers, most of them getting up there in age, and
there's me, the other admin, and our manager, who's at another Institute
most of the time.
Frequently, I feel like a one-armed paperhanger. We almost *never* do
something like that; instead we build our own rpm. And doing that can
range from a package whose authors knew what they were doing, which took
a couple of hours, to the horror of bioperl, which, on and off, took
something over a month. *shudder* I haven't had to update that, happily.

What disturbs me most is going up *two* releases, and all *within* one
version of the o/s. Upstream released an update to the umich package that
jumped a full release or two, and several of our senior researchers were
dead in the water, till we figured out what happened.

That ain't my idea of "enterprise".
>>
>> As for the downgrade comment: Perhaps you can't, but, Torque, when
>> it's down, doesn't really hold any state besides the configuration
>> (queues and such), so you should be able to extract that, completely
>> uninstall torque, and reinstall whatever version you want. If 2.x
>> works for you, grab the latest from source, build it, reinstall and
>> throw the config back in.

My manager managed to downgrade - we've got a local mirror, and there's
backups.
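
For anyone without a mirror to fall back on, the "extract the config and
throw it back in" step Zach describes might be sketched roughly like this.
It assumes qmgr is on the PATH of the pbs_server host and that the dump from
"print server" is in a form qmgr will accept again on stdin; the function
names and the file name are made up for illustration, and nobody in this
thread has tested it across a major version jump.

```python
import subprocess
from pathlib import Path


def dump_command(qmgr="qmgr"):
    """Command that prints the server and queue settings as qmgr directives."""
    return [qmgr, "-c", "print server"]


def save_config(outfile, qmgr="qmgr"):
    """Run the dump on the pbs_server host and write its output to outfile."""
    result = subprocess.run(dump_command(qmgr), check=True,
                            capture_output=True, text=True)
    Path(outfile).write_text(result.stdout)
    return result.stdout


def restore_config(infile, qmgr="qmgr"):
    """Feed a saved dump back to qmgr on stdin (the new pbs_server must be up)."""
    with open(infile) as fh:
        subprocess.run([qmgr], stdin=fh, check=True)
```

The idea would be to call save_config("torque-config.qmgr") before
removing the packages, and restore_config("torque-config.qmgr") after the
version you want is installed and pbs_server is running. As far as I know
the node list lives in a separate file under server_priv, so that would need
to be copied by hand.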
>>
>> Hope this helps a little.
>> -Zach
>> (I don't read often, so I might go AWOL)

Thanks.

      mark