[CentOS] Thanks to every one

Tue Jul 18 15:13:32 UTC 2017

On Tue, 18 Jul 2017 09:01:07 -0400
Jonathan Billings <billings at negate.org> wrote:

> On Sun, Jul 16, 2017 at 06:02:15PM +0100, Pete Biggs wrote:
> > > 
> > > The physicists and mathematicians who count there need high
> > > durations.  
> > 
> > Yes. I too run HPC clusters and I have had uptimes of over 1000
> > days - clusters that are turned on when they are delivered and
> > turned off when they are obsolete. It is crucial for long running
> > calculations that you have a stable OS - you have never seen wrath
> > like a computational scientist whose 200 day calculation has just
> > failed because you needed to reboot the node it was running on.  
> 
> I too was a HPC admin, and I knew people who believed the above, and
> their clusters were compromised.  You're running a service where the
> weakest link are the researchers who use your cluster -- they're able
> to run code on your nodes, so local exploits are possible.  They often
> have poor security practices (share passwords, use them for multiple
> accounts).

I work at a fairly large HPC site and fully agree.

HPC resources arguably need smarter and more active security work
than your average server.

With 1000+ users who can compile and run jobs, and who sometimes
misplace their credentials, we typically move even faster than CentOS
updates to fix, partially patch, or mitigate security vulnerabilities.

/Peter