On Sun, Apr 23, 2017 at 11:22:46AM -0700, Jim Perrin wrote:
So I believe one of the goals that Ondrej and others have in bringing this SIG up is to work on providing a common HPC baseline, initially by delivering things similar to (and expanding on) OpenHPC. Since CentOS is an rpm-based distribution, the 'package management' way to easily install the required software is via yum/rpm. As was mentioned by the other reply, multiple versions are often needed/required but this could in theory be handled with SCLs.
HPC covers a lot of ground, imho (cf. the beowulf mailing list; from research-group size to national/multi-country setups):

- compute part: building the software (tuned for your cpu/gpu, i.e. openblas/atlas VS generic), SCL, Lmod/modules
- easybuild/spark/nix/...
- hardware (IB, dedicated hw such as FPGA, ...), ARM VS x86_64, ...
- management (puppet/ansible/salt/...)
- scaling on 10s of nodes VS 1000 VS more... (network/rack/datacenter management at scale)
- user management (from plain /etc/{passwd|shadow} to FreeIPA, or Active Directory...)
- shared storage: NFSv3/v4, pNFS, proprietary (cf. panasas, gpfs, ...)
- and managing 100 TB or 100 PB is not the same (cf. robinhood.sf.net)
- distributed storage (client/server), tuned for different workloads and requirements (quotas, ACLs, streaming VS IOPS, locking?, cheap?, expandability): lustre, beegfs, rozofs, moosefs, ..., ceph, glusterfs
- archiving/long term storage (irods?)
- batch queuing: slurm and friends
- containers (docker, singularity, ...)
- web interfaces for non IT fluent users
- remote visualisation (to avoid moving TB of data)
- UEFI vs plain PXE/legacy booting
- cloud expansion, or cloud based for embarrassingly parallel workloads?
- hadoop?
- what framework? warewulf as in openhpc, xcat, ks (foreman or DIY), ...
Cheers
Tru