Hi,
I would like to propose the start of a High Performance Computing (HPC) SIG. I see it is already mentioned among the Future SIGs on https://wiki.centos.org/SpecialInterestGroup. The primary reason for the SIG's existence will be to improve the state of HPC-related packages on CentOS and similar distributions, with a special focus on the stability of builds, on improvements related to CentOS (and similar distributions) for the OpenHPC project, and on getting new HPC packages packaged for CentOS and/or Fedora.
Initial members would be me (ovasik@redhat.com, CentOS FAS account: Reset), Adrian Reber (areber@redhat.com, CentOS FAS account: areber), Stanislav Kozina (skozina@redhat.com, CentOS FAS account: ersin) and Jan Chaloupka (jchaloup@fedoraproject.org, CentOS FAS account: jchaloup). Of course, anyone is welcome to join.
Thanks in advance for approving/sponsoring the SIG.
Regards, Ondrej Vasik
On Sun, Apr 23, 2017 at 2:09 PM, Ondřej Vašík ovasik@redhat.com wrote:
Hi,
I would like to propose the start of a High Performance Computing (HPC) SIG. I see it is already mentioned among the Future SIGs on https://wiki.centos.org/SpecialInterestGroup. The primary reason for the SIG's existence will be to improve the state of HPC-related packages on CentOS and similar distributions, with a special focus on the stability of builds, on improvements related to CentOS (and similar distributions) for the OpenHPC project, and on getting new HPC packages packaged for CentOS and/or Fedora.
It is good to see an initiative to get the HPC-specific tools packaged, but I have a comment. Under https://github.com/openhpc/ohpc/tree/obs/OpenHPC_1.3_Factory/components/io-l... I see spec files for software like netcdf or hdf5.
On a cluster one needs access to **many** versions of libraries (that includes compilers, Python, MPI, etc.), and packaging them as RPMs is not the correct model, unless the HPC system uses VM golden images or container images and allows the users to start them on demand. What is usually used is a setup based on Lmod/environment-modules, like https://github.com/hpcugent/easybuild-easyconfigs
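As a minimal sketch of the workflow such a module-based setup gives users (the module names and versions below are hypothetical, not taken from any specific site):

    module avail hdf5                        # list the hdf5 builds installed side by side
    module load gcc/7 openmpi/3.1 hdf5/1.10  # pick one compiler/MPI/library combination
    module list                              # show what is currently active
    module purge                             # drop back to the bare OS environment

Each version lives in its own prefix, and loading a module only adjusts PATH, LD_LIBRARY_PATH and friends for the current shell or job.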
I would therefore prefer that the OpenHPC project focus in the first place on the tools of which a single version is installed on the operating system.
Best regards,
Marcin
On a cluster one needs access to **many** versions of libraries (that includes compilers, Python, MPI, etc.) and
Well, on larger, more-shared, more-diverse clusters, that's true. But there are plenty of clusters that don't customize the stack much, if at all, even if they use locally-developed codes.
Really, the point is: the very nature of a distribution is that it reduces flexibility in favor of convenience. If "cloud is someone else's computer", then "distribution is someone else's build/test/packaging". I think it's still useful to have a baseline HPC cluster distro, even if many people, especially at larger sites, resort to modules to produce other combinations of middleware.
packaging them as RPMs is not the correct model, unless the HPC system uses VM golden images or container images, and allows the users to start them
Many, many clusters do use RPMs, whether that means NFSroot approaches or stateful node installs (often just kickstart, though there are many who use devops approaches like puppet).
One sticky aspect of the modules approach is illustrated by Nix: either you use it sparingly, or you go all the way and replace everything about the node install (all the way down to ld.so and glibc...)
regards, mark hahn (sharcnet/computecanada)
Marcin Dulak wrote on Sun 23. 04. 2017 at 15:13 +0200:
It is good to see an initiative to get the HPC-specific tools packaged, but I have a comment. Under https://github.com/openhpc/ohpc/tree/obs/OpenHPC_1.3_Factory/components/io-l... I see spec files for software like netcdf or hdf5.
On a cluster one needs access to **many** versions of libraries (that includes compilers, Python, MPI, etc.), and packaging them as RPMs is not the correct model, unless the HPC system uses VM golden images or container images and allows the users to start them on demand. What is usually used is a setup based on Lmod/environment-modules, like https://github.com/hpcugent/easybuild-easyconfigs
Yes, understood, thanks for the comment. With containers becoming more and more popular, I think packaging even these applications and libraries makes sense. Of course, optimizing the build for the specific system would be even better, but a package still gives you an easy way to install/update/remove an application with all its dependencies.
For many versions of libraries and compilers: sometimes it may make sense to create a matrix of RPMs, as is done in the OpenHPC initiative; sometimes software collections (SCLs) can probably be used to get multiple versions of a library/dependency installed on the system in parallel.
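As a minimal sketch of how the SCL model allows parallel versions on CentOS 7 (assuming the stock SCL repositories; devtoolset-7 is just one example collection):

    yum install centos-release-scl    # enable the Software Collections repositories
    yum install devtoolset-7          # newer gcc toolchain, installed alongside the base gcc
    scl enable devtoolset-7 bash      # start a shell with the collection's paths in front
    gcc --version                     # inside that shell, the devtoolset-7 gcc is used

The collection installs under /opt and is only activated on demand, so the base system compiler stays untouched.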
The goal is not to solve everything - that is of course out of scope - but to improve the current situation and maybe to start discussions like this one: how to proceed, what is missing, and what is expected to be missing (because it doesn't make sense to have it as a distribution package).
Regards, Ondrej
Dear Ondrej et al.,
Good to read such an email. The initiative is very much needed. HPC infrastructure and environments are very specific, and there's no need to have HPC SW in RPM packages. I do not want to go into too much detail here; it would be a long email. Anyway, I'm responsible for production/operations of the national HPC clusters here in Ostrava, so near to Brno. I'd like to invite you all to visit us, so we can explain to you the needs of HPC: how we run the services, what we need from the OS, and how we distribute the software to end users within the clusters.
Looking forward to hearing from you soon. DH
So I believe one of the goals that Ondrej and others have in bringing this SIG up is to work on providing a common HPC baseline, initially by delivering things similar to (and expanding on) OpenHPC. Since CentOS is an RPM-based distribution, the 'package management' way to easily install the required software is via yum/rpm. As was mentioned in the other reply, multiple versions are often needed/required, but this could in theory be handled with SCLs.
On Sun, Apr 23, 2017 at 11:22:46AM -0700, Jim Perrin wrote:
So I believe one of the goals that Ondrej and others have in bringing this SIG up is to work on providing a common HPC baseline, initially by delivering things similar to (and expanding on) OpenHPC. Since CentOS is an RPM-based distribution, the 'package management' way to easily install the required software is via yum/rpm. As was mentioned in the other reply, multiple versions are often needed/required, but this could in theory be handled with SCLs.
HPC covers a lot of ground, imho (cf. the Beowulf mailing list; from research-group size to national/multi-country setups):
- compute part: building the software (tuned for your CPU/GPU, i.e. openblas/atlas vs generic), SCL, Lmod/modules
- easybuild/spack/nix/...
- hardware (IB, dedicated HW such as FPGAs, ...), ARM vs x86_64, ...
- management (puppet/ansible/salt/...)
- scaling on 10s of nodes vs 1000 vs more... (network/rack/datacenter management at scale)
- user management (from plain /etc/{passwd|shadow} to FreeIPA, or Active Directory...)
- shared storage: NFSv3/v4, pNFS, proprietary (cf. panasas, gpfs, ...) - and managing 100 TB or 100 PB is not the same (cf. robinhood.sf.net)
- distributed storage (client/server): tuned for different workloads and requirements (quotas, ACLs, streaming vs IOPS, locking?, cheap?, expandability) - lustre, beegfs, rozofs, moosefs, ..., ceph, glusterfs
- archiving/long-term storage (irods?)
- batch queuing: slurm and friends (a minimal job-script sketch follows below)
- containers (docker, singularity, ...)
- web interfaces for non-IT-fluent users
- remote visualisation (to avoid moving TBs of data)
- UEFI vs plain PXE/legacy booting
- cloud expansion, or cloud-based for embarrassingly parallel workloads?
- hadoop?
- what framework? warewulf as in openhpc, xcat, kickstart (foreman or DIY), ...
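To make the batch-queuing item concrete, here is a minimal Slurm job-script sketch (the resource requests, module names and program name are only illustrative):

    #!/bin/bash
    #SBATCH --job-name=example
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=16
    #SBATCH --time=01:00:00
    module load gcc openmpi     # pick a toolchain via environment modules
    srun ./my_mpi_program       # launch the MPI ranks across the allocated nodes

Submitted with 'sbatch', this is the kind of user-facing interface a queuing system layers on top of whatever the OS and the SIG's packages provide.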
Cheers
Tru
Tru +1
DH
Hi to all,
We had a telco on 2017-04-26. Here are brief notes of the meeting.
- We will provide credentials for SIG members to have access to a real HPC system.
- We hope to meet in person soon, so we can show the infrastructure and discuss the needs and workflows in the HPC domain.
- SIG members have been provided with a project ID at IT4Innovations and an email describing how to apply for the credentials.
- We hope to have another telco; the date is not set yet.
Regards, DH
On Sun, Apr 23, 2017 at 02:09:55PM +0200, Ondřej Vašík wrote:
Initial members would be me (ovasik at redhat.com, CentOS FAS account: Reset), Adrian Reber (areber at redhat.com, CentOS FAS account: areber),
Almost, my CentOS FAS account is 'adrian'
Adrian