> On a cluster one needs access to **many** versions of libraries (that includes compilers, Python, MPI, etc.) and
Well, on larger, more-shared, more-diverse clusters, that's true. But there are plenty of clusters that don't customize the stack much, if at all, even if they use locally-developed codes.
Really, the point is: the very nature of a distribution is that it reduces flexibility in favor of convenience. If "cloud is someone else's computer", then "distribution is someone else's build/test/packaging". I think it's still useful to have a baseline HPC cluster distro, even if many people, especially at larger sites, resort to modules to produce other combinations of middleware.
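To make the "modules on top of a baseline distro" idea concrete, here is a minimal sketch of what a modules-style tool does when it composes a compiler/MPI combination: it leaves the distro packages alone and just prepends per-version prefixes to the search paths. The install prefixes, the version numbers, and the `load` helper are all hypothetical, made up for illustration; this is not how Environment Modules or Lmod are actually implemented.

```python
import os

# Hypothetical install prefixes; real sites lay these out differently.
STACKS = {
    ("gcc", "12.2"): "/opt/sw/gcc/12.2",
    ("openmpi", "4.1"): "/opt/sw/openmpi/4.1",
}


def load(env, name, version, stacks=STACKS):
    """Return a copy of env with the package's bin/ and lib/ dirs prepended."""
    prefix = stacks[(name, version)]
    new_env = dict(env)  # never mutate the caller's environment
    for var, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib")):
        entry = f"{prefix}/{subdir}"
        old = new_env.get(var, "")
        # Prepend so the selected version shadows the distro's default.
        new_env[var] = entry + (os.pathsep + old if old else "")
    return new_env


if __name__ == "__main__":
    env = load({"PATH": "/usr/bin"}, "gcc", "12.2")
    env = load(env, "openmpi", "4.1")
    print(env["PATH"])
    print(env["LD_LIBRARY_PATH"])
```

The baseline distro still provides /usr/bin and the system libraries; the "module" only decides what gets found first, which is why it can be used sparingly alongside RPM-installed packages.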
> packaging them as RPMs is not the correct model, unless the HPC system uses VM golden images or container images, and allows the users to start them
Many, many clusters do use RPMs, whether that means NFSroot approaches or stateful node installs (often just kickstart, though many sites use devops tools like Puppet).
One sticky aspect of the modules approach is illustrated by Nix: either you use it sparingly, or you go all the way and replace everything about the node install (all the way down to ld.so and glibc...)
regards, mark hahn (sharcnet/computecanada)