We are going to enable rseq registration in glibc by default starting with glibc-2.34-37.el9 (already built in Koji). To facilitate integration with other rseq users, we have backported the GLIBC_2.35 symbol set for rseq integration. Such an ABI update during a major release is not entirely unprecedented for glibc, but hasn't happened in recent times. If an application links against these symbols, RPM will generate the appropriate dependencies, forcing an upgrade to a compatible glibc version. This is unlikely to happen soon because the interface is glibc-specific and was only added in glibc 2.35, so applications would have to be written specifically for this interface.
As a side effect, criu needs to be updated to criu-3.17-2.el9 to remain functional.
The background is that rseq is required to implement a useful version of sched_getcpu on AArch64. Previously, a system call had to be used; with rseq, execution can remain in userspace. The benefit is smaller on other architectures, but still there (e.g., on x86-64, an obscure LSL instruction is replaced with a memory load). And once glibc registers its rseq area, nothing else in the process can register it again, which is why we need some interface for rseq-using applications. Rather than coming up with our own interface, we decided to just use the upstream interface. Not entirely by accident, the GLIBC_2.35 symbol set for ld.so consists precisely of the rseq symbols, so RPM dependency management should interoperate well with this change. (Our RPM version requires backporting such symbol sets in their entirety; we do not have per-symbol package versioning information like Debian in our package builds.)
Due to a kernel bug on the CentOS builders, sched_getcpu may return the wrong values during package builds. I requested a builder kernel update here:
Please upgrade c9s builders to an 8.4 or later kernel https://issues.redhat.com/browse/CS-1129
Downstream, we plan to maintain ABI compatibility with RHEL 9.0 by backporting the symbol set there as well, but without registering rseq by default.
I don't know yet if we are going to backport further symbol sets to CentOS Stream 9. The requirement to backport full symbol sets could be quite onerous. Historically, the main driver for such backports has been to enable running applications on CentOS that have been built on other distributions. I suspect the glibc incompatibility is simply the first problem people saw, and we don't know what problems would have come next had we eliminated the glibc obstacle. Applications for CentOS should really be built and tested on CentOS.
Thanks, Florian
On 09/06/2022 10:54, Florian Weimer wrote:
Due to a kernel bug on the CentOS builders, sched_getcpu may return the wrong values during package builds. I requested a builder kernel update here:
Please upgrade c9s builders to an 8.4 or later kernel https://issues.redhat.com/browse/CS-1129
Hi Florian,
If there is already an internal ticket for this, let's track it there, and eventually see if we also need to apply this for other architectures. It would also be worth doing that on the cbs.centos.org infra (or at least checking which kernel it is running: updates are applied daily across that server fleet, but reboots are either "on demand" or "scheduled", depending on the need).
* Fabian Arrotin:
Due to a kernel bug on the CentOS builders, sched_getcpu may return the wrong values during package builds. I requested a builder kernel update here:
Please upgrade c9s builders to an 8.4 or later kernel https://issues.redhat.com/browse/CS-1129
If there is already an internal ticket for this, let's track it there, and eventually see if we also need to apply this for other architectures.
Which part? The kernel upgrade? I filed a ticket for that in Jira. I'm not aware of any internal issue tracker (although the CS project supports private issues).
I verified that the 8.4 and later kernels contain the necessary backport from upstream Linux.
Or do you mean that I should request downstream builder upgrades as well? I see a mix of 8.2 and 8.4 kernels there.
It would also be worth doing that on the cbs.centos.org infra (or at least checking which kernel it is running: updates are applied daily across that server fleet, but reboots are either "on demand" or "scheduled", depending on the need).
Sadly, I have plenty of experience with issues like this, so the glibc build log contains uname output and key pieces of /proc. 8-/ The last few months (or even the last year) have been unusually quiet in this regard. Some of the information shows up for all builds in the hwinfo files.
Thanks, Florian