Niels: I've been assembling the necessary implementations to __REALLY__ solve the useradd ordering issue for the last few years. Thanks for the impetus to finally attempt to deploy a solution.
Flaps down, headed on final approach pattern to the release runway ...
Gonna take a few months still ...
73 de Jeff
Begin forwarded message:
From: Jeff Johnson n3npq@mac.com
Date: March 30, 2009 12:47:57 PM EDT
To: rpm-devel@rpm5.org
Subject: Using probe dependencies for ordering
This thread (I chimed in late solely because using PreReq: never solves any problem in reliable packaging imho)
http://lists.centos.org/pipermail/centos-devel/2009-March/004250.html
leads me to attempt to ensure that RPM orders packages so that dynamic events (in this case, adding a user) occur before they are needed.
There's a whole class of these run-time ordering issues, from daemon restart to cache rebuilds, that are often not ordered correctly, leading to endless discussions of the best way to solve rather mundane issues.
The problem is intrinsically run-time because an event (like a cache rebuild or adding a user or ...) needs to precede the installation/erasure of some other package.
All sorts of solutions have been proposed; some work better than others ;-) The typical approach is to overload dependencies like Provides: user(foo) and then attach some semantic to the existence of a value in rpm dependencies. That approach can take you only so far because the string represents an intrinsically run-time condition without actually testing that the run-time condition is satisfied.
E.g. adding a user through a %pre script represented by Provides: user(add) is only as good (or bad) as the execution of the script itself. One never knows whether the user was actually added, only that the user was *supposed* to have been added, tying the success of an end-luser upgrade to the quality of the packager's scripting abilities.
The fundamental problem, however, is that using static dependency assertions to represent a dependency on a dynamic event (think: adding a user), one that usually occurs in the middle of a rather large upgrade transaction, is rather confusing.
I'm gonna call the (solution to the) problem a "Dynamically Ordered Event" (or DOE for short) because I gotta call it something.
There are already several DOE relations that are used for (partially) ordering package installs and erases.
The most important DOE event (now handled in @rpm5.org code) is Install before erasing a package on upgrade. RPM does not compute file dispositions correctly unless install occurs before erase. The ordering relation is a DOE because (obviously) the installation of the new instance before the old instance is removed is dynamic and solely within the scope of running a transaction.
Other DOE events (that were added to RPM5 in February) include ensuring that a package that contains a directory is installed before any file is installed in that directory.
Similarly, the end-point of a symlink is used as a DOE relation when ordering packages to ensure that the end-point of a symlink is installed before the symlink itself.
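The parentdir and linkto relations described above can be sketched roughly as follows. This is only an illustration in Python, not the rpm5.org code: the package layout (a "paths" set plus a "links" mapping of symlink to end-point) and the function names doe_edges and order are hypothetical, and the ordering is a plain topological sort.

```python
from collections import defaultdict, deque
import posixpath


def doe_edges(packages):
    """Return (before, after) pairs derived from parent dirs and symlink end-points.

    packages is a hypothetical layout:
    {name: {"paths": set of owned paths, "links": {symlink: end-point}}}
    """
    owner = {}
    for name, pkg in packages.items():
        for path in pkg["paths"]:
            owner[path] = name
    edges = set()
    for name, pkg in packages.items():
        # parentdir: whoever owns the parent directory goes first
        for path in pkg["paths"]:
            parent = owner.get(posixpath.dirname(path))
            if parent and parent != name:
                edges.add((parent, name))
        # linkto: whoever owns the symlink end-point goes first
        for target in pkg.get("links", {}).values():
            provider = owner.get(target)
            if provider and provider != name:
                edges.add((provider, name))
    return edges


def order(packages):
    """Topologically sort package names using the derived relations."""
    indegree = {name: 0 for name in packages}
    followers = defaultdict(list)
    for before, after in doe_edges(packages):
        followers[before].append(after)
        indegree[after] += 1
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    result = []
    while ready:
        name = ready.popleft()
        result.append(name)
        for nxt in sorted(followers[name]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(result) != len(packages):
        raise ValueError("dependency loop among packages")
    return result
```

Note that the relations come entirely from the file lists, which is why no packaging policy is needed to keep them accurate.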
(aside) There's another _HUGELY_ important DOE condition in RPM that should be instantly obvious: Never erase a package instance if the new instance was not installed successfully. But that DOE condition is handled through other means (i.e. doubly linked package upgrade chains) than by permuting the order of package installation and erasure.
(another aside) Whether DOE relations (like parentdir/linkto dependencies) are _NECESSARY_ or _USEFUL_ for installing *.rpm packages is quite controversial years after implementation. Adding DOE relations is demonstrably not _NECESSARY_: RPM has survived (and @rpm.org ;-) without using that information for more than a decade, and there are other means to constrain ordering with existing dependency relations. I do claim that the added DOE relations are _USEFUL_ because of the increased determinism in transactions, with less "partial" in the package ordering. Additionally, the data in the added DOE relations is *guaranteed* to be accurate because hierarchical file paths and symlink end-points always exist; no fuss, no muss, no packaging policies or reviews are ever needed.
So I believe it's time to start using DOE events (like adding users, restarting daemons, rebuilding caches, etc) for package ordering, much like parentdir/linkto and install-before-erase are currently being used.
The major problem with treating useradd like parentdir dependencies is how to detect the condition while installing. Unlike parentdir dependencies, useradd is run by a script that can/will fail.
Which brings me to run-time probe dependencies. As implemented, run-time probe dependencies are strings that map to a dynamic run-time test. E.g. (as implemented @rpm5.org, ymmv) the dependency Requires: user(foo) tests that getpwnam("foo") succeeds (i.e. returns a non-NULL passwd entry) rather than whether some package somewhere contains a matching string Provides: user(foo).
(Note: I'm deliberately ignoring the details that both the strcmp(3) match and the getpwnam(3) probe will be done if necessary, and also that probes are currently run only against the "system", whatever that means, and not against added packages.)
The important point is that (as implemented @rpm5.org) the user(foo) probe tests the condition that indeed, "foo" can be looked up rather than that some package happens to contain a matching Provides: of a string.
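To make the difference between a string match and a probe concrete, here is a minimal sketch in Python (Unix only, since it uses the pwd module). The names probe_user, PROBES and satisfied are hypothetical and are not rpm5.org APIs; trying the string match first and the probe second is also an assumption.

```python
import pwd
import re

# Matches dependency strings of the form kind(arg), e.g. user(foo).
_DEP = re.compile(r"^(?P<kind>\w+)\((?P<arg>[^()]+)\)$")


def probe_user(name, lookup=pwd.getpwnam):
    """True if the user can actually be looked up right now."""
    try:
        lookup(name)  # pwd.getpwnam raises KeyError when there is no entry
    except KeyError:
        return False
    return True


# Hypothetical table mapping a probe namespace to its run-time test.
PROBES = {"user": probe_user}


def satisfied(dep, provides=()):
    """String match against Provides first (assumed order), then the probe."""
    if dep in provides:
        return True
    m = _DEP.match(dep)
    if m and m.group("kind") in PROBES:
        return PROBES[m.group("kind")](m.group("arg"))
    return False
```

With this, satisfied("user(foo)") depends on the state of the running system rather than on what any package header claims.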
The remaining pieces of the puzzle have to do with establishing the points in time where DOE relations are evaluated. At the point where packages are ordered, the script containing the useradd has not yet been run (and may fail when it actually is run). So there's a need to attach a post-condition to the running of %pre (where useradd is typically done) to verify that indeed a user was added. There's also a need to test the precondition when installing a package that needs to do chown(2) on some file path with the uid returned by getpwnam(3) (i.e. exactly what the Requires: user(foo) dependency tests).
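The two evaluation points can be sketched like this. It is a rough Python illustration under stated assumptions: the scriptlet is modelled as a callable returning an exit status, each probe as a callable returning True or False, and the function names run_pre and check_preconditions are made up for the example.

```python
def run_pre(scriptlet, postconditions):
    """Run a %pre scriptlet, then verify its post-conditions.

    scriptlet: callable returning an exit status (0 means success).
    postconditions: {dependency string: probe callable returning bool}.
    Returns a list of problems; an empty list means the event really happened.
    """
    status = scriptlet()
    if status != 0:
        return ["%pre exited with status {}".format(status)]
    # A zero exit status is not trusted on its own: re-run the probes.
    return ["post-condition failed: " + dep
            for dep, probe in postconditions.items() if not probe()]


def check_preconditions(preconditions):
    """Evaluate probes just before a package that needs them is installed."""
    return ["precondition failed: " + dep
            for dep, probe in preconditions.items() if not probe()]
```

The same user(foo) probe would be attached twice: as a post-condition on the package whose %pre runs useradd, and as a precondition on every package that must chown(2) files to that user.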
The above pins down the exact points in time, using pre-/post- conditions where the probe is meaningful.
What remains is to attempt to use the DOE relation while ordering packages. As currently implemented, that basically means that %pre scripts need to be parsed to detect that useradd is being attempted, with whatever user, and a DOE relation needs to be added before the user is actually added so that the package that is attempting to add the user is installed before other packages that need to do getpwnam(3) in order to do chown(2).
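A crude sketch of that scriptlet parsing, in Python. Everything here is an assumption for illustration: the heuristic (the login name is taken to be the last argument of useradd before any shell operator or redirection), the function names users_added and user_doe_edges, and the input layout. Real scriptlets with variables, loops or wrappers would defeat it.

```python
import shlex


def users_added(script):
    """Guess which users a %pre script tries to add via useradd."""
    users = set()
    for line in script.splitlines():
        try:
            words = shlex.split(line, comments=True)
        except ValueError:  # unbalanced quotes etc.: skip the line
            continue
        for i, word in enumerate(words):
            if word.rsplit("/", 1)[-1] != "useradd":
                continue
            args = []
            for w in words[i + 1:]:
                # Stop at shell operators and redirections.
                if w[:1] in ("|", "&", ";", "<", ">") or w[:2] in ("1>", "2>"):
                    break
                args.append(w)
            if args and args[-1] and not args[-1].startswith("-"):
                users.add(args[-1])
    return users


def user_doe_edges(pre_scripts, requires):
    """Return (adder, needer) pairs: install the adder before the needer.

    pre_scripts: {package: %pre text}; requires: {package: [dependency strings]}.
    """
    adders = {}
    for pkg, script in pre_scripts.items():
        for user in users_added(script):
            adders.setdefault("user({})".format(user), set()).add(pkg)
    edges = set()
    for pkg, deps in requires.items():
        for dep in deps:
            for adder in adders.get(dep, ()):
                if adder != pkg:
                    edges.add((adder, pkg))
    return edges
```

The resulting pairs would be fed to the ordering step alongside the relations RPM already derives, before any scriptlet has run.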
I think I've covered all the necessary details to attempt a DOE relation for adding users. Adding groups is not significantly different.
I also claim that other probes, like /sbin/ldconfig cache rebuilds, can be mapped into DOE relations. Note that the code base @rpm5.org already has the run-time probe Requires: soname(libfoo.so) and there are further run-time probes that can be evaluated as pre-/post-conditions during package installation/erasure as well as being used as DOE relations for ordering.
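For comparison, a soname probe can be approximated in Python by asking the run-time linker to load the library. This is only a stand-in, assuming that "loadable via dlopen" is close enough to what the probe should test; it is not how the rpm5.org soname() probe is implemented, and loading a library has side effects (its constructors run). The name probe_soname is hypothetical.

```python
import ctypes


def probe_soname(soname, loader=ctypes.CDLL):
    """True if the run-time linker can currently resolve and load soname."""
    try:
        loader(soname)  # ctypes.CDLL raises OSError when loading fails
    except OSError:
        return False
    return True
```

Used as a post-condition after /sbin/ldconfig runs, a check like this would confirm that the cache rebuild actually made the library resolvable.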
Opinions?
73 de Jeff
Jeff Johnson wrote:
Niels: I've been assembling the necessary implementations to __REALLY__ solve the useradd ordering issue for the last few years. Thanks for the impetus to finally attempt to deploy a solution.
Thanks for sending the details :) Niels