[CentOS-devel] Fwd: Using probe dependencies for ordering

Mon Mar 30 16:55:55 UTC 2009
Jeff Johnson <n3npq at mac.com>

Niels: I've been assembling the necessary implementations
to __REALLY__ solve the useradd ordering issue for the last few years.
Thanks for the impetus to finally attempt to deploy a solution.

Flaps down, headed on final approach pattern to the release runway ...

Gonna take a few months still ...

73 de Jeff

Begin forwarded message:

> From: Jeff Johnson <n3npq at mac.com>
> Date: March 30, 2009 12:47:57 PM EDT
> To: rpm-devel at rpm5.org
> Subject: Using probe dependencies for ordering
>
> This thread (I chimed in late solely because using PreReq:
> never solves any problem in reliable packaging imho)
>
>    http://lists.centos.org/pipermail/centos-devel/2009-March/004250.html
>
> leads me to attempting to ensure that RPM orders packages
> so that dynamic events (in this case adding a user) occur
> before they are needed.
>
> There's a whole class of these run-time ordering issues, from
> daemon restart to cache rebuilds, that are often not ordered
> correctly, leading to endless discussions of the best way
> to solve rather mundane issues.
>
> The problem is intrinsically run-time because an event
> (like a cache rebuild or adding a user or ...) needs
> to precede the installation/erasure of some other package.
>
> There are all sorts of solutions that have been proposed;
> some work better than others ;-) The typical
> approach is to overload dependencies like
> 	Provides: user(foo)
> and then attach some semantic to the existence of a
> value in rpm dependencies. That approach can take you
> only so far because the string represents an
> intrinsically run-time condition without actually
> testing that the run-time condition is satisfied.
>
> E.g. adding a user through a %pre script represented by
> 	Provides: user(add)
> is only as good (or bad) as the execution of the
> script itself. One never knows whether the user
> was actually added, only that the user was *supposed*
> to have been added, tying the success of an end-luser
> upgrade to the quality of the packager's scripting
> abilities.
>
> However, the fundamental problem of representing a dependency
> on a dynamic event (think: adding a user) that usually occurs in the
> middle of rather large upgrade transactions using static dependency
> assertions is rather confusing.
>
> I'm gonna call the (solution to the) problem a "Dynamically Ordered
> Event" (or DOE for short) because I gotta call it something.
>
> There are already several DOE relations that are used for
> (partially) ordering package installs and erases.
>
> The most important DOE event (now handled in @rpm5.org code) is
> 	Install before erasing a package on upgrade.
> RPM does not compute file dispositions correctly
> unless install occurs before erase. The ordering
> relation is a DOE because (obviously) the installation
> of the new instance before the old instance is removed
> is dynamic and solely within the scope of running a transaction.
>
> Other DOE events (that were added to RPM5 in February) include
> ensuring that a package that contains a directory is installed
> before any file is installed in that directory.
>
> Similarly, the end-point of a symlink is used as a DOE relation
> when ordering packages to ensure that the end-point of a symlink is
> installed before the symlink itself.
>
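
The parentdir and linkto relations described above can be sketched as edges in a dependency graph that is then sorted. This is only an illustration, not the rpm5.org implementation: the package dictionaries, the `order_packages` name, and the use of absolute symlink targets are all assumptions made for the example (Python 3.9+ for `graphlib`).

```python
import posixpath
from graphlib import TopologicalSorter


def order_packages(packages):
    """Order packages so parent directories and symlink targets come first.

    `packages` is a hypothetical mapping:
        name -> {"dirs": [...], "files": [...], "links": {path: target}}
    """
    # Record which package ships each path.
    owner = {}
    for name, pkg in packages.items():
        shipped = pkg.get("dirs", []) + pkg.get("files", []) + list(pkg.get("links", {}))
        for path in shipped:
            owner[path] = name

    sorter = TopologicalSorter()
    for name, pkg in packages.items():
        sorter.add(name)
        paths = pkg.get("files", []) + list(pkg.get("links", {}))
        needed = [posixpath.dirname(p) for p in paths]   # parentdir relations
        needed += list(pkg.get("links", {}).values())    # linkto relations
        for path in needed:
            provider = owner.get(path)
            if provider and provider != name:
                # `provider` must be installed before `name`.
                sorter.add(name, provider)
    return list(sorter.static_order())


packages = {
    "tools": {"links": {"/usr/bin/run": "/opt/app/bin/run"}},
    "app": {"files": ["/opt/app/bin/run"]},
    "layout": {"dirs": ["/opt/app/bin"]},
}
print(order_packages(packages))  # ['layout', 'app', 'tools']
```

No packager had to declare anything here: both kinds of edge are read straight from the file lists. A dependency loop would raise `graphlib.CycleError`, which a real implementation would have to break somehow.
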
> (aside)
> There's another _HUGELY_ important DOE condition in RPM that
> should be instantly obvious:
> 	Never erase a package instance if the new instance
> 	was not installed successfully.
> But that DOE condition is handled through other means (i.e.
> doubly linked package upgrade chains) than by permuting the
> order of package installation and erasure.
>
> (another aside)
> Whether DOE relations (like parentdir/linkto dependencies)
> are _NECESSARY_ or _USEFUL_ for installing *.rpm packages is quite
> controversial years after implementation. Adding DOE relations is
> demonstrably not _NECESSARY_, RPM has survived (and @rpm.org ;-)
> without using that information for more than a decade, there
> are other means to constrain ordering with existing dependency
> relations.
> I do claim that the added DOE relations are _USEFUL_ because
> of the increased determinism in transactions, with less "partial" in
> the package ordering. Additionally, the data in the added DOE
> relations is *guaranteed* to be accurate because hierarchical
> file paths and symlink end-points always exist, no fuss, no muss,
> no packaging policies or reviews are ever needed.
>
> So I believe it's time to start using DOE events (like adding
> users, restarting daemons, rebuilding caches, etc) for
> package ordering much like parentdir/linkto and install-before-erase
> are currently being used.
>
> The major problem with treating useradd like parentdir dependencies
> is how to detect the condition while installing. Unlike parentdir
> dependencies, useradd is run by a script that can/will fail.
>
> Which brings me to run-time probe dependencies. As implemented,
> run-time probe dependencies are strings that map to a dynamic run-time
> test. E.g. (as implemented @rpm5.org, ymmv) this dependency
> 	Requires: user(foo)
> tests that getpwnam("foo") succeeds (returns a non-NULL entry) rather than whether
> some package somewhere contains a matching string
> 	Provides: user(foo)
> (Note: I'm deliberately ignoring details that both the strcmp(3) and
> the getpwnam(3) probe will be done if necessary, and also that
> probes are not currently run against added packages but only
> against the "system" whatever that means).
>
> The important point is that (as implemented @rpm5.org) the user(foo)
> probe tests the condition that indeed, "foo" can be looked up rather
> than that some package happens to contain a matching Provides: of a
> string.
>
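
The difference between the two checks can be sketched in a few lines. This is not the rpm5.org code, only an illustration on a Unix system; `probe_user` and `static_match` are hypothetical names, and Python's `pwd.getpwnam` stands in for getpwnam(3).

```python
import pwd


def probe_user(name):
    """Run-time probe: True only if `name` can actually be looked up."""
    try:
        pwd.getpwnam(name)  # wraps getpwnam(3); raises KeyError if not found
    except KeyError:
        return False
    return True


def static_match(dep, provides):
    """Static check: True if some package merely declares the string."""
    return dep in provides


# A package may declare the string even though the user was never created.
declared = {"user(foo)"}
print(static_match("user(foo)", declared))  # True, regardless of system state
print(probe_user("foo"))                    # depends on the running system
```
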
> The remaining pieces of the puzzle have to do with establishing
> the points in time where DOE relations are evaluated. At the
> point where packages are ordered, the script containing the
> useradd has not yet been run (and may fail when it actually is
> run). So there's a need to attach a post-condition to the
> running of %pre (where useradd is typically done) to verify
> that indeed a user was added. There's also a need to test
> the precondition when installing a package that needs to do
> chown(2) on some file path with the uid returned by getpwnam(3)
> (i.e. exactly what the Requires: user(foo) dependency tests).
>
> The above pins down the exact points in time, using
> pre-/post-conditions, where the probe is meaningful.
>
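
The two checkpoints described above, a post-condition after %pre and a precondition before the files are installed, might look roughly like this. Every name here (`ProbeFailed`, `run_pre_with_postcondition`, `install_with_precondition`) is hypothetical, and the callables merely stand in for whatever RPM would actually run.

```python
import pwd


class ProbeFailed(Exception):
    """Raised when a run-time probe is not satisfied at a checkpoint."""


def user_exists(name):
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def run_pre_with_postcondition(run_script, user, probe=user_exists):
    """Run the %pre stand-in, then verify the user really was added."""
    run_script()
    if not probe(user):
        raise ProbeFailed(f"%pre ran but user({user}) is still unsatisfied")


def install_with_precondition(install_files, user, probe=user_exists):
    """Verify the user can be looked up before any chown(2) is attempted."""
    if not probe(user):
        raise ProbeFailed(f"user({user}) is unsatisfied; chown(2) would fail")
    install_files()
```

The `probe` parameter is there only so the sketch can be exercised without touching the real password database; by default both checkpoints use the same getpwnam(3)-style lookup that `Requires: user(foo)` is described as performing.
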
> What remains is to attempt to use the DOE relation while ordering
> packages. As currently implemented, that basically means that
> %pre scripts need to be parsed to detect that useradd is being
> attempted, with whatever user, and a DOE relation needs to be
> added before the user is actually added so that the package
> that is attempting to add the user is installed before other
> packages that need to do getpwnam(3) in order to do chown(2).
>
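
Detecting useradd in a %pre script could be approximated as below. This is a deliberately naive sketch, not how rpm5.org parses scriptlets: `users_added_by` is a hypothetical helper, it relies on the `useradd [options] LOGIN` synopsis (login name last), and it will be fooled by `#`, `;` or `||` inside quoted strings.

```python
import shlex


def users_added_by(script):
    """Return the login names that `useradd` is asked to create in `script`."""
    users = []
    for line in script.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments (naive)
        # Treat ||, && and ; as command separators.
        for command in line.replace("||", ";").replace("&&", ";").split(";"):
            try:
                words = shlex.split(command)
            except ValueError:  # unbalanced quotes: give up on this command
                continue
            if not words or words[0].rsplit("/", 1)[-1] != "useradd":
                continue
            args = []
            for word in words[1:]:
                if ">" in word or "<" in word:  # stop at the first redirection
                    break
                args.append(word)
            if args and not args[-1].startswith("-"):
                users.append(args[-1])
    return users


pre = "getent passwd foo >/dev/null || /usr/sbin/useradd -r -s /sbin/nologin foo 2>/dev/null || :"
print(["user(%s)" % u for u in users_added_by(pre)])  # ['user(foo)']
```

Each name found this way could become a `user(...)` relation attached to the package, so that it is ordered before packages that need the lookup to succeed.
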
> I think I've covered all the necessary details to attempt a
> DOE relation for adding users. Adding groups is not significantly
> different.
>
> I also claim that other probes, like /sbin/ldconfig cache rebuilds,
> can be mapped into DOE relations. Note that the code base @rpm5.org
> already has the run-time probe
> 	Requires: soname(libfoo.so)
> and there are further run-time probes that can be evaluated as
> pre-/post-conditions during package installation/erasure as well
> as being used as DOE relations for ordering.
>
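
A soname probe could be approximated by asking the dynamic linker to load the library. How rpm5.org actually evaluates `soname(...)` is not stated above, so treat this as an assumption; `probe_soname` is a hypothetical name.

```python
import ctypes


def probe_soname(soname):
    """True if the dynamic linker can currently resolve and load `soname`."""
    try:
        ctypes.CDLL(soname)  # raises OSError when the library cannot be loaded
    except OSError:
        return False
    return True


print(probe_soname("libfoo.so"))  # depends on the running system
```

Note that loading a library runs its initializers, so a production probe would likely consult the linker cache instead of loading anything. The point of the sketch is only that the answer reflects the system's current state, which is exactly what a cache rebuild changes.
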
> Opinions?
>
> 73 de Jeff