[CentOS] Re: centos] 4.4 upgrade problems

Wed Sep 6 02:35:53 UTC 2006
R P Herrold <herrold at owlriver.com>

On Tue, 5 Sep 2006, Lamar Owen wrote:

> I would consider http://bugs.centos.org/view.php?id=1483 to not be minor.

Hi, Lamar -- as it is my bug, a bit of context for those who 
have not read it is in order.  -- smile -- In the seven hours 
it took me to run down the cause, I assure you, I did not 
consider it minor either.  Thus my early and formal report and 
commentary for others to find.

I would note that Johnny and I, and Seth at the end, worked on 
this both in the main #centos IRC channel, and out of channel, 
to run down hypotheses for me to test.  Thanks guys.

>  I looked over Red Hat's Bugzilla and didn't, in the few 
> minutes I skimmed, see the same issue in upstream.  It could 
> be related to yum's means of doing the package update versus 
> up2date's method; on a production DNS box I had the problem 
> mentioned in this bug, but on a machine that wasn't the 
> production name server I didn't.

No surprise that a yum/sqlite issue does not show up in the 
upstream, as their updater (up2date) takes a different approach.

This bug hinges, very much, on the non-atomic nature of 'hot' 
system updates, and on the fact that the sqlite-maintained 
cache of packages that yum needs got munged halfway through. 
Both are needed to reproduce it.

It is 'luck of the draw': there are no relevant Requires in 
play in the transaction sort, so nothing decides whether the 
bind-libs and bind updates fall on the same side of the update 
failure. So long as they are NOT on differing sides, there is 
no problem; when they are, not surprisingly, bind gets 
confused.  ;0
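
A toy model of that 'luck of the draw', as a sketch only. It 
assumes the transaction is a plain ordered list, and that 
everything before the failure point was updated while everything 
after it was not. The package order and the function names are 
made up for illustration; this is not how yum really sorts or 
records a transaction.

```python
def split_by_failure(order, failed_at):
    """Return (updated, stale): packages before and after the failure index."""
    return order[:failed_at], order[failed_at:]


def bind_is_split(order, failed_at):
    """True when bind and bind-libs land on different sides of the failure."""
    updated, _stale = split_by_failure(order, failed_at)
    return ("bind" in updated) != ("bind-libs" in updated)


# Hypothetical ordering: with no Requires between the two packages,
# nothing forces them to sit next to each other in the sort.
order = ["python-sqlite", "bind-libs", "sqlite", "yum", "bind"]

for failed_at in range(len(order) + 1):
    state = "MISMATCH" if bind_is_split(order, failed_at) else "ok"
    print(failed_at, state)
```

In this made-up ordering, only a failure that lands between 
bind-libs and bind leaves the two out of step; a failure before 
both or after both does not.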

> I reproduced the issue using the proper yum sequence, 
> updating python-sqlite, then sqlite, then updating yum, then 
> doing a clean all, and had the problem.

and, in my post-analysis, it looks like there is a pretty 
strong likelihood that this approach is great for over 90% of 
the boxes out there.  Boxes with 'tight' partitioning, or with 
packages held back (exclude=) from updates, are a bit more 
likely to need two or more passes, and so expose themselves 
more often to the sequencing risk, where any failure needs 
manual intervention to recover from.
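
For reference, the staged sequence quoted above could be 
scripted roughly as below, stopping at the first failing step 
rather than pressing on. This is a sketch under assumptions: 
the closing full 'yum update' is not part of the quoted 
sequence, and `run_steps` and its `runner` argument are 
hypothetical names, not anything yum provides.

```python
import subprocess

# Steps as quoted above; the final full "yum update" is an assumption
# about what follows the "clean all".
STEPS = [
    ["yum", "update", "python-sqlite"],
    ["yum", "update", "sqlite"],
    ["yum", "update", "yum"],
    ["yum", "clean", "all"],
    ["yum", "update"],
]


def run_steps(steps, runner=subprocess.call):
    """Run each step in order; stop at the first non-zero exit status.

    Returns the list of steps that completed successfully.
    """
    done = []
    for cmd in steps:
        if runner(cmd) != 0:
            break
        done.append(cmd)
    return done
```

Calling run_steps(STEPS) as root runs the steps for real. If 
fewer steps come back than went in, the first one missing is 
where to start looking by hand.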

-- Russ Herrold