ACID-compliant filesystem (was:Re: [CentOS] Re: centos] 4.4 upgrade problems)

Sat Sep 9 21:57:45 UTC 2006
Lamar Owen <lowen at pari.edu>

On Saturday 09 September 2006 14:42, Les Mikesell wrote:
> On Sat, 2006-09-09 at 12:43, Lamar Owen wrote:
> > A real transactional filesystem would allow truly atomic system updates. 
> > But, of course, there are definite downsides to that.

> I think you are missing the big picture here.

No, I don't think I am.  I think you are.  The big picture is that yum is not 
atomic in its update (not yum's fault, either), and in my case that lack of 
atomicity produced a problem (I DID UPDATE python-sqlite IN THE CORRECT ORDER).  
Because yum manages the complete installed set, this is a systemic issue: the 
whole system needs to go atomically from one consistent state to another, and 
the portions of the system in one state need to be isolated from the portions 
in the other state.  Otherwise there will be problems.  No, they are not 
terribly widespread, but a general-case solution would work wonders for yum 
updating its own 'stuff' too.

In the general case, I'd like to issue something like:

# acidfs-begin-transaction
# yum -y update
[bunch of output]
# if [ $? -eq 0 ]; then
#    acidfs-commit
# else
#    acidfs-rollback
# fi

RDBMSs have been doing this for decades; I do this daily, using SQL.  Note 
that, for all processes except the shell process inside the transaction, no 
changes have occurred to the filesystem until the commit.  After 
acidfs-commit all the changes 'suddenly' appear (and hopefully in-core text 
is reloaded if possible); if you get to acidfs-rollback instead, everything 
reverts and no process is any the wiser.
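
For what it's worth, a small piece of this can be approximated today for a 
single directory tree, without any new filesystem.  The sketch below is only 
an illustration under stated assumptions: txn_update, the release.* 
directories and the 'current' symlink are all hypothetical names, and it 
relies on GNU mv -T and ln -n.  It copies the tree, runs the update in the 
copy, and then either swaps a symlink with one rename (the 'commit') or throws 
the copy away (the 'rollback').  It does nothing about in-core text, and it is 
not isolation in the database sense.

```shell
# Hypothetical helper: run a command against a private copy of a tree and
# publish the result with a single atomic rename, or discard it on failure.
# Assumed layout: $root/current is a symlink to $root/release.<something>.
txn_update() {
    root=$1; shift

    # "begin": find the live tree and make a private working copy of it
    old=$(readlink "$root/current") || return 1
    new=$(mktemp -d "$root/release.XXXXXX") || return 1
    cp -a "$root/$old/." "$new/" || { rm -rf "$new"; return 1; }

    # run the update command inside the copy (subshell keeps our cwd intact)
    if ( cd "$new" && "$@" ); then
        # "commit": build the new symlink aside, then rename it over the
        # old one; rename(2) replaces the link in one step (GNU mv -T)
        ln -sfn "${new##*/}" "$root/current.tmp" &&
            mv -T "$root/current.tmp" "$root/current"
    else
        # "rollback": the live tree was never touched, so just drop the copy
        rm -rf "$new"
        return 1
    fi
}
```

Usage would be something like 'txn_update /srv/app sh -c "./apply-update"', 
where both the path and the command are made up for the example.  Anything 
that resolves 'current' after the rename sees the new tree; anything that 
already had files open keeps the old ones, which stay on disk until removed.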

> Yum is managing the 
> whole system.  

It is currently managing the whole filesystem.  But what about in-memory 
program text?  The currently loaded programs need to continue to have the 
older libs available if needed, and a single 'commit' operation at the end 
needs to atomically reload program text and provide consistent library 
dependencies post-run.  The only way you can do this now is to shut down the 
system, boot rescue media, yum update to a chroot (which is the system), then 
reboot the system (aka 'booting an update CD'; anaconda does this quite 
well).  The in-core text needs to be isolated from what's going on on the 
filesystem, and if an error occurs (out of space, for instance, or a locked 
file) an atomic rollback would be very nice (anaconda does not do this, 
though).
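
Part of the 'older libs stay available' requirement already holds at the 
level of a single open file, which is worth seeing concretely.  The sketch 
below is just a demonstration with made-up file names, assuming the updater 
replaces a file by writing a new one and renaming it into place rather than 
overwriting it: a descriptor opened before the replacement keeps reading the 
old contents, while anything opened afterwards gets the new ones.

```shell
# Demonstration only; the file names are arbitrary.
work=$(mktemp -d)
printf 'old\n' > "$work/lib"

# The "running program": holds the old file open on descriptor 3
exec 3< "$work/lib"

# The "update": write the new version aside, then rename it over the path.
# The path now points at a new inode; the old inode lives on while it is open.
printf 'new\n' > "$work/lib.tmp"
mv "$work/lib.tmp" "$work/lib"

seen_by_running=$(cat <&3)       # read through the already-open descriptor
seen_by_new=$(cat "$work/lib")   # a fresh open of the same path

exec 3<&-                        # close the descriptor; old inode is freed
rm -rf "$work"
```

Here seen_by_running holds the old contents and seen_by_new the new ones.  
That mix is exactly the problem case: a long-running program has old text 
mapped, and whatever it opens later comes from the new set.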

> Given an ACID database, would you expect to be 
> able to upgrade it to a new, potentially incompatible version of
> an ACID database with transactions in progress? 

Of course I would; a filesystem/database (hmm, let's see, similar to, but 
going well beyond, what NILFS claims) would have to be guaranteed backwards 
compatible.  That's a given for a filesystem; it is unreasonable for the 
authors of a filesystem to introduce incompatible changes and still expect 
seamless updates of the filesystem code itself.

> What yum needs 
> is just some special consideration when modifying its own
> components.

Ok, let me repeat: I updated the yum components in the right order.  This is 
NOT the sqlite/python-sqlite issue.  In my case, after following the advice 
of updating python-sqlite, then sqlite, then yum and its dependencies, I got 
a system with a lot of dupes and out-of-sync libraries.  Lots of out-of-sync 
libraries; I've not finished the recovery yet, although bind at least is 
working.  If there were a 'yum rollback' (or an 'acidfs-rollback 
snapshot-prior-to-yum') then I could at least try it again.  But there isn't, 
and I know there isn't, and I know the whole deal about no support, etc. etc.
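
As an aside on the dupes: a quick way to see how many there are is to list 
every name.arch pair that appears more than once in the rpm database.  The 
helper name list_dupes is made up for this sketch; it only post-processes a 
list of lines and makes no attempt to decide which copy should be kept.

```shell
# Hypothetical helper: read one "name.arch" entry per line on stdin and
# print each entry that occurs more than once.
list_dupes() {
    sort | uniq -d
}
```

It would be fed with something like 
'rpm -qa --qf "%{NAME}.%{ARCH}\n" | list_dupes'.  Expect some legitimate 
hits: kernel packages can be installed in several versions at once, and 
gpg-pubkey entries normally appear more than once.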

But, yes, yum does need some special care with its own components.  That's 
the narrow view, though; I'm looking at the big picture of a possible 
system-wide rollback/commit facility providing four things.  First, true 
systemic atomicity.  Second, consistency of the in-core text with the on-disk 
text (the times I've tried to open something with, say, firefox, after it has 
been open for a while and has since been yum updated, and had the open fail 
in mysterious ways, are rather annoying; this is not unique to CentOS).  
Third, isolation of the changes happening on-disk from the view of the 
in-core text (my firefox example again: if you run a yum update of firefox 
while firefox is running, strange things are guaranteed to happen to your 
running firefox as its in-core text becomes inconsistent with the on-disk 
libs and such!).  Fourth, durability of the change (once committed, fully 
committed).  

Yes, I know a reboot to special update media would fix all that; that's an 
anaconda-mediated update, and from the system's point of view it is atomic 
(you're just doing the update under general anesthesia, so to speak).  But 
online updating, and updating without rebooting, are things I am not really 
willing to give up (especially when the quarterly update is as large as it 
usually is; it makes Windows XP Service Pack 2 look like you're downloading a 
small text file!).

Just throwing an idea out, that's all, for discussion.  This systemic 
non-atomicity and inconsistency is endemic to all linuxen at the moment.
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu