On Thu, 2005-09-08 at 07:32, Bryan J. Smith wrote:
The reiserfsck runs seemed to work OK so my only complaint about that part is the oddball syntax needed to actually make it fix anything.
Well, you've had good luck then.
Except that it's a full day of downtime for the service using the drive...
I'm just wondering why it is so likely to need the fsck at all (maybe 50% of my crashes when busy).
On all filesystems? Or just 50% chance that one filesystem will need a fsck?
All the busy ones. I don't think it is a problem with idle filesystems.
[ SIDE NOTE from a previous thread: Another reason to segment your filesystems is not only to "localize" any fsck, but segmentation actually _reduces_ the risk of needing a fsck because commits are more localized/contained (especially to /tmp, /var, etc...). ]
Backuppc conserves space dramatically by hard-linking all duplicates it finds (with a fairly fast hashing scheme to find them). Thus the whole archive has to be on one filesystem. I currently use a 250 gig drive and have a little over 100 gigs used (holding what would be around 900 gigs of raw data before compression and linking).
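To make the hard-link point concrete, here is a rough sketch of the general idea in Python. It is not backuppc's actual code (backuppc is written in Perl and has its own hashing and pool layout); the function names link_duplicates and file_digest, and the choice of SHA-256, are assumptions made only for illustration. It shows why everything must live on one filesystem: os.link() fails with an OSError if the two paths are on different filesystems.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return a SHA-256 hex digest of the file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def link_duplicates(root):
    """Replace duplicate regular files under root with hard links.

    The first file seen with a given digest is kept; later files with the
    same digest are replaced by a hard link to it. Returns the number of
    files replaced. Hypothetical helper, for illustration only.
    """
    seen = {}       # digest -> path of the first file with that content
    replaced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and anything that is not a regular file
            digest = file_digest(path)
            first = seen.get(digest)
            if first is None:
                seen[digest] = path
                continue
            if os.path.samefile(first, path):
                continue  # already the same inode, nothing to do
            # Link under a temporary name, then rename over the duplicate,
            # so the path never disappears. os.link() raises OSError when
            # the two paths are on different filesystems.
            tmp = path + ".linktmp"
            os.link(first, tmp)
            os.replace(tmp, path)
            replaced += 1
    return replaced
```

A real tool would also compare file contents (or at least sizes) before trusting a hash match, and would have to decide what to do about differing ownership and permissions, since all hard links to an inode share them.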
I thought it was supposed to know what was committed to disk and never leave it in an inconsistent state.
Nope. You thought incorrectly.
The _only_ purpose of journaling is to _reduce_ the time it takes to make the filesystem consistent. That assumes the journaling is good and/or the journal replay/unplay works.
There is absolutely _no_ way to guarantee a commit, although full data journaling with an NVRAM board comes close.
I expect to lose data in a crash - I don't expect the system to lose track of the free/used portion of the disk with a journaling filesystem. I suppose the best solution is a better UPS. I also try to run the backuppc filesystem in software RAID1 between an internal IDE drive and an external firewire drive, but the 2.6 kernel crashes consistently in a day or less running like that. It does work well enough that I can usually add the external drive to the RAID, let it sync up, then fail and remove it. It is next to impossible to copy the huge number of hardlinks any other way in a reasonable amount of time.
-- Les Mikesell lesmikesell@gmail.com