[CentOS] How to stagger fsck executions

Tue Apr 21 21:12:37 UTC 2015
Warren Young <wyml at etr-usa.com>

On Apr 21, 2015, at 9:50 AM, Hugh E Cruickshank <hugh at forsoft.com> wrote:
> 
> From: Kay Diederichs Sent: April 21, 2015 03:43
>> 
>> instead of having 20 for all of them, set
>> the first filesystem to 17, the second to 19, the third to 23, and the
>> fourth to 29.
> 
> Thanks but that is not much different than my second idea and does not
> fully avoid the problem.

You may be missing a key fact of how prime numbers work.

You can only get two or more fscks on a single reboot when the mount count is a multiple of two or more of the max-mount-count values.  When those numbers are all prime, the frequency of such occurrences is much lower than when you use purely random values.

With the four values that Kay provided, I calculate a 1.2% chance on any given reboot that two or more volumes will need to be checked.  If you reboot on average once a month, that means it only happens once every 7 years or so.  The machine may well be retired before it happens even once, and that assumes you reboot that often in the first place.

If you take the same set of values and add one to them to make them all even, and thus composite numbers, the chance rises to 3.3%, or about once every 2.5 years for monthly reboots.  Thus, Kay’s solution is actually more than twice as good as your second solution.
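The arithmetic behind those two percentages can be sketched in a few lines of Python.  This is not the calculator linked at the end of this message, just a minimal brute-force version under two assumptions: every volume is mounted exactly once per reboot, and all mount counts start from zero together.  The function name collision_chance is made up for illustration.  It needs Python 3.9 or later for math.lcm.

```python
from math import lcm  # multi-argument lcm requires Python 3.9+


def collision_chance(intervals, min_checks=2):
    """Fraction of reboots on which at least min_checks volumes are due for fsck.

    Assumption: each volume is mounted once per reboot and all mount
    counts start at zero together, so volume i is checked on reboot n
    exactly when n is a multiple of its max-mount-count.
    """
    period = lcm(*intervals)  # the whole pattern repeats after this many reboots
    hits = sum(
        1
        for n in range(1, period + 1)
        if sum(n % i == 0 for i in intervals) >= min_checks
    )
    return hits / period


for label, values in [("primes", (17, 19, 23, 29)), ("evens", (18, 20, 24, 30))]:
    p = collision_chance(values)
    # 1/p is the average number of reboots between collisions;
    # dividing by 12 converts monthly reboots to years.
    print(f"{label}: {p:.1%}, roughly one collision every {1 / p / 12:.1f} years")
```

Counting over one full period (the least common multiple of all the values) gives the exact long-run fraction, which comes out at about 1.2% for the prime set and 3.3% for the even set.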

Interestingly, the numbers don’t actually have to be prime to take advantage of this property of integers.  They just have to be *relatively* prime: http://goo.gl/bQbu5Z

For example, the set {15, 19, 22, 23} isn’t all prime, but its members are pairwise *relatively* prime, since no integer greater than 1 evenly divides both members of any pair in the set.  This set gives about a 1.5% chance of a 2+ volume collision, nearly as good as Kay’s prime set.
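If you want to test a candidate set of max-mount-count values for this property, a short Python check along these lines should do.  The helper name pairwise_coprime is hypothetical; it only uses math.gcd and itertools.combinations from the standard library.

```python
from itertools import combinations
from math import gcd


def pairwise_coprime(values):
    """True when every pair of values has no common divisor other than 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(values, 2))


print(pairwise_coprime((15, 19, 22, 23)))  # True: no pair shares a factor
print(pairwise_coprime((18, 20, 24, 30)))  # False: every pair shares 2
```

Checking every pair matters here: a set like {6, 10, 15} has no single factor common to all three numbers, yet each pair still shares one.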

I’ve put the calculator I built to test this up here:

    https://gist.github.com/wyoung/7c94967bb635de48d058