[CentOS] Disk De-Fraging in Linux
loony at loonybin.org
Fri Sep 21 05:48:07 UTC 2007
On Thursday 20 September 2007, Al Sparks wrote:
> Why? What's different between NTFS and ext2/3 that defragging is
> needed in one but not the other?
> === Al
And this is the right question to ask...
Anyway - the answer about defragging, if you really care to understand it, is this:
FAT used to be horrible. It would always simply take the first available
cluster and use that to store data, which resulted in a lot of fragmentation.
NTFS is much better already. It still profits from defragging but the results
don't make that much of a difference anymore as long as your partition
doesn't get close to being full. It tries to allocate contiguous blocks and
will even add some buffer to the end for file growth.
ext2/3 is similar to NTFS in its fragmentation resistance. It has, however, two
more advantages. First, Linux swaps to dedicated swap devices, and files that
are mmapped can still be moved; on Windows, the swap file and some other files
cannot be moved. The second advantage is reserved space. By default, each ext2/3
filesystem has 5% of its space reserved for root. ext2/3 simply assume you
will never get past 95% full, so the data is laid out accordingly. Since you
know you have at least 5% free disk blocks, you can leave a little more
unallocated space at the end of each file. It's not much, but it adds up over time.
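For what it's worth, that reserved percentage is easy to inspect (and restore) with tune2fs from e2fsprogs. A small sketch - the device name /dev/sdb1 is just a placeholder:

```shell
#!/bin/sh
# Inspect the root-reserved space on an ext2/3 filesystem.
# /dev/sdb1 is a placeholder device; substitute your own.
DEV=/dev/sdb1
if command -v tune2fs >/dev/null 2>&1 && [ -b "$DEV" ]; then
    # "Reserved block count" is the root-reserved space (5% by default)
    INFO=$(tune2fs -l "$DEV" | grep -i 'reserved block count')
else
    INFO="tune2fs or $DEV not available here; commands shown for reference"
fi
echo "$INFO"
# To restore the 5% default after someone has run `tune2fs -m 0`:
#   tune2fs -m 5 "$DEV"    (needs root)
```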
The worst possible scenario I've found for ext3 so far is cvs. With every
checkin, cvs has to modify the whole file. It does so by writing a completely
new file, then deleting the old one and moving the new file in place. This
means that each time, the filesystem has to allocate new space.
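The write-new-then-rename pattern described above can be sketched in shell (the repository path and file names are illustrative):

```shell
#!/bin/sh
# Sketch of the write-new-then-rename pattern cvs uses on each checkin.
# Every commit writes a complete new copy of the ,v file, so the
# filesystem must allocate fresh blocks every time.
set -e
repo=$(mktemp -d)
printf 'rev1\n' > "$repo/foo,v"            # the existing RCS file

# A "checkin": write the whole updated file under a temporary name...
printf 'rev1\nrev2\n' > "$repo/foo,v.tmp"
# ...then move it over the old one. The old file's blocks are freed;
# the new file's blocks live wherever free space happened to be.
mv "$repo/foo,v.tmp" "$repo/foo,v"

cat "$repo/foo,v"
```

The rename makes the update atomic, which is why cvs does it this way - but it also guarantees a full reallocation on every checkin.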
For a long time, I balanced stuff between servers, removed outdated code and
so on. Bi-monthly fsck runs would show about 1-2% fragmentation at about 75%
filesystem full. Then a few large projects were imported, filesystem usage
went up to 98% (someone did a tune2fs -m 0), and then the problems really
started. I'm just about to go home now - 2am. I spent the last few hours
reorganizing the cvs filesystem. A filesystem check showed 61% fragmentation!
I moved old code off to a secondary server, then copied everything off,
recreated the filesystem, and copied the data back.
Results were impressive. My I/O subsystem can take about 1800 I/O ops per
second. Before the reorg, that gave about 1.1MB/sec throughput measured in
iostat with a few cvs processes running at the same time.
After the reorg - again 1800 I/Os, but throughput rose to a more useful 24MB/sec.
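Those figures come from iostat; a typical invocation looks like this (iostat ships in the sysstat package, and exact column names vary between versions):

```shell
#!/bin/sh
# Sample per-device I/O stats: 3 reports, 1 second apart.
if command -v iostat >/dev/null 2>&1; then
    # r/s + w/s approximate I/O ops per second; the read/write
    # KB-per-second columns give the throughput quoted above.
    OUT=$(iostat -x 1 3)
else
    OUT="iostat not installed here (it comes with the sysstat package)"
fi
echo "$OUT"
```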
Anyway - bullet points:
* there is no good way to measure fragmentation on a filesystem level other
than the percentage fsck reports.
* try filefrag to check for fragmentation on a per-file basis.
* there is no online ext2/3 defragger that works on the block level.
* there is an offline block-level defragger for ext2, e2defrag. An ext3
filesystem would have to be converted to ext2 and back to ext3 after the defrag.
* there are some file-level defragmentation tools. They basically work by
copying files around. This works on filesystems that had high utilization for
a while, got fragmented, but are now mostly empty again. I tried some of them
on my cvs server but none ended up giving me good results.
* if fsck shows high fragmentation (>5% in my opinion), you should make sure
the filesystem doesn't get that full, and if you really want to defrag, copy
the data off and back on. It's the best way to do it.
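Putting the last two points together - a per-file check with filefrag, plus the copy-off-and-back defrag - might look like this. The rsync options, paths, and device name are illustrative, not a recipe to run blindly:

```shell
#!/bin/sh
# Per-file fragmentation check, plus the dump-and-restore "defrag".
F=$(mktemp)
dd if=/dev/zero of="$F" bs=1M count=4 2>/dev/null
if command -v filefrag >/dev/null 2>&1; then
    # filefrag reports how many extents the file is split into
    FRAG=$(filefrag "$F" 2>/dev/null || echo "filefrag: no FIEMAP support here")
else
    FRAG="filefrag not installed (part of e2fsprogs)"
fi
echo "$FRAG"
rm -f "$F"

# The copy-off-and-back defrag, shown for reference only:
#   rsync -aHx /srv/cvs/ /backup/cvs/    # copy the data off
#   mkfs.ext3 /dev/sdb1                  # recreate the filesystem
#   mount /dev/sdb1 /srv/cvs
#   rsync -aHx /backup/cvs/ /srv/cvs/    # copy the data back on
```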
And now I'm off to bed.